[BUG] Sambanova, Groq and Cerebras models not found after update to 1.9.1 #655
Comments
This issue seems to be on the litellm side. Could you share your litellm version? |
The version I am using for litellm is 1.60.8. |
Possible fixes:
|
I just briefly checked the litellm repository, and it seems there are no major open bugs regarding this. About "Use OpenAIServerModel instead": could you clarify how to do this? For now I modified that line in 1.9.1 so it looks more like 1.8.1.
|
Oh I see what you mean! Then yes, reverting the litellm version won't work, and your fix might work better. Why do you add gemini, openrouter, cerebras & sambanova to the list? When you confirm that, I'll make a final list of providers to use that list for and make a patch release. Re: OpenAIServerModel: just use this as your model class, and point the api_base to your inference provider's server, like

model = OpenAIServerModel(
    model_id="...",
    api_base="http://...",
    api_key="...",
)
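For example, a concrete version of that snippet (a sketch only: the endpoint URL, model id, and environment variable below are assumptions for Groq's OpenAI-compatible API, not something confirmed in this thread):

import os
from smolagents import OpenAIServerModel

model = OpenAIServerModel(
    model_id="llama-3.3-70b-versatile",         # illustrative Groq model id
    api_base="https://api.groq.com/openai/v1",  # assumed OpenAI-compatible endpoint
    api_key=os.environ["GROQ_API_KEY"],         # assumed env var holding your key
)

The same pattern should work for Cerebras or SambaNova by swapping in their OpenAI-compatible base URLs. |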
I've just done a release. |
Since we switched to [...], I thought [...]. With message flattening, one way to handle it for all models is maybe to use:

flatten_messages_as_text = False
try:
    # If litellm knows a prompt transformation for this model/provider,
    # the model expects structured (non-flat) messages, so don't flatten.
    litellm.litellm_core_utils.prompt_templates.factory.prompt_factory(
        self.model_id,
        [
            {"content": {"text": "sample"}},
        ],
        custom_llm_provider=self.model_id.split("/")[0],
    )
except Exception:
    # No known prompt template: fall back to flattening messages as plain text.
    flatten_messages_as_text = True

It kind of seems to work, but looks a bit awkward. I just don't like that this way we ditched Ollama vision models ...
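For reference, here is the same probe pulled out into a self-contained helper (a sketch under the assumption that litellm still exposes prompt_factory at this internal path; it is not a public API and may move between versions):

# Assumed internal litellm import path; may differ between litellm versions.
from litellm.litellm_core_utils.prompt_templates.factory import prompt_factory

def should_flatten_messages(model_id: str) -> bool:
    """Return True when litellm has no chat prompt template for this model,
    i.e. when messages should be flattened to plain text."""
    try:
        prompt_factory(
            model_id,
            [{"content": {"text": "sample"}}],
            custom_llm_provider=model_id.split("/")[0],
        )
        return False
    except Exception:
        # No known prompt transformation: assume the model expects flat text.
        return True

Something like should_flatten_messages("groq/llama-3.3-70b-versatile") could then be reused when constructing the model wrapper. |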
I have found that if I don't flatten the messages when using those providers, then tool usage is no longer reliable, especially with ToolCallingAgent as the manager agent: e.g. Gemini 2 models simply refuse to use tools or don't even know they exist. However, this was just during some small tests I did. I have been using llama 3.3 70b and qwen 2.5 32b with Cerebras and Groq with very good results, using the modifications that I mentioned. "Re: OpenAIServerModel: just use this as your model class, and point the api_base to your inference provider's server, like..." I will try that next, thank you. |
In 1.9.1, get_model_info() is using the whole model name just to extract the provider and then confirm whether the provider is Ollama, right? No other fields of the model_info are being used?
I think I found 3 different ways to get the provider without depending on the existence of the model name.
This prints: [...] Another easy way to verify that the provider exists in that file is: [...]
Another way is to use the file itself, which is instantiated as a variable [...]. So an approach to keep Ollama vision models would be: [...]
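A minimal sketch of the idea (my own reconstruction, since the original code snippets were lost in formatting; it assumes litellm's get_llm_provider helper and provider_list attribute, which exist in recent litellm versions):

import litellm
from litellm import get_llm_provider

model_id = "cerebras/llama3.3-70b"  # illustrative model id

# Resolve the provider from the model string alone, without requiring the
# full model name to be present in litellm's model-info file.
_, provider, _, _ = get_llm_provider(model=model_id)
print(provider)  # e.g. "cerebras"

# Check whether the provider itself is one litellm knows about.
print(provider in litellm.provider_list)

With something like this, the Ollama check (and the vision-model handling) could key off the provider rather than the full model name.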
|
@faev999 I was thinking that I could push the LiteLLM devs to include more information in the model info. Currently what we want to know is whether the model supports non-flat messages or not; I wish they could return that as well. From how I read their code, if they have some prompt transformation for the model, then it expects some kind of non-flat message, otherwise flat. Hence my solution above. It looks ugly and maybe relies too much on the internals :/ I am not super happy hardcoding model/provider names, since that's a never-ending game. |
Maybe model validation should not be so strict, or one should have an option to skip model validation. Using LiteLLM there is no problem using a model that is not listed (beta models, for example), and one can add the necessary configuration in code to use models that are not listed. For prototyping it would be nice if this litellm implementation worked more like litellm's original implementation, where we could just pass a real litellm model reference, as sketched below.
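To illustrate that point (a sketch; the model id and metadata values are placeholders, and litellm.register_model is used here as I understand its documented behavior):

import litellm

# litellm will call a model even if it has no entry in its model-info list;
# the provider API key is assumed to be set in the environment.
response = litellm.completion(
    model="cerebras/some-beta-model",  # placeholder id for an unlisted/beta model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Optionally register metadata for the unlisted model so cost/limit lookups work.
litellm.register_model({
    "cerebras/some-beta-model": {
        "max_tokens": 8192,
        "litellm_provider": "cerebras",
        "mode": "chat",
    }
})

An optional flag to skip (or relax) the smolagents-side validation would make this prototyping flow possible.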
Describe the bug
Sambanova, Groq and Cerebras models are not found after updating to 1.9.1. Running an agent (whether CodeAgent or ToolCallingAgent) using models from those providers results in an error.
Code to reproduce the error
Error logs (if any)
Expected behavior
In version 1.8.1 this was working normally with a smol tweak mentioned here.
Packages version:
1.9.1