
[BUG] Sambanova, Groq and Cerebras models not found after update to 1.9.1 #655

faev999 opened this issue Feb 14, 2025 · 11 comments

@faev999

faev999 commented Feb 14, 2025

Describe the bug
Sambanova, Groq and Cerebras models are not found after updating to 1.9.1. Running an agent (whether CodeAgent or ToolCallingAgent) with models from these providers results in an error.

Code to reproduce the error

import os

from smolagents import CodeAgent, LiteLLMModel, ToolCallingAgent

model_agent_a = LiteLLMModel(
    model_id="cerebras/llama-3.3-70b", # same for groq and sambanova
    api_key=os.getenv("CEREBRAS_API_KEY"),
)

agent_a = ToolCallingAgent(
    tools=[],
    model=model_agent_a,
)

agent_a.run(task="hello")

Error logs (if any)

Error in generating tool call with model:
This model isn't mapped yet. model=cerebras/llama-3.3-70b, custom_llm_provider=cerebras. Add it here - https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json.

Expected behavior
In version 1.8.1 this was working normally with a smol tweak mentioned here

Packages version:
1.9.1

@aymeric-roucher
Collaborator

This issue seems to be on the litellm side. Could you share your litellm version?

@faev999
Author

faev999 commented Feb 14, 2025

The version I am using for litellm is 1.60.8.
The error is generated by this line:
model_info: dict = litellm.get_model_info(self.model_id)
The problem with litellm's get_model_info() is that it depends on this file, which might not be kept up to date.
Also, I found that although some of the models I was testing (e.g. groq/llama-3.3-70b-specdec) do exist in that file, I was still getting the same error. It works if api_base is declared, e.g. api_base="https://api.groq.com/openai/v1/".
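
For concreteness, a minimal sketch of that api_base workaround (assuming LiteLLMModel forwards api_base to litellm, and that GROQ_API_KEY is the environment variable holding the key):

import os

from smolagents import LiteLLMModel, ToolCallingAgent

# Sketch of the api_base workaround: pointing the model at Groq's
# OpenAI-compatible endpoint sidesteps the model-mapping lookup failure.
model = LiteLLMModel(
    model_id="groq/llama-3.3-70b-specdec",
    api_base="https://api.groq.com/openai/v1/",
    api_key=os.getenv("GROQ_API_KEY"),  # assumed env var name
)

agent = ToolCallingAgent(tools=[], model=model)
agent.run(task="hello")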

@aymeric-roucher
Collaborator

Possible fixes:

  • Revert litellm to an earlier version
  • Use OpenAIServerModel instead

Tell me if neither works!

@faev999
Author

faev999 commented Feb 14, 2025

I just briefly checked the litellm repository, and it seems there are no major bugs in the get_model_info() function. The "problem" is that some models simply don't exist in the model declaration file I mentioned previously. One could open a request asking them to add the desired model to the file, or open a PR to add it oneself.

"Use OpenAIServerModel instead" could you clarify how to do this?

For now I modified that line in 1.9.1 so it looks more like 1.8.1:

@cached_property
def _flatten_messages_as_text(self):
    return self.model_id.startswith(("ollama", "groq", "gemini", "openrouter", "cerebras", "sambanova"))

@aymeric-roucher
Collaborator

aymeric-roucher commented Feb 14, 2025

Oh I see what you mean! Then yes, reverting the litellm version won't work, and your fix might work better. Why do you add gemini, openrouter, cerebras & sambanova to the list? When you confirm that, I'll make a final list of providers to apply message flattening for and make a patch release.

Re: OpenAIServerModel: just use this as your model class, and point the api_base to your inference provider's server, like:

model = OpenAIServerModel(
    model_id="...",
    api_base="http://...",
    api_key="...",
)
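
For the Cerebras model from the original report, a filled-in sketch might look like this (the base URL, model name and env var are illustrative assumptions, not verified values):

import os

from smolagents import OpenAIServerModel, ToolCallingAgent

# Sketch: talk to Cerebras through its OpenAI-compatible endpoint, bypassing
# litellm's model-info lookup entirely. Values are illustrative assumptions.
model = OpenAIServerModel(
    model_id="llama-3.3-70b",               # no "cerebras/" prefix here
    api_base="https://api.cerebras.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key=os.getenv("CEREBRAS_API_KEY"),
)

agent = ToolCallingAgent(tools=[], model=model)
agent.run(task="hello")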

@aymeric-roucher
Collaborator

I've just done release 1.9.2 with a fix. Please tell me if it still does not work!

@sysradium
Contributor

sysradium commented Feb 15, 2025

@aymeric-roucher

Since we switched to startswith detection of flattening (which breaks vision over ollama), maybe it makes sense to proactively check all custom LLM providers here and add all of them 🤔
https://github.com/BerriAI/litellm/blob/0ffd99afff449b9e79c5f5316bf14fe011eb98df/litellm/litellm_core_utils/get_llm_provider_logic.py#L388

I thought get_model_info is still called internally inside litellm at some point, so it is hard to avoid it.

With message flattening, one way to do it for all models is maybe to use:

flatten_messages_as_text = False
try:
    # Probe litellm's prompt factory: if a prompt template exists for this
    # provider, the model expects structured (non-flat) messages.
    litellm.litellm_core_utils.prompt_templates.factory.prompt_factory(
        self.model_id,
        [
            {"content": {"text": "sample"}},
        ],
        custom_llm_provider=self.model_id.split("/")[0],
    )
except Exception:
    flatten_messages_as_text = True

It kind of seems to work, but looks a bit awkward.

I just don't like that this way we ditched ollama vision models ...

@faev999
Author

faev999 commented Feb 15, 2025

"Why do you add gemini, openrouter, cerebras & sambanova to the list? When you confirm that, I'll make a final list of providers ..."

I have found that if I don't flatten the messages when using those providers, then tool usage is no longer reliable, especially with ToolCallingAgent as the manager agent: e.g. gemini 2 models simply refuse to use tools or don't even know they exist. However, this was just from some small tests I did. I have been using llama 3.3 70b and qwen 2.5 32b with cerebras and groq with very good results using the modifications I mentioned.

"Re: OpenAIServerModel: just use this as your model calss, and point the api_base to your inference provider's server, like..."

I will try that next, thank you.

@faev999
Author

faev999 commented Feb 15, 2025

"I thought get_model_info is still called internally inside litellm at some point, so it is hard to avoid it. [...] I am just not liking that this way we ditched ollama vision models ..."

In 1.9.1, get_model_info() uses the whole model name just to extract the provider and then check whether the provider is ollama, right? No other fields of model_info are being used?

model_info: dict = litellm.get_model_info(self.model_id)
if model_info["litellm_provider"] == "ollama":

I think I found three different ways to get the provider without depending on the model name existing in that file.

import litellm

provider = litellm.get_llm_provider(model="cerebras/llama-3.3-70b")

print(provider)

This prints: ('llama-3.3-70b', 'cerebras', '<value of CEREBRAS_API_KEY>', 'https://api.cerebras.ai/v1'). This function will raise an exception if the provider name doesn't exist, e.g. model="cerererebras/llama-3.3-70b".
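
A minimal sketch of using that behaviour as a provider check (catching a broad Exception, since the exact exception class isn't pinned down here):

import litellm

def resolve_provider(model_id):
    """Return the provider litellm resolves for model_id, or None if unknown."""
    try:
        _, provider, _, _ = litellm.get_llm_provider(model=model_id)
        return provider
    except Exception:  # raised for typos such as "cerererebras/llama-3.3-70b"
        return None

print(resolve_provider("cerebras/llama-3.3-70b"))      # 'cerebras'
print(resolve_provider("cerererebras/llama-3.3-70b"))  # None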

Another easy way to verify that the provider exists is:

from litellm.types.utils import LlmProvidersSet, LlmProviders

is_provider = "ollama" in LlmProvidersSet

Another way is to use the file itself, which is loaded into the variable litellm.model_cost, so one could split the provider from the model name, i.e. "cerebras" from "cerebras/llama-3.3-70b", and then check something like if 'cerebras' in litellm.model_cost. One can also use the function litellm.litellm_core_utils.get_model_cost_map(url), which deserializes the contents of the remote JSON file into an object that can be used for the comparison.
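
A hedged sketch of that model_cost idea (assuming litellm.model_cost is a dict keyed by model names, some of them carrying a provider/ prefix):

import litellm

model_id = "cerebras/llama-3.3-70b"
provider = model_id.split("/")[0]

# The model may be listed under its full name, or the provider may only
# appear as a prefix on other entries in the cost map.
known_model = model_id in litellm.model_cost
known_provider = any(key.startswith(provider + "/") for key in litellm.model_cost)
print(known_model, known_provider)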

So an approach to keep ollama vision models would be:

@cached_property
def _flatten_messages_as_text(self):
    import litellm

    provider = litellm.get_llm_provider(model=self.model_id)
    if provider and provider[1] == "ollama":
        # Keep structured (non-flat) messages for ollama vision models like llava
        model_info: dict = litellm.get_model_info(self.model_id)
        return model_info["key"] != "llava"
    elif provider and provider[1] in ["..."]:  # selected providers that need flattening
        return True
    return False

@sysradium
Contributor

sysradium commented Feb 15, 2025

@faev999 I was thinking that I could push the LiteLLM devs to include more information in get_model_info so that we could make more informed decisions, like what modes the model supports, etc.

Currently what we want to know is whether the model supports non-flat messages or not. I wish they could return that as well. From how I read their code, if they have some prompt transformation for the model, then it expects some kind of non-flat message; otherwise flat. Hence my solution above. It looks ugly and maybe relies too much on the internals :/
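
For the vision side specifically, a minimal sketch of leaning on litellm's capability helper (assuming litellm.supports_vision exists in the installed version and that the model is mapped):

import litellm

# Sketch: a capability helper could drive the flattening decision for vision,
# though it still depends on the model being present in litellm's mapping.
for model_id in ("ollama/llava", "cerebras/llama-3.3-70b"):
    try:
        print(model_id, litellm.supports_vision(model=model_id))
    except Exception:
        print(model_id, "not mapped")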

I am not super happy hardcoding model/provider names, since that's a never-ending game.

@nikolaidk

Maybe model validation should not be so strict, or one should have an option to skip model validation. Using LiteLLM directly, there is no problem using a model that is not listed, such as beta models, and one can add the necessary configuration in code to use unlisted models. For prototyping it would be nice if this litellm implementation worked more like litellm's original implementation, so that we could just pass a real litellm model reference.
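
A sketch of what that looks like when calling litellm directly (the model name is a hypothetical unlisted one; the endpoint and env var are assumptions):

import os

import litellm

# Sketch: litellm itself will happily call an unlisted/beta model as long as
# the provider endpoint accepts it; no model-info lookup is required up front.
response = litellm.completion(
    model="cerebras/some-beta-model",       # hypothetical, not in the mapping file
    api_base="https://api.cerebras.ai/v1",  # assumed endpoint
    api_key=os.getenv("CEREBRAS_API_KEY"),
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)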
