[Feature Request] Support other LLM providers #62

Open
PaperBoardOfficial opened this issue Feb 28, 2025 · 5 comments

@PaperBoardOfficial

Hey @sfaist
I'd like to add multi-LLM support to Superglue using LiteLLM. Some users might prefer Gemini, Claude, or other models instead of OpenAI, and this would let them easily switch.

LiteLLM is perfect for this since it's a drop-in replacement for the OpenAI SDK - minimal code changes needed. Users would just change an env var to switch providers.

This would be my implementation plan:

  • Add LiteLLM as a dependency
  • Create a thin wrapper class to standardize interactions (see the sketch after this list)
  • Update existing OpenAI calls to use the wrapper
  • Add environment variables for configuration
  • Update documentation
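
A minimal sketch of the wrapper I have in mind (class, method, and env var names below are placeholders, not a final API), assuming the OpenAI Node SDK stays as the underlying client:

```typescript
// Hypothetical sketch: a thin wrapper that standardizes chat completions and
// picks the model/endpoint from environment variables (names are placeholders).
import OpenAI from "openai";

export class LLMClient {
  private client: OpenAI;
  private model: string;

  constructor() {
    this.model = process.env.LLM_MODEL ?? "gpt-4o";
    this.client = new OpenAI({
      apiKey: process.env.LLM_API_KEY,
      baseURL: process.env.LLM_BASE_URL, // undefined falls back to the default OpenAI endpoint
    });
  }

  async complete(messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[]): Promise<string> {
    const response = await this.client.chat.completions.create({
      model: this.model,
      messages,
    });
    return response.choices[0].message.content ?? "";
  }
}
```

Switching providers would then only mean changing the env vars, with no call-site changes.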

If this repository is open to external contributions, I'd like to submit a PR.

Thanks!

@sfaist
Contributor

sfaist commented Feb 28, 2025

Sounds great, thanks for raising this. This has been on our roadmap for a while and we would appreciate any external contributions on this. LiteLLM seems like a good choice. Our current llm setup is a bit of a mess anyway.
3 considerations:

  • imo the best approach would be to create an llm class that handles these concerns
  • we need to validate that the model supports structured outputs (json_schema); this is mandatory and reduces the number of available models. If not, we need to find another way to validate whether the output fits the schema.
  • we need to validate whether the model supports temperature (o3 does not; that's fine, but we should not send the parameter in that case). See the sketch after this list.
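
A rough sketch of what those checks could look like inside such a class (the model lists below are illustrative placeholders, not real support data):

```typescript
// Hypothetical sketch of per-model capability handling.
// These sets are placeholders; real support data would have to be maintained
// or pulled from the provider/library.
const SUPPORTS_JSON_SCHEMA = new Set(["gpt-4o", "gpt-4o-mini"]);
const NO_TEMPERATURE = new Set(["o1", "o3-mini"]);

interface CompletionOptions {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  temperature?: number;
  jsonSchema?: object;
}

function buildRequest(opts: CompletionOptions): Record<string, unknown> {
  const request: Record<string, unknown> = {
    model: opts.model,
    messages: opts.messages,
  };

  // Only send temperature if the model accepts it.
  if (opts.temperature !== undefined && !NO_TEMPERATURE.has(opts.model)) {
    request.temperature = opts.temperature;
  }

  if (opts.jsonSchema) {
    if (SUPPORTS_JSON_SCHEMA.has(opts.model)) {
      // Use native structured outputs where available.
      request.response_format = {
        type: "json_schema",
        json_schema: { name: "output", schema: opts.jsonSchema, strict: true },
      };
    } else {
      // Fallback: request plain JSON and validate the parsed output against
      // the schema afterwards (e.g. with a JSON Schema validator).
      request.response_format = { type: "json_object" };
    }
  }
  return request;
}
```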

If you can take this on, please consider the above.

@PaperBoardOfficial
Author

Awesome, thanks for the go-ahead! I'll build that LLM class with LiteLLM and make sure it handles the JSON schema and temperature differences between models.

@PaperBoardOfficial
Author

@sfaist I was thinking of using LiteLLM's JavaScript library (LiteLLM itself is a Python project), but the community-driven JS library is poorly maintained. So there are two options:

  1. Use a proxy server, aka LLM gateway. For this, we have LiteLLM, Portkey, etc.
  2. Use token-js/token.js, which is not a proxy server but a TypeScript SDK. However, I doubt it will be maintained in the long run.

I am leaning towards the LiteLLM proxy server. We can route requests from superglue to it, and it will route them to the respective LLM. Superglue already has a docker compose setup, so deployment can be handled easily with Docker.
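
Concretely, the routing would just mean pointing the existing OpenAI client at the proxy, since the proxy exposes an OpenAI-compatible endpoint. A rough sketch (the port, env var names, and model string are assumptions):

```typescript
// Hypothetical sketch: point the existing OpenAI client at the LiteLLM proxy.
// The proxy then forwards the request to whichever provider the model maps to.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: process.env.LLM_GATEWAY_URL ?? "http://localhost:4000", // assumed proxy address
  apiKey: process.env.LITELLM_MASTER_KEY,                          // assumed proxy key
});

const response = await client.chat.completions.create({
  model: "gemini/gemini-1.5-flash", // routed by the proxy, not by superglue
  messages: [{ role: "user", content: "ping" }],
});

console.log(response.choices[0].message.content);
```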

Should I go ahead with it?

@stefanfaistenauer
Contributor

Since we are already a proxy, adding another proxy seems a bit much, particularly since we want to keep the setup lightweight. I have double-checked with some very experienced folks and they recommend the Vercel AI SDK. What do you think about that?

@PaperBoardOfficial
Author

I get the concern about adding another proxy layer, but I think LiteLLM offers some real advantages that outweigh it:
The biggest benefit is that it gives us one consistent way to call any LLM provider using the familiar OpenAI syntax. With the Vercel AI SDK, we'd need different code patterns for each provider; I checked their docs, and we would have to maintain a switch case for each model. With LiteLLM, it is as easy as putting the API_KEY in the .env file.
Also, by setting drop_params=true in the litellm-config.yaml file, LiteLLM takes care of params that aren't supported by the model, like temperature.
A great example: I just added GEMINI_API_KEY to the .env file and could immediately use Google's free-tier Gemini model with the exact same code. This would be super helpful for new contributors who might not have paid API keys but can use free models like Gemini without changing any code.
Regarding json_schema, I checked their code (link): there are 146 models that support json_schema, i.e. they have "supports_response_schema" set to true.
Regarding the context window length issue, we can raise another issue for that. I have also asked the litellm repo owners (link) to add truncation to the proxy server, since truncation is currently only available in the Python SDK.
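
For illustration, the per-provider switch I'm referring to could look roughly like this with the Vercel AI SDK (provider packages and model IDs here are assumptions, shown only to make the pattern concrete):

```typescript
// Hypothetical illustration of maintaining a per-provider switch with the Vercel AI SDK.
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { anthropic } from "@ai-sdk/anthropic";
import { google } from "@ai-sdk/google";

// Each new provider means another case here, plus another dependency.
function pickModel(provider: string, modelId: string) {
  switch (provider) {
    case "openai":
      return openai(modelId);
    case "anthropic":
      return anthropic(modelId);
    case "google":
      return google(modelId);
    default:
      throw new Error(`Unsupported provider: ${provider}`);
  }
}

const { text } = await generateText({
  model: pickModel(process.env.LLM_PROVIDER ?? "openai", process.env.LLM_MODEL ?? "gpt-4o"),
  prompt: "ping",
});
```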

What do you think about sticking with the LiteLLM implementation I've built? I believe it'll make our code cleaner and easier to maintain as we add more LLM providers down the road.
