
[Feature]: Clarification Needed for Using trim_messages with Node.js/TypeScript #8470

Open
suysoftware opened this issue Feb 11, 2025 · 8 comments

Comments

@suysoftware

The Feature

We are currently integrating LiteLLM into our Node.js backend using TypeScript, and we call the API through direct HTTP (curl-style) requests. However, the documentation for the trim_messages (trimming input messages) feature only demonstrates usage with the Python SDK, which leaves us unsure how to use this feature when making direct HTTP requests from a Node.js environment.

Could you please provide guidance or documentation on how to use trim_messages (or its equivalent) in our setup?
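For reference, our current call to the proxy is just an OpenAI-compatible HTTP request. A minimal sketch is below (the proxy URL, key, and model are placeholders for our setup, not anything from the LiteLLM docs); note there is no obvious field in the payload to request trimming:

```typescript
// Minimal sketch of how we call the LiteLLM proxy from Node.js today.
// The proxy URL and key are placeholders; the proxy exposes an
// OpenAI-compatible /chat/completions endpoint.
const LITELLM_PROXY_URL = "http://localhost:4000"; // placeholder
const LITELLM_API_KEY = process.env.LITELLM_API_KEY ?? "sk-placeholder";

async function chat(messages: { role: string; content: string }[]) {
  const res = await fetch(`${LITELLM_PROXY_URL}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${LITELLM_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages,
      // There is currently no field here that asks the proxy to trim messages.
    }),
  });
  return res.json();
}
```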

Motivation, pitch

Any examples or instructions for handling token trimming in a Node.js/TypeScript context would be greatly appreciated.

Are you a ML Ops Team?

No

Twitter / LinkedIn details

https://x.com/sezerufukyavuz

@suysoftware added the enhancement (New feature or request) label on Feb 11, 2025
@ishaan-jaff
Contributor

cc @krrishdholakia, I believe you got a request around this recently. Might be relevant.

@krrishdholakia
Contributor

Hey @suysoftware, this isn't currently available on the proxy. How would you expect this to work?

@suysoftware
Author

Thanks for the quick response. For our use case, it would be really helpful if we could enable token trimming via a parameter in the request body. For example, if there were an option like "trim_messages": true (or similar) in the request payload, it would streamline our integration without having to implement the trimming on our end.
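For illustration, the request body we have in mind would look roughly like this. The trim_messages flag is the hypothetical part of the proposal, not an existing proxy parameter:

```typescript
// Hypothetical request body: "trim_messages" is the proposed flag,
// not an existing LiteLLM proxy parameter.
const conversationHistory: { role: string; content: string }[] = []; // full, possibly over-long history

const body = {
  model: "gpt-3.5-turbo",
  messages: conversationHistory,
  trim_messages: true, // proposed: have the proxy trim to fit the model's context window
};
```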

Thanks again for considering this functionality!

@krrishdholakia
Contributor

sounds good

@yigitkonur
Contributor

that would be super useful!

@PaperBoardOfficial

Note: I am working on a TypeScript project, so our only hope is the LiteLLM proxy, since the unofficial LiteLLM JS library is not well maintained. So my views below are only about the proxy.

There are some common problems when integrating different LLM models:

  • Some models don't support temperature or other params. You have handled that really nicely with drop_params.
  • Some models don't support json_schema structured output. Is there a way to get a fallback?
  • Sometimes the messages exceed the model's context window. We need a truncation strategy or a compression strategy (based on summarization).

It would be awesome if these could be provided in the proxy config.yaml file.
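Something like the sketch below is what I have in mind for config.yaml. drop_params is an existing setting, but the other keys are purely hypothetical illustrations of the ask, not current LiteLLM options:

```yaml
litellm_settings:
  drop_params: true                  # existing setting: drop params a model doesn't support

  # Everything below is hypothetical -- illustrating the request, not current options.
  structured_output_fallback: true   # hypothetical: fall back when json_schema isn't supported
  context_window_strategy: "trim"    # hypothetical: "trim" or "summarize" when messages overflow
```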

@krrishdholakia
Contributor

Some models don't support json_schema structured output. Is there a way to get a fallback?

Hey @PaperBoardOfficial, what would you expect to happen here? (Trying to see if we already cover this.) E.g. https://docs.litellm.ai/docs/routing#pre-call-checks-context-window-eu-regions

@krrishdholakia self-assigned this on Mar 3, 2025
@PaperBoardOfficial

No, I was not referring to fallback routing based on a context-window length check. What I meant is: say I am using a 4k model and I pass a message larger than 4k. I should either be able to break the message into chunks and send them to the LLM, or summarize the chunks and then send the summary. [This is how CrewAI solves the issue: they keep a hashmap of context window lengths for different models (link), and if the message length exceeds the context window they call the LLM to summarize the text in chunks (link).]
Maybe trimming, as mentioned by the original poster of this issue, could also work. This is a common problem, and I don't want to reinvent the wheel and maintain the context window length of every model myself. If a ready-made way to handle context window limits lands in LiteLLM, that would be awesome.
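To make the "wheel" concrete, here is a minimal client-side trimming sketch of the kind we have to maintain ourselves today. The context-window map and the ~4 chars/token estimate are rough placeholder assumptions, not LiteLLM values or APIs:

```typescript
// Rough client-side trimming sketch -- the kind of logic we'd rather not maintain ourselves.
// The context-window map and ~4 chars/token estimate are crude assumptions for illustration.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

const CONTEXT_WINDOWS: Record<string, number> = {
  "gpt-3.5-turbo": 4096, // illustrative values only
  "gpt-4o": 128000,
};

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function trimToFit(
  messages: ChatMessage[],
  model: string,
  reserveForOutput = 512
): ChatMessage[] {
  const budget = (CONTEXT_WINDOWS[model] ?? 4096) - reserveForOutput;

  // Always keep system messages; fill the remaining budget with the newest turns.
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");

  let used = system.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  const kept: ChatMessage[] = [];

  // Walk backwards so the most recent messages are kept first.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > budget) break;
    kept.unshift(rest[i]);
    used += cost;
  }
  return [...system, ...kept];
}
```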
