can we implement rate limiting? #39
Ask Gemini to write a rate limit handler for you :)
lol ... it would be rate limited :]
man, honestly, it's like I'll answer any of your questions with the same answer: just generate it.
This is pretty relevant, and the responses are quite annoying. Please re-open it @batrlatom, unless you've found a way.
@luandro
I'm quite familiar with those LLM coding tools, but where do I prompt for this change? What works and what doesn't? They won't substitute for knowing working examples. By sharing what works we can improve together, and other people who face this issue in the future will have a quality reference for how to deal with it.
Thanks @luandro. I switched to the qwen-2.5 coder model since it gives me slightly better results than Gemini in my case.
btw, I'm not experiencing the problem anymore for now. But if we really want to add limits to avoid hammering Gemini too much, the solution could be as simple as:
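Something like the following, a minimal sketch (not code from this thread; the class and parameter names are my own): a limiter that enforces a minimum interval between consecutive API calls, which you'd call before each completion request.

```python
import time


class RateLimiter:
    """Enforce a minimum interval between consecutive calls."""

    def __init__(self, min_interval_s: float = 1.0):
        self.min_interval_s = min_interval_s
        self._last_call = 0.0  # monotonic timestamp of the previous call

    def wait(self) -> None:
        # Sleep just long enough that this call happens at least
        # min_interval_s after the previous one.
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.min_interval_s:
            time.sleep(self.min_interval_s - elapsed)
        self._last_call = time.monotonic()
```

Usage would be `limiter.wait()` immediately before each `litellm.completion(...)` call; a global instance shared across the client is enough for a single-process agent.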
@batrlatom and others, if there's interest: don't hesitate to open a PR with a rate limiter added to the base |
Hi, I have a problem using the Gemini model via litellm: I'm getting rate limited very frequently. What about adding some waiting time between calls so that this doesn't happen?
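Beyond a fixed wait, retrying with exponential backoff is the usual remedy for intermittent rate-limit errors. A hedged sketch (the function and parameters below are illustrative, not part of litellm; pass your provider's actual rate-limit exception type, e.g. litellm's `RateLimitError`, as `rate_limit_exc`):

```python
import random
import time


def call_with_backoff(fn, max_retries=5, base_delay_s=1.0,
                      rate_limit_exc=Exception):
    """Retry fn() with exponential backoff plus jitter when a
    rate-limit exception is raised; re-raise after max_retries."""
    for attempt in range(max_retries):
        try:
            return fn()
        except rate_limit_exc:
            if attempt == max_retries - 1:
                raise
            # Double the delay each attempt, with a little jitter to
            # avoid synchronized retries from concurrent clients.
            delay = base_delay_s * (2 ** attempt)
            time.sleep(delay + random.uniform(0, base_delay_s))
```

You'd wrap each completion call, e.g. `call_with_backoff(lambda: litellm.completion(...), rate_limit_exc=litellm.RateLimitError)`. litellm also ships built-in retry support, so check its docs before rolling your own.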