
must handle rate limits from open ai #9

Open
hookla opened this issue Oct 21, 2023 · 2 comments

hookla (Owner) commented Oct 21, 2023

openai.error.RateLimitError: Rate limit reached for gpt-4 in organization org-lDdTak03uNZ02kmY5m6ginja on tokens per min. Limit: 10000 / min. Please try again in 6ms. Contact us through our help center at help.openai.com if you continue to have issues.

so wait and retry when we see this...

@hookla hookla changed the title must handle rate limites from open ai must handle rate limits from open ai Nov 5, 2023
lynxrv21 (Collaborator) commented Nov 7, 2023

Added retry logic with a decorator in gpt_client.
We still need to decrease the delay to 0.1 seconds, though; that should be enough for the 10k tokens/min limit.

Alternatively, I could add a 0.1-second delay before every request, but that would be less elegant.
Or set up a timer and a request counter that resets every second, but that feels like overkill.
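For reference, the decorator approach could be sketched roughly like this. This is a hedged sketch, not the actual gpt_client code: `retry_on_rate_limit` and its parameters are illustrative names, and the exception tuple is left configurable because the concrete `RateLimitError` class depends on the installed openai SDK version.

```python
import functools
import time

def retry_on_rate_limit(max_retries=5, initial_delay=0.1, backoff=2.0,
                        exceptions=(Exception,)):
    """Retry the wrapped call when a rate-limit error is raised,
    sleeping with exponential backoff between attempts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Re-raise on the final attempt instead of swallowing
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(delay)
                    delay *= backoff
        return wrapper
    return decorator

# Usage (hypothetical): decorate the request function in gpt_client, e.g.
# @retry_on_rate_limit(exceptions=(openai.error.RateLimitError,))
```

With `initial_delay=0.1` this matches the 0.1-second figure above, while the backoff keeps repeated rate-limit hits from hammering the API.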

ishaan-jaff commented
@lynxrv21 @hookla I'm the maintainer of LiteLLM - I believe we can help with this problem - I'd love your feedback if LiteLLM is missing something

Here's the quick start:
docs: https://docs.litellm.ai/docs/routing

import asyncio
import os

from litellm import Router

model_list = [{ # list of model deployments
    "model_name": "gpt-3.5-turbo", # model alias
    "litellm_params": { # params for litellm completion/embedding call
        "model": "azure/chatgpt-v-2", # actual model name
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {
        "model": "azure/chatgpt-functioncalling",
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {
        "model": "gpt-3.5-turbo",
        "api_key": os.getenv("OPENAI_API_KEY"),
    }
}]

router = Router(model_list=model_list)

# openai.ChatCompletion.create replacement; acompletion is a coroutine,
# so it has to run inside an event loop
async def main():
    response = await router.acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hey, how's it going?"}])
    print(response)

asyncio.run(main())
