[Feat] litellm.acompletion() make Langfuse success handler non blocking #1519
litellm/utils.py
Outdated
threading.Thread(
    target=logging_obj.success_handler, args=(result, start_time, end_time)
).start()
threading.Thread(
@ishaan-jaff how does switching from create_task to threads solve the issue?
I'm also concerned about creating too many threads here, which could cause issues under high traffic.
@krrishdholakia I updated the conversation of this PR with notes:
Problem
The Langfuse success handler was blocking litellm.acompletion() calls.
Why was this caused?
langfuse.flush() is called in the callback, and this is a blocking I/O operation: https://langfuse.com/docs/sdk/python#shutdown-behavior
asyncio.create_task does not help here; we NEED to use threads for Langfuse logging, since langfuse.flush() is blocking.
The async callbacks (Langfuse in this case) were not truly async, which was blocking execution.
Solution
Move back to using threads for running Langfuse logging, instead of using asyncio.create_task.
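To illustrate why the switch matters, here is a minimal sketch, not the litellm code itself. It assumes a hypothetical handler whose blocking `time.sleep()` stands in for `langfuse.flush()`, and measures how long a 10 ms `asyncio.sleep()` really takes when that handler is scheduled with `asyncio.create_task` versus started in a `threading.Thread`. All function names below are made up for the example.

```python
import asyncio
import threading
import time

BLOCK_SECONDS = 0.3  # assumed duration of the blocking flush stand-in


def blocking_success_handler(done: threading.Event) -> None:
    # Hypothetical stand-in for a success handler that ends in a blocking flush.
    time.sleep(BLOCK_SECONDS)
    done.set()


async def handler_as_coroutine() -> None:
    # Async in name only: time.sleep() never yields control to the event loop.
    time.sleep(BLOCK_SECONDS)


async def event_loop_delay(schedule) -> float:
    """Schedule the handler, then time a 10 ms asyncio.sleep on the same loop."""
    handle = schedule()  # keep a reference so a scheduled task is not dropped
    start = time.perf_counter()
    await asyncio.sleep(0.01)
    elapsed = time.perf_counter() - start
    del handle
    return elapsed


async def main() -> tuple[float, float]:
    done = threading.Event()
    # Case 1: the blocking handler runs on the event loop via create_task.
    with_task = await event_loop_delay(
        lambda: asyncio.create_task(handler_as_coroutine())
    )
    # Case 2: the blocking handler runs in its own thread.
    with_thread = await event_loop_delay(
        lambda: threading.Thread(
            target=blocking_success_handler, args=(done,)
        ).start()
    )
    done.wait()  # let the background thread finish before returning
    return with_task, with_thread


if __name__ == "__main__":
    task_delay, thread_delay = asyncio.run(main())
    print(f"create_task: {task_delay:.3f}s, thread: {thread_delay:.3f}s")
```

In the first case the short sleep cannot return until the blocking handler has finished, because both share the event loop; in the second case the loop stays responsive while the thread sleeps.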
How to Prevent this in the future
Added a test that makes 5 litellm.acompletion() calls with the Langfuse logger and 5 without it, asserting that the delta is less than 1 second.
Notes:
asyncio.create_task(coro) is used for scheduling coroutines for execution on the event loop. Langfuse logging was not truly async, so execution would get blocked.
This PR also adds support for tagging cache_hits on Langfuse (note: you need langfuse>=2.6.3).
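The timing test described under "How to Prevent this in the future" could be shaped roughly like the sketch below. This is an assumption about its structure, not the test added in the PR: `fake_acompletion` and `slow_logger` are hypothetical stand-ins, so no network call or Langfuse client is involved.

```python
import asyncio
import threading
import time


def slow_logger(result: str) -> None:
    # Hypothetical stand-in for a logging callback that blocks while flushing.
    time.sleep(0.2)


async def fake_acompletion(prompt: str, callback=None) -> str:
    # Hypothetical stand-in for litellm.acompletion(); only simulates latency.
    await asyncio.sleep(0.05)
    result = f"echo: {prompt}"
    if callback is not None:
        # Run the blocking callback in a thread so the event loop is not held up.
        threading.Thread(target=callback, args=(result,)).start()
    return result


async def timed_batch(n: int, callback=None) -> float:
    """Run n concurrent calls and return the wall-clock time they took."""
    start = time.perf_counter()
    await asyncio.gather(
        *(fake_acompletion(f"msg {i}", callback) for i in range(n))
    )
    return time.perf_counter() - start


def measure_delta(n: int = 5) -> float:
    """Return the time difference between batches with and without logging."""
    without_logging = asyncio.run(timed_batch(n))
    with_logging = asyncio.run(timed_batch(n, slow_logger))
    return abs(with_logging - without_logging)


if __name__ == "__main__":
    delta = measure_delta()
    print(f"delta: {delta:.3f}s")
    assert delta < 1.0, "logging callback is blocking the completion calls"
```

If the callback were instead awaited or run directly on the event loop, each of the 5 calls would add its blocking time to the batch and the delta would grow with the number of calls.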