Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feat] litellm.acompletion() make Langfuse success handler non blocking #1519

Merged
merged 9 commits into from
Jan 19, 2024

Conversation

ishaan-jaff
Copy link
Contributor

@ishaan-jaff ishaan-jaff commented Jan 19, 2024

Problem

  • The Langfuse Success Callback was blocking running litellm.acompletion() calls

Why was this caused ?

  • Langfuse uses langfuse.flush() in the callback, this is a blocking I/O Operation https://langfuse.com/docs/sdk/python#shutdown-behavior
  • we were running langfuse logging using asyncio.create_task, we NEED to use threads for langfuse logging, since langfuse.flush() is blocking

The async callbacks, (langfuse in this case) was not truly async, which was blocking execution

Solution

Move back to using Threads for running langfuse logging, instead of using asyncio.create_task

How to Prevent this in the future

Added a test to make 5 litellm.acompletion() calls to langfuse logger, and 5 to non langfuse. Asserting the delta is less than 1 second

Notes:

asyncio.create_task(coro) is used for scheduling coroutines for execution on the event loop. Langfuse logging was not truly async, due to which execution would get blocked

This PR also Adds support for tagging cache_hits on Langfuse

(note you need langfuse>=2.6.3

Screenshot 2024-01-19 at 11 36 47 AM

Copy link

vercel bot commented Jan 19, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
litellm ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 19, 2024 7:40pm

Copy link

railway-app bot commented Jan 19, 2024

This PR is being deployed to Railway 🚅

litellm: ◻️ REMOVED

litellm/utils.py Outdated
threading.Thread(
target=logging_obj.success_handler, args=(result, start_time, end_time)
).start()
threading.Thread(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ishaan-jaff how does switching create task for threads solve the issue?

i'm also concerned about creating too many threads here, which would cause issues in high-traffic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krrishdholakia I update the conversation of this PR with notes

@ishaan-jaff ishaan-jaff changed the title [Feat] Litellm make success handler non blocking [Feat] Litellm make Langfuse success handler non blocking Jan 19, 2024
@ishaan-jaff ishaan-jaff changed the title [Feat] Litellm make Langfuse success handler non blocking [Feat] litellm.acompletion() make Langfuse success handler non blocking Jan 19, 2024
@ishaan-jaff ishaan-jaff merged commit 6500360 into main Jan 19, 2024
3 of 7 checks passed
@ishaan-jaff ishaan-jaff deleted the litellm_proxy_make_success_handler_non_blocking branch January 19, 2024 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants