[Bug]: Metrics time_to_first_token_seconds, time_per_output_token_seconds not working correctly #6337
Comments
@thies1006 Consider increasing the prompt length. You may see a difference in the results.
@AllenDou The same symptom has occurred since v0.5.x.
Could you share your model (a public model is preferable), test data, and method (chat or completion)?
Duplicate of #6507.
Just checked: the metrics are now fine for me in v0.5.4. (Unrelated: I need to use --disable-frontend-multiprocessing in this version.)
Your current environment
vllm==0.5.1
🐛 Describe the bug
The entries of the histograms time_to_first_token_seconds and time_per_output_token_seconds
are identical for all buckets, so the recorded time values are apparently always 0 seconds.
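For anyone trying to reproduce this, the symptom can be checked mechanically. Prometheus histogram buckets are cumulative, so if every `le` bucket reports the same count, all observations fell at or below the smallest bucket bound. The sketch below is a minimal illustration, not vLLM code: it parses Prometheus text-format output with the standard library only. The helper names, the `vllm:` metric prefix, the `model_name` label, and the sample values are assumptions made for the example and are not taken from the original report.

```python
import re

# Matches lines such as: name_bucket{le="0.005",other="x"} 12.0
_BUCKET_LINE = re.compile(
    r'^(?P<name>[^{\s]+)_bucket\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
_LE_LABEL = re.compile(r'le="([^"]+)"')


def bucket_counts(metrics_text, metric_name):
    """Return [(le, cumulative_count), ...] for one histogram (hypothetical helper)."""
    counts = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _BUCKET_LINE.match(line)
        if match is None or match.group("name") != metric_name:
            continue
        le = _LE_LABEL.search(match.group("labels"))
        if le is not None:
            counts.append((le.group(1), float(match.group("value"))))
    return counts


def all_buckets_identical(counts):
    """True when there are several buckets and all hold the same cumulative count."""
    return len(counts) > 1 and len({count for _, count in counts}) == 1


# Made-up sample data that mimics the reported symptom (not real server output).
SAMPLE = """\
# TYPE vllm:time_to_first_token_seconds histogram
vllm:time_to_first_token_seconds_bucket{le="0.001",model_name="demo"} 12.0
vllm:time_to_first_token_seconds_bucket{le="0.005",model_name="demo"} 12.0
vllm:time_to_first_token_seconds_bucket{le="+Inf",model_name="demo"} 12.0
vllm:time_to_first_token_seconds_count{model_name="demo"} 12.0
vllm:time_to_first_token_seconds_sum{model_name="demo"} 0.0
"""

if __name__ == "__main__":
    counts = bucket_counts(SAMPLE, "vllm:time_to_first_token_seconds")
    print(counts)
    print("all buckets identical:", all_buckets_identical(counts))
```

To use it against a running server, replace `SAMPLE` with the text returned by the server's metrics endpoint. A healthy histogram would normally show counts that grow across the `le` bounds rather than staying flat from the first bucket.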