[Serve] Decrement `ray_serve_deployment_queued_queries` when client disconnects #37965
Signed-off-by: Shreyas Krishnaswamy <shrekris@anyscale.com>
@shrekris-anyscale is this needed in addition to #37939?
Yes, that PR is tackling a different problem: the queue metric is not getting set when there's no traffic. This change fixes the issue where queued requests that get disconnected aren't counted properly. As a sanity check, the unit test in this PR fails locally with just the changes in #37939.
Actually, I don't think my PR will fix the issue. We decrease the metric and set it whenever the request is finished. We can proceed to check in this PR and close mine.
@edoakes @sihanwang41 this PR is ready for review.
python/ray/serve/_private/router.py (outdated)
        return result
    except asyncio.CancelledError:
Would this be an issue if a different exception is raised?
Yes, it would. I changed the code to a `try`-`finally` block, so the metric is decremented no matter what exception is raised.
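To illustrate the point raised in this thread, here is a minimal sketch (not the code from `router.py`) comparing an `except asyncio.CancelledError` handler with a `try`-`finally` block. `QueuedGauge`, `assign_except_only`, `assign_try_finally`, and `leaked_count` are hypothetical names made up for this example; the gauge is a plain counter, not Ray's metrics API.

```python
import asyncio


class QueuedGauge:
    """Hypothetical stand-in for the queued-queries metric (not Ray's API)."""

    def __init__(self) -> None:
        self.value = 0


async def assign_except_only(gauge: QueuedGauge, fail_with: BaseException) -> None:
    """Decrement only when the request is cancelled."""
    gauge.value += 1
    try:
        # Stands in for replica assignment that fails or is interrupted.
        raise fail_with
    except asyncio.CancelledError:
        gauge.value -= 1
        raise


async def assign_try_finally(gauge: QueuedGauge, fail_with: BaseException) -> None:
    """Decrement on every exit path, whatever exception is raised."""
    gauge.value += 1
    try:
        raise fail_with
    finally:
        gauge.value -= 1


def leaked_count(assign, fail_with: BaseException) -> int:
    """Run one failing request and return what is left in the gauge."""
    gauge = QueuedGauge()
    try:
        asyncio.run(assign(gauge, fail_with))
    except (Exception, asyncio.CancelledError):
        pass
    return gauge.value
```

Under these assumptions, `leaked_count(assign_except_only, ValueError("boom"))` leaves the gauge at 1, while the `try`-`finally` variant returns it to 0 for both `ValueError` and `asyncio.CancelledError`.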
            "application": request_meta.app_name,
        },
    )
    query = Query(
Add `try` here, so that we don't need to have `incremented_queue_metric`.
Good idea, that makes the code simpler. I made the change.
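A rough sketch of why starting the `try` right after the increment removes the need for a flag. This is an assumption about the shape of the two variants, not the actual diff: `QueuedGauge`, `build_query`, `with_flag`, `without_flag`, and `final_value` are hypothetical names, and only `incremented_queue_metric` comes from the review comment.

```python
class QueuedGauge:
    """Hypothetical stand-in for the queued-queries gauge (not Ray's API)."""

    def __init__(self) -> None:
        self.value = 0


def build_query(ok: bool) -> str:
    """Hypothetical work that happens after the metric is incremented."""
    if not ok:
        raise RuntimeError("query setup failed")
    return "query"


def with_flag(gauge: QueuedGauge, ok: bool) -> str:
    """The try starts before the increment, so a flag guards the decrement."""
    incremented_queue_metric = False
    try:
        gauge.value += 1
        incremented_queue_metric = True
        return build_query(ok)
    finally:
        if incremented_queue_metric:
            gauge.value -= 1


def without_flag(gauge: QueuedGauge, ok: bool) -> str:
    """The try starts right after the increment, so no flag is needed."""
    gauge.value += 1
    try:
        return build_query(ok)
    finally:
        gauge.value -= 1


def final_value(handler, ok: bool) -> int:
    """Run one request through a handler and return the resulting gauge value."""
    gauge = QueuedGauge()
    try:
        handler(gauge, ok)
    except RuntimeError:
        pass
    return gauge.value
```

In this sketch both variants leave the gauge at 0 whether or not `build_query` raises; the second simply has less state to track.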
…isconnects (ray-project#37965) The `ray_serve_deployment_queued_queries` metric tracks the number of queries that have yet to be assigned a replica. If a client disconnects before its query has been assigned a replica– but after the metric has counted their query– the query terminates, but the metric doesn't decrease. This change decrements `ray_serve_deployment_queued_queries` when a queued request is disconnected. Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
…isconnects (#37965) (#38020) The `ray_serve_deployment_queued_queries` metric tracks the number of queries that have yet to be assigned a replica. If a client disconnects before its query has been assigned a replica– but after the metric has counted their query– the query terminates, but the metric doesn't decrease. This change decrements `ray_serve_deployment_queued_queries` when a queued request is disconnected. Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com> Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
Why are these changes needed?
The `ray_serve_deployment_queued_queries` metric tracks the number of queries that have yet to be assigned a replica. If a client disconnects before its query has been assigned a replica, but after the metric has counted their query, the query terminates, but the metric doesn't decrease.
This change decrements `ray_serve_deployment_queued_queries` when a queued request is disconnected.
Related issue number
Closes #37943
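The scenario described above can be sketched with plain `asyncio`, assuming a client disconnect surfaces as cancellation of the task that is waiting for a replica. `QueuedGauge`, `assign_replica`, and `simulate_disconnect` are hypothetical names for illustration only and are not part of Ray Serve.

```python
import asyncio
from typing import Tuple


class QueuedGauge:
    """Hypothetical stand-in for ray_serve_deployment_queued_queries."""

    def __init__(self) -> None:
        self.value = 0


async def assign_replica(gauge: QueuedGauge, replica_free: asyncio.Event) -> str:
    """Count the query as queued until a replica frees up or the wait ends."""
    gauge.value += 1
    try:
        await replica_free.wait()
        return "assigned"
    finally:
        # Runs on success, on error, and on cancellation.
        gauge.value -= 1


async def simulate_disconnect() -> Tuple[int, int]:
    """Return the gauge value while queued and after the client disconnects."""
    gauge = QueuedGauge()
    replica_free = asyncio.Event()  # never set: no replica becomes available
    task = asyncio.create_task(assign_replica(gauge, replica_free))
    await asyncio.sleep(0)  # let the task start and block on the event
    queued_while_waiting = gauge.value
    task.cancel()  # models the client disconnecting
    try:
        await task
    except asyncio.CancelledError:
        pass
    return queued_while_waiting, gauge.value
```

Running `asyncio.run(simulate_disconnect())` in this sketch gives `(1, 0)`: the query is counted while it waits, and the count drops back once the waiting task is cancelled.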
Checks
- ... (`git commit -s`) in this PR.
- ... `scripts/format.sh` to lint the changes in this PR.
- ... method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- ... `test_telemetry.py`.