[Serve] memory leak in Ray Serve 2.2.0 #31688
Comments
@Mitan may I ask, how does the script behave on ray==2.1.0? Very relevant to us right now.
@mihajenko thanks for your reply - let me check and get back to you (should be ready by tomorrow).
Hi @Mitan, I am not able to reproduce the issue on my dev box; I am not seeing a memory increase. Are you able to narrow down which process is growing on your side? By the way, this is my send-request script:
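The commenter's script itself is not shown above. A self-contained sketch of this kind of load generator, sending sequential synchronous POST requests, might look like the following (the `/admin` route matches the log line quoted later in the thread; the stand-in local HTTP server is only there so the sketch runs without a real Serve deployment):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    """Stand-in for the Serve endpoint so the sketch is self-contained."""

    def do_POST(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request server logging
        pass


def send_requests(url: str, n: int) -> int:
    """Sequentially send n synchronous POST requests; return the 200 count."""
    ok = 0
    for _ in range(n):
        req = urllib.request.Request(url, data=b"{}", method="POST")
        with urllib.request.urlopen(req) as resp:
            ok += resp.status == 200
    return ok


if __name__ == "__main__":
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    print(send_requests(f"http://127.0.0.1:{port}/admin", 5))  # → 5
    server.shutdown()
```

Against a real deployment you would point `send_requests` at the Serve HTTP address (by default `http://127.0.0.1:8000/...`) and run it for hours while watching process memory.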
We were able to identify the root cause as a combination of multiple factors:
Each request produces an access-log line such as:

INFO 2023-01-27 01:19:07,461 http_proxy 10.245.21.150 http_proxy.py:315 - POST /admin 200 20237.0ms

This logging cannot be silenced by following the existing guidelines in the documentation, since I do not have access to the HttpProxyActor constructor.
So adding log rotation for Serve (and potentially allowing the default logging level to be changed for service deployments such as HttpProxyActor) should resolve the issue.
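For reference, the mechanism behind the fix is standard size-based log rotation (in Ray it is configured via the `RAY_ROTATION_MAX_BYTES` and `RAY_ROTATION_BACKUP_COUNT` environment variables, as best I recall; check the docs for your version). A minimal stdlib sketch of the behavior, with hypothetical file names and sizes, shows how unbounded per-request log lines get capped:

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Once the live file reaches maxBytes it is rolled over, and copies beyond
# backupCount are deleted, so total disk use stays bounded.
tmpdir = tempfile.mkdtemp()
logfile = os.path.join(tmpdir, "serve_access.log")

logger = logging.getLogger("demo_rotation")
logger.setLevel(logging.INFO)
logger.addHandler(RotatingFileHandler(logfile, maxBytes=1024, backupCount=2))

for i in range(200):
    logger.info("POST /admin 200 %dms", i)  # mimics Serve's access-log line

files = sorted(f for f in os.listdir(tmpdir) if f.startswith("serve_access"))
print(files)  # → ['serve_access.log', 'serve_access.log.1', 'serve_access.log.2']
```

Without `backupCount`, roughly 4 KiB of log lines would all remain on disk; with rotation, only the live file and two backups survive.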
Thanks @sihanwang41!
I assume we can close this issue now? Log rotation has been enabled.
Hi @rkooo567, yes, the issue can be closed, thank you.
What happened + What you expected to happen
What happens: the memory of a simple Ray Serve app with a single deployment keeps increasing while it receives requests. Over 18 hours of continuously sending requests, the app's memory more than doubles (see the memory consumption graph from Prometheus). As a result, my app eventually runs out of memory (OOM) and crashes. I created a simple version of a Ray Serve app to reproduce the issue.
What you expected to happen: the memory should not increase.
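To narrow down which process is actually growing (as asked earlier in the thread), one can periodically sample the resident set size of each Ray process. A stdlib-only sketch, Linux-specific since it reads `/proc` (the helper name is mine, not part of Ray):

```python
import os


def rss_kib(pid: int) -> int:
    """Read a process's resident set size in KiB from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # /proc reports the value in kB
    return 0


if __name__ == "__main__":
    # Demo on the current process; in practice you would sample the PIDs of
    # the Serve replica workers and the HTTP proxy actor (found via `ps` or
    # the Ray dashboard) at intervals and compare the trend per process.
    print(f"self RSS: {rss_kib(os.getpid())} KiB")
```

Logging these samples alongside request counts makes it clear whether the growth sits in the replica, the proxy, or elsewhere.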
Versions / Dependencies
Ray 2.2.0
Python 3.7
OS: Linux
Reproduction script
A simple version of Ray Serve app to reproduce the issue.
Code to start the cluster (start.sh)
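The actual start.sh is not captured above. A typical head-node start script of this shape (hypothetical ports and flags, not the author's script) looks like:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of start.sh: bring up a single-node Ray cluster,
# then launch the Serve app defined in app.py.
ray start --head --port=6379 --dashboard-host=0.0.0.0
python app.py
```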
Code for app.py
The polling script continuously and sequentially sends requests in a synchronous manner (there is no queueing of requests). I can provide the code if needed.
Issue Severity
High: It blocks me from completing my task.