🚀 The feature, motivation and pitch
vLLM commonly crashes or fails at the server or engine level, after which no more requests can be served. In some cases the entire server process stops, and (say, in Docker) an automatic restart can recover from such rare failures. Many times, though, the failure leaves the server in a mixed state, and one cannot easily tell whether the server is healthy.
Sometimes even the health API reports healthy when the server is not.
Alternatives
Per-user, hand-crafted solutions that restart the server based on detected API behavior.
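For illustration, one such hand-crafted check might look like the sketch below. It is not part of vLLM. It assumes the OpenAI-compatible server is listening on `http://localhost:8000` and that `/health` and `/v1/completions` are exposed; the helper names (`http_ok`, `classify`, `check_server`) and the state labels are hypothetical. The idea is to compare the health endpoint with a real one-token request, so the "health says OK but requests fail" case shows up as its own state.

```python
import http.client
import json
import urllib.request
from typing import Callable, Optional

# Assumption: address of the vLLM OpenAI-compatible server; adjust as needed.
BASE_URL = "http://localhost:8000"


def http_ok(url: str, payload: Optional[dict] = None, timeout: float = 10.0) -> bool:
    """Return True only if the request completes with HTTP 200 within the timeout."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers connection refused, timeouts, HTTP error statuses, broken responses.
        return False


def classify(health_ok: bool, inference_ok: bool) -> str:
    """Map the two probe results to a coarse state (labels are hypothetical)."""
    if health_ok and inference_ok:
        return "healthy"
    if not health_ok and not inference_ok:
        return "down"
    # The probes disagree, e.g. /health passes but real requests fail.
    return "degraded"


def check_server(
    model: str,
    fetch: Callable[..., bool] = http_ok,
    base_url: str = BASE_URL,
) -> str:
    """Probe the health endpoint and a minimal completion, then classify."""
    health = fetch(f"{base_url}/health")
    inference = fetch(
        f"{base_url}/v1/completions",
        {"model": model, "prompt": "ping", "max_tokens": 1},
    )
    return classify(health, inference)
```

A supervisor (a cron job, a sidecar, or a Docker health check script) could call `check_server("<served model name>")` periodically and restart the container once the result has been something other than `"healthy"` for several consecutive checks.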
Additional context
Example of the server being left in an ambiguous state, where /v1 cannot be reached but the server has not shut down: #7632
I'm still observing the async engine dead error without the server going down. I don't think #6594 fixed it completely. I'm using vLLM 0.6.0 with a variety of models and GPU configurations. I still see the async engine dead error from time to time, and I have to manually inspect the logs and restart to restore the prod environments.