
[Feature]: Exit on failures #7633

Closed
pseudotensor opened this issue Aug 18, 2024 · 3 comments
Labels
feature request New feature or request

Comments

@pseudotensor

🚀 The feature, motivation and pitch

vLLM commonly crashes or fails at the server/engine level so that no more requests can be served. In some cases the entire server process stops, and (say, in Docker) one can auto-restart to recover from such rare failures. But many times the failure leaves the server in a mixed state, and one cannot easily tell whether the server is healthy.

Sometimes even the health API reports the server as healthy when it is not.

Alternatives

Per-user crafted solutions that restart the server based on detection of API behavior; a sketch of one such watchdog follows.
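
For illustration, a minimal sketch of such an external watchdog, assuming the server runs in a Docker container named `vllm` and exposes the OpenAI-compatible `/health` and `/v1/models` routes on port 8000 (the container name, port, thresholds, and restart command are all assumptions, not part of vLLM):

```python
# Hypothetical external watchdog: restart the vLLM container when API probes
# keep failing. The container name, port, thresholds, and restart command are
# illustrative assumptions; adjust them to the actual deployment.
import subprocess
import time

import requests

BASE_URL = "http://localhost:8000"   # assumed vLLM server address
CONTAINER = "vllm"                   # assumed Docker container name
MAX_FAILURES = 3                     # consecutive failed probes before restarting


def probe() -> bool:
    """Return True only if the health and model-list endpoints both respond OK."""
    try:
        health = requests.get(f"{BASE_URL}/health", timeout=5)
        models = requests.get(f"{BASE_URL}/v1/models", timeout=5)
        return health.status_code == 200 and models.status_code == 200
    except requests.RequestException:
        return False


def main() -> None:
    failures = 0
    while True:
        if probe():
            failures = 0
        else:
            failures += 1
            if failures >= MAX_FAILURES:
                # A Docker restart policy alone does not fire when the process
                # stays up with a dead engine, so force the restart here.
                subprocess.run(["docker", "restart", CONTAINER], check=False)
                failures = 0
        time.sleep(30)


if __name__ == "__main__":
    main()
```

Since the whole point of this request is that the server cannot be trusted to report its own state, the probe also hits a real API route (`/v1/models`) rather than relying on `/health` alone.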

Additional context

Example of being left in an ambiguous state where /v1 cannot be reached but the server has not shut down: #7632

@pseudotensor pseudotensor added the feature request New feature or request label Aug 18, 2024
@ywang96
Member

ywang96 commented Aug 18, 2024

I think #6594 should have solved this issue. Have you observed it happening on the latest main?

@pseudotensor
Author

Oh cool, I didn't see that PR. I'll try.

@joe-schwartz-certara

I'm still observing the async engine dead error without the server going down, so I don't think #6594 fixed this all the way. I'm using vLLM 0.6.0 with a variety of models/GPU configurations. I still see the async engine dead error from time to time, and I have to manually inspect the logs and restart to restore the prod environments. A sketch of automating that loop follows.
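
As a stopgap until the engine reliably exits on its own, that manual loop can be automated: watch the container logs for the engine-dead error and restart on sight. A rough sketch, assuming a Docker container named `vllm` and that the failure surfaces as `AsyncEngineDeadError` in the logs (both are assumptions based on this thread):

```python
# Hypothetical stopgap: tail the container logs and restart the container when
# the engine-dead error appears. The error string, container name, and restart
# command are assumptions; adjust them to the actual deployment.
import subprocess
import time

CONTAINER = "vllm"                     # assumed Docker container name
ERROR_MARKER = "AsyncEngineDeadError"  # error reported in this thread


def watch_and_restart() -> None:
    while True:
        # Follow the container's log stream and scan each new line.
        proc = subprocess.Popen(
            ["docker", "logs", "-f", "--tail", "0", CONTAINER],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        for line in proc.stdout:
            if ERROR_MARKER in line:
                proc.terminate()
                # The process is still running but the engine is dead, so a
                # restart policy never triggers; restart the container directly.
                subprocess.run(["docker", "restart", CONTAINER], check=False)
                break
        time.sleep(5)  # brief pause before re-attaching to the log stream


if __name__ == "__main__":
    watch_and_restart()
```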
