[BUG] Exit at startup failure on multiple workers #1115
Comments
Oh, nice to know. I bumped into this issue while writing some tests to improve coverage; I wasn't sure whether it was expected behaviour or a bug, so I added a note to check it later when I had more time. This change will probably require a minor bump instead of a patch, since someone could be relying on this hanging behaviour in their daemons. Even though the app isn't really up, carelessly configured daemons could max out processing with infinite loops of app reboots. In my tests, I noticed it hangs even sooner in the startup process, for example, if you misconfigure the app. It fails to load the app and hangs instead of exiting:
The same for the reload case:
More info on the case: in my tests, when the wrong app name is passed, the workers die at the following lines of the configuration loading: https://github.com/encode/uvicorn/blob/master/uvicorn/config.py#L449-L451 The problem with crashes that happen inside the configuration loading is that the communication with the parent process isn't set up at that moment, so the parent never gets notified. Trying to send some notification after the lifespan is set up doesn't seem to have much effect either: https://github.com/encode/uvicorn/blob/master/uvicorn/server.py#L77-L79 Do you know the proper way to communicate a shutdown event to the parent process? I couldn't find a good IPC example in the codebase to try out in this situation.
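For illustration, one simple way a crashing worker could tell its parent to shut down is to signal it directly. This is a minimal sketch, not uvicorn's actual mechanism, and the helper name is hypothetical:

```python
import os
import signal
import sys


def notify_parent_and_exit():
    # Hypothetical helper: send SIGTERM to the supervising (parent) process
    # so it can stop the remaining workers, then exit this worker with a
    # non-zero status.  This assumes the parent handles SIGTERM by shutting
    # everything down rather than respawning the worker.
    os.kill(os.getppid(), signal.SIGTERM)
    sys.exit(1)
```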
I would check how gunicorn and hypercorn are doing it.
For more info on the case, I checked both Hypercorn and Gunicorn. Hypercorn seems to have a similar issue to the one described here. With Gunicorn it works perfectly, but there's a difference in the way the workers are handled. One thing that could be brought from there is to make sure all lifespan setup happens while the worker is being spawned, instead of having parts of its loading happen during the call to
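As a rough illustration of that Gunicorn-style approach (a simplified, hypothetical supervisor, not Gunicorn's or uvicorn's actual code): the worker exits with a dedicated code when it fails to boot, and the parent shuts everything down instead of hanging or respawning forever.

```python
import multiprocessing
import sys
import time

WORKER_BOOT_ERROR = 3  # hypothetical exit code reserved for boot failures


def load_app():
    # Stand-in for importing and configuring the ASGI app;
    # it fails on purpose to simulate a startup error.
    raise RuntimeError("could not import the application")


def worker_main():
    try:
        load_app()
    except Exception:
        sys.exit(WORKER_BOOT_ERROR)
    while True:          # would normally run the event loop here
        time.sleep(1)


def supervise(num_workers):
    workers = [multiprocessing.Process(target=worker_main) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
        if w.exitcode == WORKER_BOOT_ERROR:
            # A worker failed to boot: terminate the others and exit,
            # instead of leaving the parent hanging forever.
            for other in workers:
                if other.is_alive():
                    other.terminate()
            sys.exit(1)


if __name__ == "__main__":
    supervise(2)
```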
The reload part is fixed, but not the multiple workers one.
Is there any update on this? I have the same problem: when the app crashes, Kubernetes does not recognize it and therefore does not restart it. Maybe there is a workaround?
The fix for the reload case is on master; we need to release it.
I need it for multiple workers. I don't use reload in production.
You shouldn't be using uvicorn's --workers option. In any case, a workaround can be found on the first commit of #1177.
Thanks for that hint.
It's fine to fail and not exit on reload. The problem is on the --workers side.
Just want to point out that this is still an issue on
The conclusion here is that it's fine to fail and not exit on reload. It's a feature.
Well... I don't understand: it doesn't recover and is stuck in that state permanently. To be clear, I'm referring to the reload scenario described in the opening post, not the workers scenario. What part of this is a feature? Does it help me debug in some way?
Yes.
Just to give more details... There are two ways to see this problem:
1. When the application fails to start, the whole process should exit.
2. When the application fails to start, the reloader should stay alive, so fixing the file triggers a new reload and the server recovers.
The thing is that we have a better user experience with 2. In other words, there's no issue with --reload.

Well... I'm closing this because I don't think people are using --workers.
Checklist
- The bug is reproducible against the latest release and/or master.

Describe the bug
When the startup fails, either in reload mode or with multiple workers, the main process is not terminated.
To reproduce
Application:
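The application code is not reproduced in this thread; a minimal sketch of the kind of app that triggers the report might look like this (the module name test matches the command below, everything else is illustrative):

```python
# test.py -- a minimal sketch of an ASGI app whose startup always fails
# (the application from the original report is not shown in this thread;
#  this only illustrates the kind of app that reproduces it).

async def app(scope, receive, send):
    if scope["type"] == "lifespan":
        message = await receive()
        if message["type"] == "lifespan.startup":
            # Report a failed startup; uvicorn logs an
            # "Application startup failed" error for this.
            await send({
                "type": "lifespan.startup.failed",
                "message": "startup failed on purpose",
            })
        return
    # Plain HTTP requests are never reached in this reproduction.
```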
uvicorn test:app --workers 2 # or uvicorn test:app --reload
Expected behavior
All the processes (parent and children) should be terminated.
Actual behavior
After this log, it hangs forever. In the reload case, it doesn't prevent the failure from happening multiple times if you modify a watched file. In the case of multiple workers, it just hangs forever.
Environment