Pod gets stuck forever in pending state if it was deleted before initialisation #2711
Comments
Looks like my issue can be fixed by enabling
I just had a Pod that was deleted due to node failure. Usually, Argo fails the step with
The pod Normal behavior:
Both are from 2.7.2, so it seems to be timing- or race-related.
Checklist:
What happened:
In some cases Kubernetes deletes a pod before or right after it is initialized, and Argo misses that event, leaving the node stuck forever in the "Pending" state. In our case this happens often under load. According to our logs, kube-scheduler deletes some pods shortly after they have been instantiated.
What you expected to happen:
Node changes phase to "Error" or "Failed".
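To make the expected behavior concrete, here is a minimal sketch of the kind of reconciliation I have in mind. It is not Argo's actual controller code: the `Node` class, the phase constants, and `reconcile_missing_pods` are hypothetical names made up for illustration, and the set of live pod names is assumed to come from a fresh pod listing. A real implementation would probably also need a grace period, since a pod that was just created may not show up in the listing yet.

```python
from dataclasses import dataclass
from typing import Dict, Set

# Hypothetical phase names, mirroring the ones mentioned in this issue.
PENDING = "Pending"
ERROR = "Error"


@dataclass
class Node:
    """Hypothetical stand-in for a workflow node backed by one pod."""
    pod_name: str
    phase: str
    message: str = ""


def reconcile_missing_pods(nodes: Dict[str, Node], live_pods: Set[str]) -> int:
    """Mark every Pending node whose pod no longer exists as Error.

    `live_pods` is assumed to be the set of pod names that currently exist.
    Returns the number of nodes whose phase was changed.
    """
    changed = 0
    for node in nodes.values():
        # Only Pending nodes are at risk of waiting forever for a pod event.
        if node.phase == PENDING and node.pod_name not in live_pods:
            node.phase = ERROR
            node.message = "pod deleted"
            changed += 1
    return changed
```

Running a check like this periodically would catch pods whose deletion event was missed, instead of relying only on the event stream.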
How to reproduce it (as minimally and precisely as possible):
hello-world-xxxxx
Anything else we need to know?:
Before PR #2385, this case was detected and the pod's state was changed to "Error".
Environment:
Other debugging information (if applicable):
Logs
Message from the maintainers:
If you are impacted by this bug please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.