K8S: better error handling for evicted pods #711
Merged
From testing on our EKS cluster, it turns out that container_statuses can sometimes be None for a failed pod. This appears to happen when the pod has been assigned to a node but is evicted before it has had a chance to start its containers. In that case the current version fails with a non-descriptive error ("'NoneType' object is not subscriptable" and a stack trace). This PR adds a handler for this case and prints a nicer error message, derived from V1PodStatus.reason.
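For readers who want the shape of the check, here is a minimal sketch of this kind of handler. It is not the code from this PR: the function name `describe_pod_failure` and the wording of the messages are hypothetical, and the evicted status is stood in by a `SimpleNamespace` with illustrative values so the snippet runs without a cluster. Only the field names (`container_statuses`, `reason`, `message`, `state.terminated`) follow the Kubernetes Python client models.

```python
from types import SimpleNamespace


def describe_pod_failure(status):
    """Return a readable failure description for a V1PodStatus-like object.

    Hypothetical helper for illustration; not the handler added in this PR.
    """
    if status.container_statuses is None:
        # Pod failed before any container started (e.g. it was evicted),
        # so fall back to the pod-level reason and message.
        reason = status.reason or "Unknown"
        message = status.message or "no message provided"
        return f"Pod failed before containers started: {reason} ({message})"

    # Otherwise look at the first container's terminated state, if present.
    terminated = status.container_statuses[0].state.terminated
    if terminated is None:
        return "Pod failed, but the first container has no terminated state"
    return f"Container exited with code {terminated.exit_code}: {terminated.reason}"


# Stand-in for an evicted pod's status; the values are illustrative only.
evicted = SimpleNamespace(
    phase="Failed",
    reason="Evicted",
    message="The node was low on resource: memory.",
    container_statuses=None,
)
print(describe_pod_failure(evicted))
```

Checking for `None` before indexing is what avoids the subscript error; the pod-level `reason` is the only useful signal left when no container ever started.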
This condition is somewhat tricky to reproduce without running the full test suite, so for reference, here is what V1PodStatus looks like in those cases, as returned by the K8S API: