kubernetes_state: plumb more container waiting reasons #1763
We'd like to create monitors that fire when containers are stuck waiting for various reasons. Two reasons in particular, ImagePullBackOff and CrashLoopBackOff, can be used to detect bad or broken deployments. These have been plumbed as of kube-state-metrics 1.3 but are not currently whitelisted in the Datadog agent integration. The tests have also been updated with fixture data. Signed-off-by: Stephen Day <stephen.day@getcruise.com>
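For readers unfamiliar with the whitelist, here is a minimal sketch of the idea, not the integration's actual code. The names `WHITELISTED_WAITING_REASONS` and `filter_waiting_reasons`, and the `(labels, value)` sample shape, are hypothetical and chosen only for illustration. It assumes each waiting-reason sample carries a `reason` label and that only whitelisted reasons are forwarded.

```python
# Hypothetical sketch: names and data shapes are illustrative assumptions,
# not the real kubernetes_state check implementation.
WHITELISTED_WAITING_REASONS = {"imagepullbackoff", "crashloopbackoff"}


def filter_waiting_reasons(samples):
    """Return only the samples whose `reason` label is whitelisted.

    `samples` is a list of (labels, value) tuples, where `labels` is a dict.
    Matching is case-insensitive, so "CrashLoopBackOff" and
    "crashloopbackoff" are treated the same.
    """
    kept = []
    for labels, value in samples:
        reason = labels.get("reason", "").lower()
        if reason in WHITELISTED_WAITING_REASONS:
            kept.append((labels, value))
    return kept


samples = [
    ({"container": "web", "reason": "CrashLoopBackOff"}, 1),
    ({"container": "db", "reason": "ContainerCreating"}, 1),
    ({"container": "job", "reason": "ImagePullBackOff"}, 1),
]
kept = filter_waiting_reasons(samples)
```

In this sketch the `ContainerCreating` sample is dropped, which illustrates why a reason that is not whitelisted never reaches a monitor.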
that's great, thanks @stevvooe !
LGTM
thanks @stevvooe ! This will go out with |
Why's there a whitelist in the first place? I see some discussion in #853, but don't understand the reason for skipping metrics with new reasons instead of simply passing the reason through.
@deiwin I think the whitelist is there to reduce the volume of metrics that may be ignored or unused. I think you could easily add a reason with a PR like this one. I only focused on the failure scenarios, as those are the most problematic. What would be the use case of monitoring
With the CNI linked to above, pods can get stuck in the |