Prometheus metrics & mixin #375
- `_unseal_requests_total` Counter is incremented every time the `unseal()` function is called. Note that some Kubernetes events, such as deleting a `SealedSecret`, also result in this function being called and the counter incremented. That is acceptable: this Counter still gives the best indication of controller activity.
- `_unseal_errors_total{reason="<reason>"}` Counter with a label for each of the unseal failure cases: `fetch`, `status`, `unmanaged`, `unseal`, `update`. The default retry behaviour of 5 attempts in quick succession tends to result in the Counter being incremented 5 times per failure. These failures require user action to rectify, e.g. reseal, fix RBAC, etc.
- `build_info` with `revision` set to `VERSION` from git/tag.

All Counters are initialized to 0 during init.
- Basic alerts that fire on any increment of the unseal error counters. In my experience these all require user action.
- Tests for the alerts.
- Simple dashboard plotting total requests and errors. Could be extended with a `build_info` panel, etc.
prometheus-mixin/Makefile
@@ -0,0 +1,56 @@
# Prometheus Mixin Makefile
Let's put `prometheus-mixin` in a `contrib` top-level directory.
Also, please add a `README.md` file in this directory explaining what it is.
Sure, updated.
Unfortunately the tests fail because of the way multi-arch Docker images are currently built. The official Docker ecosystem is often very automation-unfriendly; there are alternative tools that will allow us to build images and manifests offline without having to push them to a registry. I'll try to play with that and regain the ability to run integration tests in unprivileged PRs.
@kskewes, I'm happy to merge this and apply further cleanups myself. Please clarify why you marked this PR as "WIP": do you plan to push more changes later, or is it just to signal that it's not "done" done?
Just to signal it's not done done. Thanks for the quick feedback.
Thanks!
Rebased Prometheus PR on master.
Closes: #177
`/metrics`. There aren't any sensitive endpoints on this port, but if metrics are considered sensitive by some then I can move them to a custom port.
- `build_info` and `go_build_info` metrics
- `PodMonitor` jsonnet for those using Prometheus Operator (us)

Additional thoughts:
`make test` is currently failing for me on master as well as the `prometheus` branch: `--- FAIL: TestHttpCert (1.37s)`
Testing done:
kind