-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
metrics: Add metrics label filter configuration #1444
metrics: Add metrics label filter configuration #1444
Conversation
✅ Deploy Preview for tetragon ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
ad14d78
to
96a3a35
Compare
This looks reasonable to me. @lambdanis any opinions? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tackling this @nap32! The overall approach looks good to me. I left a few comments - mostly considering maintainability, I think we could make this feature a bit easier to use (when adding metrics) and extend.
Without analysing all possibilities in detail it's unclear to me what happens when a user passes a label that's not in KnownMetricLabelFilters
or passes labels in different order. Ideally, we would have some validation for such cases to prevent unexpected behaviour. But if it's a big effort, then having at least test cases covering them would be good, so that the behaviour is somehow documented.
Regarding tests, it looks like the global config change of MetricsLabelFilter
leaks into other tests breaking them?
Awesome! I had responses to two of the comments you left for feedback, happy to revise.
FilterMetricLabels expects that the last strings passed to it are the ordered labels/values in KnownMetricFilterLabels. I don't think it can be checked for ordering as it is implemented. If you'd prefer to make error handling more explicit,
I've corrected this and the other issues you've raised, thanks for pointing those out! |
96a3a35
to
6c6126c
Compare
129a569
to
3a37893
Compare
Thank you for the updates and responses @nap32. I think we should reuse Regarding edge cases - one thing I was considering is a user passing in Helm values unexpected labels, for example |
Got it! I've introduced the change.
I think there is a typo in the example above, but the expectation is correct.
|
3a37893
to
fae0286
Compare
Yes, thanks! |
Hi @lambdanis - I ran the suggested command from the lint helm chart job ( Do you have any suggestions on how I might troubleshoot the e2e tests? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nap32 I usually run Tetragon in a local kind cluster (see docs). Then using kubectl you can check why the pod is crashing, read the logs, etc.
I've run your PR locally, there are two main issues that need fixing: reading the config value and defining/registering metrics. I suggested how to approach them in comments. Let me know if something is unclear.
Thanks a lot for working on that! And sorry for the delayed review.
Working on testing the metric changes end-to-end --
prometheus-service.yaml
prometheus-deployment.yaml
prometheus-config.yaml
|
f2bcebb
to
164a14a
Compare
I'm not seeing the labels attached to metrics like I'd expect to, |
164a14a
to
861bb2b
Compare
3afbd0f
to
7bf5a69
Compare
@nap32 I see there are conflicts with the main branch, can you rebase your PR? It looks good to me now. I tested your changes locally and I saw the labels I expected. For a simple local test you can also access the metrics endpoint directly, without running Prometheus. Port-forward the service:
and then you can read the metrics in the text format at |
7bf5a69
to
892b25d
Compare
Yes!
Wow, this is a simple and straightforward way to test -- thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a single lint, but other than that looks good!
892b25d
to
b609b5e
Compare
Currently, metrics are all-or-nothing. Certain labels may cause cardinality issues. This patch introduces a new configuration option - MetricsLabelFilter. It is an allow-list for configuring namespace, workload, pod, and binary. Labels that utilize these fields will only add them if configured for it. Fixes: cilium#1037 Signed-off-by: Nick Peluso <10912027+nap32@users.noreply.github.com>
b609b5e
to
f140e83
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for contributing this! 🙇♀️
Currently, metrics are all-or-nothing.
Certain labels may cause cardinality issues.
This patch introduces a new configuration option - MetricsLabelFilter. It is an allow-list for configuring namespace, workload, pod, and binary. Labels that utilize these fields will only add them if configured for it.
Fixes: #1037