RFD 197 - Prometheus metrics guidelines #51139

Open
wants to merge 3 commits into
base: master

hugoShaka
Contributor

Following the addition of the internal metric registry in Teleport, here's a mini RFD about how to add metrics in Teleport.

Rendered version.

github-actions bot requested review from atburke and tigrato on January 16, 2025 21:37
github-actions bot added the rfd (Request for Discussion) and size/md labels on Jan 16, 2025

#### Do

- Use `teleport.MetricsNamespace` as the namespace.
Contributor Author

Do we want this to be true for every executable? Should tbot metrics be prefixed by tbot?
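
To make the options concrete, here is a minimal stdlib-only sketch of how a namespace ends up in the exported metric name. It mirrors the joining rule of `prometheus.BuildFQName` in client_golang (non-empty parts joined with underscores). The `namespaceFor` helper and the `heartbeats_total` metric are hypothetical, and the value `"teleport"` for `teleport.MetricsNamespace` is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// metricsNamespace stands in for teleport.MetricsNamespace.
// The value "teleport" is an assumption made for this sketch.
const metricsNamespace = "teleport"

// fqName joins the non-empty parts with underscores, mirroring the
// joining rule of prometheus.BuildFQName. An empty name yields "".
func fqName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

// namespaceFor is a hypothetical policy switch: either every executable
// shares the common namespace, or each binary uses its own name.
func namespaceFor(binary string, perBinary bool) string {
	if perBinary && binary != "" {
		return binary
	}
	return metricsNamespace
}

func main() {
	// Option A: shared namespace for every executable.
	fmt.Println(fqName(namespaceFor("tbot", false), "", "heartbeats_total"))
	// Option B: the binary name is the namespace.
	fmt.Println(fqName(namespaceFor("tbot", true), "", "heartbeats_total"))
	// Option C: shared namespace, binary name as the subsystem.
	fmt.Println(fqName(metricsNamespace, "tbot", "heartbeats_total"))
}
```

Under these assumptions the three options yield `teleport_heartbeats_total`, `tbot_heartbeats_total`, and `teleport_tbot_heartbeats_total`. Option C keeps one common prefix while still making the executable visible in the name.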

Contributor

rosstimothy left a comment

Are there any gotchas in converting existing metrics from the global registry to a local registry?

#### Do

- Use `teleport.MetricsNamespace` as the namespace.
Contributor

Do we have any strategies for migrating legacy metrics which do not have a namespace? Should we register the same metric with and without the namespace?

Contributor Author

hugoShaka, Jan 16, 2025

I did not suggest a migration strategy because this would be a pretty disruptive change. I'm not keen on double-registering metrics because it increases cardinality.

I guess we could make the breaking metric change in a major version.

Contributor

Would it be terrible to double-register the same metric in major version N and announce that the non-namespaced variants will be removed in version N+2? That would give people ~8 months of notice to switch to the namespaced metrics.

Contributor Author

I think that's acceptable. Maybe we can gate the behaviour behind a TELEPORT_UNSTABLE_ env var or something, so that anyone who has an issue with this can disable the duplication and use only the new or only the old metrics.
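
A rough stdlib-only sketch of what such a gate could look like. Everything here is an assumption for illustration: the env var name `EXAMPLE_METRICS_COMPAT`, its `both`/`new`/`old` values, the `namesToRegister` helper, the `example_requests_total` metric, and the `"teleport"` namespace value. None of it is an existing Teleport API or setting.

```go
package main

import (
	"fmt"
	"os"
)

// metricsNamespace stands in for teleport.MetricsNamespace.
// The value "teleport" is an assumption made for this sketch.
const metricsNamespace = "teleport"

// compatMode is a hypothetical setting controlling which names are exposed.
type compatMode string

const (
	modeBoth compatMode = "both" // legacy and namespaced (migration default)
	modeNew  compatMode = "new"  // namespaced only
	modeOld  compatMode = "old"  // legacy only
)

// parseMode maps a raw env value to a mode. Empty or unknown values fall
// back to double registration so the migration is on by default.
func parseMode(raw string) compatMode {
	switch compatMode(raw) {
	case modeNew, modeOld:
		return compatMode(raw)
	default:
		return modeBoth
	}
}

// namesToRegister returns the names under which one legacy,
// non-namespaced metric should be exposed for the given mode.
func namesToRegister(legacyName string, mode compatMode) []string {
	namespaced := metricsNamespace + "_" + legacyName
	switch mode {
	case modeNew:
		return []string{namespaced}
	case modeOld:
		return []string{legacyName}
	default:
		return []string{legacyName, namespaced}
	}
}

func main() {
	// EXAMPLE_METRICS_COMPAT is a placeholder name, not a real Teleport variable.
	mode := parseMode(os.Getenv("EXAMPLE_METRICS_COMPAT"))
	fmt.Println(namesToRegister("example_requests_total", mode))
}
```

In the default `both` mode every migrated metric is exposed under two names, so its series count doubles for the duration of the deprecation window. The `new` and `old` modes are the escape hatch for anyone who cannot afford that extra cardinality.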

hugoShaka force-pushed the rfd/0197-prometheus-metrics branch from 52d7674 to 164e16e on January 16, 2025 22:34