feat(internal-metrics): Add gauge support #29091
Conversation
Force-pushed ac11b88 to 8d8a465
Curious: did you consider using `timing` for capturing the values that you want? The name is not the best, but it's now used across the entire codebase to capture any kind of "snapshot" data (not only times).
All that said, I don't really mind adding an explicit gauge function.
src/sentry/utils/metrics.py (Outdated)

        current_tags.update(tags)

        try:
            backend.gauge(key, instance, current_tags, value, sample_rate)
The order of arguments looks a bit off here
src/sentry/utils/metrics.py (Outdated), suggested change:

    -    backend.gauge(key, instance, current_tags, value, sample_rate)
    +    backend.gauge(key, value, instance, current_tags, sample_rate)
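The suggested order can be exercised with a small self-contained sketch. The module-level `gauge()` wrapper body and the `DummyBackend` class below are assumptions for illustration, not code from the PR; only the argument order `(key, value, instance, tags, sample_rate)` comes from the suggestion above.

```python
# Hypothetical stand-in for a real metrics backend, recording calls so the
# forwarded argument order can be inspected.
class DummyBackend:
    def __init__(self):
        self.calls = []

    def gauge(self, key, value, instance=None, tags=None, sample_rate=1):
        self.calls.append((key, value, instance, tags, sample_rate))


# Sketch of a module-level gauge() wrapper (the body is assumed, mirroring
# the quoted context: merge tags, then forward to the backend).
def gauge(key, value, instance=None, tags=None, sample_rate=1, backend=None):
    current_tags = {}
    if tags:
        current_tags.update(tags)
    try:
        # value comes before instance, matching the suggested fix above
        backend.gauge(key, value, instance, current_tags, sample_rate)
    except Exception:
        pass  # metrics should never raise into the caller


backend = DummyBackend()
gauge("queue.depth", 42, tags={"env": "dev"}, backend=backend)
print(backend.calls[0])  # → ('queue.depth', 42, None, {'env': 'dev'}, 1)
```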
    @@ -9,3 +9,6 @@ def incr(self, key, instance=None, tags=None, amount=1, rate=1):

         def timing(self, key, value, instance=None, tags=None, rate=1):
             pass

    +    def gauge(self, key, value, instance=None, tags=None, sample_rate=1):
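A minimal runnable sketch of the base backend after this hunk. The class name `MetricsBackend` and the no-op `gauge` body are assumptions; only the method signatures come from the diff above.

```python
# Sketch of the base metrics backend with the new gauge() method added.
# Class name and gauge body are assumed; signatures mirror the diff.
class MetricsBackend:
    def incr(self, key, instance=None, tags=None, amount=1, rate=1):
        pass

    def timing(self, key, value, instance=None, tags=None, rate=1):
        pass

    def gauge(self, key, value, instance=None, tags=None, sample_rate=1):
        # No-op in the base class; concrete backends (statsd-style, Datadog)
        # would override this to actually emit the gauge value.
        pass
```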
For consistency, it probably wouldn't hurt to make the last argument `rate`, or to change all the others to `sample_rate`.
Thanks for catching this, I'll change the others to `sample_rate`, since all of the other backends as well as the base implementation use `sample_rate`.
I actually didn't consider this, thanks for bringing it up! Even though `timing` is currently being used to capture gauge-like data, I'd like to commit to using gauges for the following reasons: Symbolicator and Relay already use gauges, which means there's some precedent in Sentry for this metric type. This isn't introducing a new metric type, which gives me some confidence in using it.

This does raise the question of what we'd like to do with existing metrics that currently use timings as makeshift gauges:
- Should we move those over once this is in?
- Do we do one big clean sweep over all metrics that fall under this category and switch them all over?
- Do we leave them be and migrate them "as needed", based on some reasonable definition of "as needed"?

I'm thinking we leave the existing ones as-is, since swapping them over may break some graphs or cause strange issues. Thoughts? @tonyo
FWIW, the collected data will eventually be aggregated somehow when displayed (e.g. in Datadog), due to the limited number of data points you can place on a graph. So in any case you'll have to pick which "final" aggregation/rollup is applied to your data (avg by default, but it can also be min or max). So generally, I'm pretty sure `timing` would be enough for this use case, but as mentioned earlier it might be non-intuitive/confusing, so let's add `gauge` and see how it goes.
Agree, it's not worth it.
There's a metric that the native team would like to record as a gauge instead of a counter, as it is meant to track a snapshot of a value at some given period of time instead of the total number of occurrences of some event. For those who are familiar with the symbolicator reliability project, I'd like to track the number of projects assigned to the currently-disabled LPQ to get an idea of its behaviour before rolling it out to production.
This adds baseline support for gauges, mostly based on what's already there.
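The counter-vs-gauge distinction in the description can be sketched side by side. The backend and metric names below are invented for illustration, not taken from the PR: a counter accumulates event occurrences, while a gauge overwrites with the latest snapshot of a value.

```python
# Hypothetical recording backend contrasting counter and gauge semantics.
class RecordingBackend:
    def __init__(self):
        self.counter_total = 0
        self.gauge_value = None

    def incr(self, key, amount=1):
        # Counters accumulate: total number of occurrences of an event.
        self.counter_total += amount

    def gauge(self, key, value, instance=None, tags=None, sample_rate=1):
        # Gauges overwrite: a snapshot of a value at a point in time.
        self.gauge_value = value


backend = RecordingBackend()
# Pretend we take a snapshot of the LPQ project count on each tick.
for lpq_size in (3, 7, 5):
    backend.incr("lpq.snapshots")          # invented metric name
    backend.gauge("lpq.projects", lpq_size)  # invented metric name

print(backend.counter_total)  # → 3 (how many snapshots were taken)
print(backend.gauge_value)    # → 5 (the latest snapshot value)
```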