Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(internal-metrics): Add gauge support #29091

Merged
merged 3 commits into from
Oct 8, 2021
Merged

feat(internal-metrics): Add gauge support #29091

merged 3 commits into from
Oct 8, 2021

Conversation

relaxolotl
Copy link
Contributor

@relaxolotl relaxolotl commented Oct 5, 2021

There's a metric that the native team would like to record as a gauge instead of a counter, as it is meant to track a snapshot of a value at some given period of time instead of the total number of occurrences of some event. For those who are familiar with the symbolicator reliability project, I'd like to track the number of projects assigned to the currently-disabled LPQ to get an idea of its behaviour before rolling it out to production.

This adds in baseline support for gauges, mostly based on what's already there.

Copy link
Contributor

@tonyo tonyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious: did you consider using timing for capturing the values that you want? The name is not the best, but it's now used across the entire codebase to capture any kind of "snapshot" data (not only times).

All that said, I don't really mind adding an explicit gauge function.

current_tags.update(tags)

try:
backend.gauge(key, instance, current_tags, value, sample_rate)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order of arguments looks a bit off here

current_tags.update(tags)

try:
backend.gauge(key, instance, current_tags, value, sample_rate)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
backend.gauge(key, instance, current_tags, value, sample_rate)
backend.gauge(key, value, instance, current_tags, sample_rate)

@@ -9,3 +9,6 @@ def incr(self, key, instance=None, tags=None, amount=1, rate=1):

def timing(self, key, value, instance=None, tags=None, rate=1):
pass

def gauge(self, key, value, instance=None, tags=None, sample_rate=1):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, probably wouldn't hurt to make the last argument "rate", or change all the others to "sample_rate".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for catching this, i'll change the others to sample_rate since all of the other backends as well as the base implementation use sample_rate.

@relaxolotl
Copy link
Contributor Author

Curious: did you consider using timing for capturing the values that you want? The name is not the best, but it's now used across the entire codebase to capture any kind of "snapshot" data (not only times).

I actually didn't consider this, thanks for bringing this up! Knowing that, even though in timings are currently being used to capture gauge-like data I'd like to commit to using gauges for the following reasons:

  • Using timings gives us access to additional metrics such as max, min, avg, etc. which unfortunately aren't too useful for the metric I'm thinking of capturing. Such values are useful if we're looking at them over a longer period of time across multiple measurements.
  • Having access to a gauge also conveys intent better, and reduces confusion about the metric.

Symbolicator and Relay also already use gauges, which means that there's already some precedence in Sentry for this metric type. This isn't introducing a new metric type, which gives me some confidence in using these.

This does raise the question of what we'd like to do with existing metrics that currently use timings as makeshift gauges: Should we move those over once this is in? Do we just do one big clean sweep over all metrics that fall under this category, and switch them all over? Do we leave them be, and migrate them "as needed" based on some reasonable definition of "as needed"?

I'm thinking we leave the existing ones as is since swapping them over may break some graphs or cause strange issues. Thoughts? @tonyo

@tonyo
Copy link
Contributor

tonyo commented Oct 8, 2021

Using timings gives us access to additional metrics such as max, min, avg, etc. which unfortunately aren't too useful for the metric I'm thinking of capturing

FWIW, eventually the collected data will still be somehow aggregated when being displayed (e.g. in Datadog) due to the limited number of data points you can place on a graph. So in any case you'll have to pick what is the "final" aggregration/rollup is applied to your data (which is avg by default, but can also be min and max). So generally, I'm pretty sure "timing" would be enough for this use case, but as mentioned earlier, it might be non-intuitive/confusing, so let's add "gauge" and see how it goes.

I'm thinking we leave the existing ones as is since swapping them over may break some graphs or cause strange issues

Agree, it's not worth it.

@relaxolotl relaxolotl merged commit e94fc7a into master Oct 8, 2021
@relaxolotl relaxolotl deleted the feat/add-gauge branch October 8, 2021 14:15
@github-actions github-actions bot locked and limited conversation to collaborators Oct 24, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants