-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
statsdreceiver: fix start timestamp / temporality for counters #5714
statsdreceiver: fix start timestamp / temporality for counters #5714
Conversation
Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>
Hi, the lines
was introduced by me when Prometheus Exporter is dropping all of these MetricAggregationTemporalityCumulative metrics. Wondering if you guys had the same issue with statsdreceiver -> prometheusexporter or just me? And if it was a real issue, the proposal would be fixing on the exporter side or the receiver one? |
As for this case, you would always treat the metric's AggregationTemporality spec as MetricAggregationTemporalityDelta at the end of the line and it will always go down to the line from prometheusexporter that drop it: opentelemetry-collector-contrib/exporter/prometheusexporter/accumulator.go Lines 168 to 171 in 772f39c
I'm assuming this is correct, just that the specifications for both side are different? |
@locmai Thanks for asking about this. The behavior was indeed going against the specifications for statsd and Prometheus (though if we're nit-picking, statsd isn't really specified). Statsd represents one of the major reasons for requiring Delta temporality in OTLP, but Prometheus accepts only Cumulative temporality. I apologize that something appears to be missing here. There should be some sort of processor that will take delta temporality and convert it to cumulative temporality. (I am a bit surprised. This ought to exist somewhere, and I thought it did. Maybe it's been factored into another metrics processor? Looking for (In the Prometheus ecosystem, the "statsd_exporter" equals a combination of statsd receiver, delta-to-cumulative processor, and a Prometheus exporter.) If we study the statsdreceiver code, might suggest another approach. If it would simply not reset its state on the collection interval, you'd have cumulative behavior. This looks like a simple change we could make, are you interested? |
@locmai As I read through the change again, now the what the old code was doing makes sense. (I had been confused!) The suggestion above should help. We can add a new configuration to the receiver called "isCumulative", in which case we will never reset the map. This option is orthogonal to the existing "isMonotonic", which dictates how to translate the statsd data. If we set isMonotonic and isCumulative, you'll get the Prometheus data you want, but new problems will emerge. The handling for Histogram/Timing points will create problems at least until statsd receiver is updated to use an exponential histogram, after which point we'll have a minor issue to resolve related to histogram entropy, see the Prometheus handling of histogram resets in their corresponding histogram structure, e.g. open-telemetry/opentelemetry-specification#2252 (comment) |
Hi! Thanks for the great explanation and suggestions. Yeah I thought we had this before
|
Re: (2) if you only change the behavior for counters, then summary will still behave oddly but you're right it will be less problematic. I'm trying to avoid building a list of every observation from the start of time. It looks like a third option is to change the way the prometheus exporter accumulates deltas. Currently it drops them, it could sum them instead. |
Description:
Fixes #5667. The statsdreceiver was not setting the start timestamp for counter metrics, and was conditionally setting the temporality as cumulative for some counters. This PR fixes both issues. The metric parser now tracks the last interval time and uses that as the start timestamp for counters. The translation code was also updated to only produce counters with delta temporality.
Link to tracking Issue: #5667
Testing: The
metric_translator_test
andstatsd_parser_test
were updated accordingly.Documentation: