-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarification on MetricReader registered on MeterProvider #2072
Comments
I share the same feeling here. The specification seems to suggest that the meter provider is the central place to manage meters and readers. But readers seem to be a wrapper around meter providers and should be defined in a higher level. And more specifically, I’d suggest define the shut_down and force_flush on readers independently. |
Disucssed during the 11/2/2021 Spec SIG Meeting:
Currently the spec doesn't specify if the same MetricReader instance can be shared across multiple MeterProvider instances. In practice, different languages use different approach:
The control flow is bi-directional, and it is intentional - to allow both Push and Pull mode. |
I'm starting to understand this problem. I see the problem not about "ownership", but the apparent bi-directionality of control flow and something missing from the text to explain it. Comment by @jack-berg that partly gets at the issue #2304 (comment). A "decision to collect" comes from one of three places:
A "decision to read" means for an Exporter to read through the MetricReader state. This always follows one of the three reasons to begin collection.
To summarize, the SDK or MeterProvider owns a write path while the Metric exporter owns a read path. I do not believe there's a cycle here, but potentially there's some way to improve the text. |
This connects with #2232, since the underlying request there is to have a way to allow whole Meter instances and the associated metric streams to come and go (i.e., be forgotten) at runtime. Due to current uncertainty in the spec, it might seem this can be accomplished if a Reader can be registered to multiple providers. I will argue the other way, because this can also be accomplished if an Exporter can be registered to multiple providers through multiple Reader instances. First questionThe difference in these two ways of framing the problem, implicitly, is whether "some kind of aggregation happens?" when the same streams are produced by more than one MeterProvider. If this is a choice--whether we support "one-to-many" for Exporters or for Readers--then supporting one-to-many for Exporters answers "No" to this question and supporting one-to-many for Readers answers "Yes" to this question about aggregation across providers. I personally want to answer No to this question, I would be glad to see one Exporter to many providers. The user and/or the provider configuration is responsible for avoiding single-writer rule violations in this setup. In other words, multiple providers to one exporter is fine with me, but the user is required to ensure that streams do come into conflict. Second questionThis leads to an orthogonal question, about whether OTel implementations SHOULD / SHOULD NOT support the use of multiple Resource values in a single exporter configuration. Support for multiple Resources allows easily avoiding single-writer violations, but seems to encourage a single process to break norms about Resource usage. If this is a question, I would specify that OTel implementations SHOULD NOT support the use of multiple Resource values in SDKs for a single Exporter, even where multiple MeterProviders are in use. This leads to a requirement that the the user is responsible for avoiding single-writer violations by the use of distinct attribute sets, i.e., ensure they do not create the same stream in multiple MeterProviders bound to a single Exporter, or else there will be duplicate streams (b/c no aggregation). Rationale in support of one Exporter to many MeterProvidersHere is my reasoning for the "No" answer to the original question about whether to aggregate instead of allowing single-writer violations to pass through. We need a way to support "forgetting" of metric streams in configurations like the one posed in #2232, because this is a fairly common pattern (which Prometheus also has). I answer "No" to the question about aggregation, because then it gives us a solution for stream deletion, as follows. In typical pull-based cumulative configuration, a Metric Reader is expected to remember streams forever. If shutting down a MeterProvider means shutting down all of the associated Readers, then we get stream deletion support "for free". If a MeterReader is expected to aggregate streams from multiple providers, then shutting down a MeterProvider does not really imply forgetting any streams. Thus, I answer "No" to the question about aggregation in one-to-many provider configurations, because aggregation across providers conflicts with the desire to support forgetting streams. Proposed solutionI think we could resolve this issue as follows:
Neither the Reader, the Exporter, nor the MeterProvider in such a one-to-many configuration is responsible for detecting single-writer violations or duplicate instrument conflicts. I might add:
|
Discussed during the 3/1, 2022 Spec SIG Meeting, we consider this as a documentation issue which requires clarification regarding the MetricReader ownership. The conclusion is to "specify that MetricReaders MUST have a one-to-one relationship with MeterProviders", which would resolve this issue. And once this is resolved, it would unblock the SDK spec to move to Mixed (with the only exception that Exemplar is still Feature-Freeze). These two items are related to @2232 and will be marked as "Future":
|
This sounds good to me, thanks! |
This comes back to a similar problem for exporters: MeterProvider.shutdown MUST be implemented at least by invoking Shutdown on all registered MetricReader and MetricExporter. It could be confusing if an exporter is registered to multiple MeterProvider and one of the MeterProvider shuts down while others are not. |
I see. I looked into the trace SDK specification. The TracerProvider Shutdown() is specified to call Shutdown() on each Processor.
|
It looks like the trace SDK specification is missing a statement to require the Span Processor.Shutdown() method to shut down the exporter. At least the OTel-Go trace SDK processor will shut down the exporter in this case, so it would be a breaking change not to add that requirement. I think this means we should add to the trace SDK spec:
I think this means we should add to the metrics SDK spec:
For the question posed in our SIG meeting, "How can a single Prometheus exporter export for multiple MeterProviders?" we still do not have a great answer, but here's one that's hopefully acceptable. Users in this situation will create a many-to-one wrapper around the Metric Exporter that ignores Shutdown(). When scraped, it will Collect() every sub-exporter's MetricReader. The user in this situation will arrange to call Shutdown() once on the actual exporter. @bogdandrutu, who posed the Prometheus question, do agree with the above? |
it would also be the responsibility of the wrapper to ensure metrics do not collide (by prefixing metric names for example, or adding a provider id as a label)? |
What are you trying to achieve?
Clarify the object ownerships between metric reader and meter provider.
In the example (https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#view), it is said the metric reader is constructed in place and then it is registered to the meter provider:
Also, it is stated that metric reader has a
Collect
operation that triggers asynchronous instrument callbacks:As far as I could tell, all the instruments are owned by the meter provider, and thus the
collect
should consult the meter provider to perform async instrument callbacks.In this sense, it seems that the reader can be registered to multiple meter provider and the
metric_reader.collect
should perform collection on all registered meter provider:The problem here is that the SDK API interface suggested that we are registering the reader to the meter provider, however, the
metric_reader.collect
reverse the control ownership and the meter provider shall be registered on the metric reader and thus it is possible to callmetric_reader.collect
to perform a collection on all registered async instrument callbacks.This sounds like a doubly linked ownership:
What did you expect to see?
The ownership should not be bidirectional.
The text was updated successfully, but these errors were encountered: