Proposal for flattening attributes from OTLP messages #2736
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,177 @@ | ||
# Attribute precedence on transformation to non-OTLP formats | ||
|
||
**Status**: [Experimental](../document-status.md) | ||
|
||
<details> | ||
<summary>Table of Contents</summary> | ||
|
||
<!-- toc --> | ||
|
||
- [Overview](#overview) | ||
- [Attribute hierarchy in OTLP messages](#attribute-hierarchy-in-otlp-messages) | ||
- [Precedence per Signal](#precedence-per-signal) | ||
* [Traces](#traces) | ||
* [Metrics](#metrics) | ||
* [Logs](#logs) | ||
* [Span Links, Span Events and Metric Exemplars](#span-links-span-events-and-metric-exemplars) | ||
- [Considerations](#considerations) | ||
- [Example](#example) | ||
- [Useful links](#useful-links) | ||
|
||
<!-- tocstop --> | ||
|
||
</details> | ||
|
||
## Overview | ||
|
||
This document provides supplementary guidelines for the attribute precedence | ||
that exporters should follow when translating from the hierarchical OTLP format | ||
to non-hierarchical formats. | ||
|
||
A mapping is required when flattening out attributes from the structured OTLP | ||
format, which has attributes at different levels (e.g., Resource attributes, | ||
InstrumentationScope attributes, attributes on Spans/Metrics/Logs), to a | ||
non-hierarchical representation (e.g., Prometheus/OpenMetrics labels). | ||
In the case of OpenMetrics, the label set is flat and label names must be | ||
unique | ||
(<https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#labelset>). | ||
Since OpenTelemetry allows attributes on different levels, the same attribute | ||
key can appear multiple times on different levels. | ||
|
||
This document provides guidance on how OpenTelemetry attributes can be | ||
consistently mapped to flat sets. | ||
|
||
## Attribute hierarchy in OTLP messages | ||
|
||
Since the OTLP format is a hierarchical format, there is an inherent order in | ||
the attributes. | ||
In this document, | ||
[Resource](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/sdk.md) | ||
attributes are considered to be at the top of the hierarchy, since they apply to | ||
all collected telemetry. | ||
Attributes on individual Spans/Metric data points/Logs are at the bottom of the | ||
hierarchy, as they are the most specialized and only apply to a subset of all data. | ||
|
||
**A more specialized attribute that shares an attribute key with a more general | ||
attribute will take precedence and overwrite the more general attribute.** | ||
|
||
When de-normalizing an OTLP message into a flat set of key-value pairs, | ||
attributes that are present on the Resource and InstrumentationScope levels will | ||
be attached to each Span/Metric data point/Log. | ||
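
A minimal sketch of this de-normalization in Python, assuming the attributes of each level are already available as plain dictionaries. The function name `flatten_attributes` and its parameters are hypothetical and not part of any OpenTelemetry API.

```python
from typing import Any, Dict, Mapping


def flatten_attributes(
    resource: Mapping[str, Any],
    scope: Mapping[str, Any],
    item: Mapping[str, Any],
) -> Dict[str, Any]:
    """Merge three attribute levels into one flat dict.

    Later updates win, so the most specialized level (the span,
    metric data point or log record) is applied last.
    """
    flat: Dict[str, Any] = {}
    flat.update(resource)  # most general level
    flat.update(scope)     # overwrites clashing resource keys
    flat.update(item)      # overwrites clashing scope and resource keys
    return flat
```

Applying the levels from most general to most specialized means no explicit clash detection is needed: the last write for a key is the one that survives.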
|
||
## Precedence per Signal | ||
|
||
Below, the precedence for each of the signals is spelled out explicitly. | ||
Only spans, metric data points and log records are considered. | ||
|
||
`A > B` denotes that the attribute on `A` will overwrite the attribute on `B` | ||
if the keys clash. | ||
|
||
### Traces | ||
|
||
``` | ||
Span.attributes > ScopeSpans.scope.attributes > ResourceSpans.resource.attributes | ||
``` | ||

Review comments on the line above:

> What is a ScopeSpan?

> ScopeSpans is the wrapper combining multiple spans and a Scope: https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto#L64

> I see. If I were implementing a flattening scheme, I would keep ALL attributes, e.g. by adding a distinguishable prefix for each category, instead of letting them override each other.

> @yurishkuro yes. Another way of putting this: Why flatten in the first place? The fact that OpenMetrics specifies a "flat" set of attributes does not mean we should flatten resource and scope-level attributes. OpenMetrics specifies how to join resource attributes using the

> No, I don't have an issue with flattening, it may be necessary due to the limitations of a target telemetry platform. But it does not mean that flattening should be a lossy transformation, which is what you're proposing.

> The big argument against adding a prefix is that it changes the attribute key. Users cannot query for the attribute they added, since the name changed. This impacts semantic conventions as well: Should all semantic conventions be prefixed? I feel like we would need to do that to stay consistent.
>
> That is true, but this is the idea behind this proposal: If you want to, you can overwrite attributes. If you don't want to overwrite attributes, you can rename them (e.g., add a prefix explicitly). In either case, you will get the attributes that you defined, and don't have to go looking for the renamed version.

> I haven't thought of doing renaming instead of overwriting, but it is worth considering as an option. A couple thoughts that may help:
>
> - If we could have real examples where attribute names can conflict it would help to make a decision. The
> - If opinions are split on this it is also possible to make this behavior configurable (i.e. to overwrite or to prefix) but I would try to avoid this complication if possible.

> This is a good point. I buy the logic that conflicts are unlikely to be common, and thus that this proposal is going to work reasonably well in most situations. If a conflict does arise, the user has an escape hatch with the ability to wrap the exporter with logic that adds a prefix to the key in conflict.
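
For comparison, the lossless alternative raised in the review discussion (prefixing keys per level instead of overwriting) could look roughly like the sketch below. The prefixes `resource.`, `scope.` and `span.` and the function name are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, Mapping


def flatten_with_prefixes(
    resource: Mapping[str, Any],
    scope: Mapping[str, Any],
    span: Mapping[str, Any],
) -> Dict[str, Any]:
    """Keep every attribute by prefixing its key with the level it came from."""
    flat: Dict[str, Any] = {}
    levels = (("resource.", resource), ("scope.", scope), ("span.", span))
    for prefix, attributes in levels:
        for key, value in attributes.items():
            # The key changes here, so the name the user originally set
            # is no longer the name that can be queried in the backend.
            flat[prefix + key] = value
    return flat
```

Nothing is lost with this scheme, but every key is renamed, which is the drawback pointed out in the discussion.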
|
||
### Metrics | ||
|
||
Metrics are different from Spans and LogRecords, as each Metric has a data field | ||
which can contain one or more data points. | ||
Each data point has its own set of attributes, which needs to be considered | ||
independently of the other data points. | ||
|
||
``` | ||
Metric.data.data_points.attributes > ScopeMetrics.scope.attributes > ResourceMetrics.resource.attributes | ||
``` | ||
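
As a rough sketch of what "independently" means here, the following Python function produces one flat attribute set per data point. It assumes the attributes of each level are plain dictionaries; `flatten_data_points` is a hypothetical name.

```python
from typing import Any, Dict, Iterable, List, Mapping


def flatten_data_points(
    resource: Mapping[str, Any],
    scope: Mapping[str, Any],
    data_points: Iterable[Mapping[str, Any]],
) -> List[Dict[str, Any]]:
    """Return one flat attribute dict per data point.

    Each data point is merged on its own, so an attribute on one data
    point never affects the flat set of another data point.
    """
    return [{**resource, **scope, **point} for point in data_points]
```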
|
||
### Logs | ||
|
||
``` | ||
LogRecord.attributes > ScopeLogs.scope.attributes > ResourceLogs.resource.attributes | ||
``` | ||
|
||
### Span Links, Span Events and Metric Exemplars | ||
|
||
> Span Links, Span Events and Metric Exemplars need to be considered | ||
> differently, as conflicting entries there can lead to problematic data loss. | ||
|
||
Consider an `http.host` attribute on a Span Link, which identifies the host of a | ||
linked Span. | ||
Applying the "more specialized overwrites more general" rule here would | ||
overwrite the `http.host` attribute of the Span itself, which is likely | ||
information that should be kept. | ||
Transferring attributes on Span Links, Span Events and Metric Exemplars should | ||
therefore be done separately from the parent Span/Metric data point. | ||
How to do this is outside the scope of these guidelines. | ||
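
One possible way to keep these attribute sets apart is sketched below in Python. This is only an illustration under the assumption that attributes are plain dictionaries; the function `flatten_span` is hypothetical, and the guidelines do not prescribe any particular approach.

```python
from typing import Any, Dict, List, Mapping, Tuple


def flatten_span(
    resource: Mapping[str, Any],
    scope: Mapping[str, Any],
    span: Mapping[str, Any],
    links: List[Mapping[str, Any]],
) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Flatten a span while keeping link attributes in their own sets."""
    # The span's flat set follows the usual precedence.
    span_flat = {**resource, **scope, **span}
    # Link attributes are copied into separate dicts and never merged
    # into the span's set, so a clashing key cannot overwrite it.
    link_sets = [dict(link) for link in links]
    return span_flat, link_sets
```

With this separation, an `http.host` on a link and an `http.host` on the span both survive the transformation.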
|
||
## Considerations | ||
|
||
Note that this precedence is a strong suggestion, not a requirement. | ||
Code that transforms attributes should follow this mode of flattening, but may | ||
diverge if it has a reason to do so. | ||
|
||
## Example | ||
|
||
The following is a theoretical YAML-like representation of an OTLP message in | ||
which attribute names clash on multiple levels. | ||
|
||
```yaml | ||
ResourceMetrics: | ||
  resource: | ||
    attributes: | ||
      # key-value pairs (attributes) on the resource | ||
      attribute1: resource-attribute-1 | ||
      attribute2: resource-attribute-2 | ||
      attribute3: resource-attribute-3 | ||
      service.name: my-service | ||
|
||
  scope_metrics: | ||
    scope: | ||
      attributes: | ||
        attribute1: scope-attribute-1 | ||
        attribute2: scope-attribute-2 | ||
        attribute4: scope-attribute-4 | ||
|
||
    metrics: | ||
      # there can be multiple data entries here. | ||
      data/0: | ||
        data_points: | ||
          # each data entry can have multiple data points: | ||
          data_point/1: | ||
            attributes: | ||
              # will overwrite attribute1 on scope and resource | ||
              attribute1: data-point-1-attribute-1 | ||
|
||
          data_point/2: | ||
            attributes: | ||
              # will overwrite attribute1 on scope and resource | ||
              attribute1: data-point-2-attribute-1 | ||
              # will overwrite attribute4 on scope | ||
              attribute4: data-point-2-attribute-4 | ||
``` | ||
|
||
The structure above contains two data points, thus there will be two data points | ||
in the output. | ||
Their attributes will be: | ||
|
||
```yaml | ||
# data point 1 | ||
service.name: my-service # from the resource | ||
attribute1: data-point-1-attribute-1 # overwrites attribute1 on resource & scope | ||
attribute2: scope-attribute-2 # overwrites attribute2 on resource | ||
attribute3: resource-attribute-3 # from the resource, not overwritten | ||
attribute4: scope-attribute-4 # from the scope, not overwritten | ||
|
||
# data point 2 | ||
service.name: my-service # from the resource | ||
attribute1: data-point-2-attribute-1 # overwrites attribute1 on resource & scope | ||
attribute2: scope-attribute-2 # overwrites attribute2 on resource | ||
attribute3: resource-attribute-3 # from the resource, not overwritten | ||
attribute4: data-point-2-attribute-4 # overwrites attribute4 from the scope | ||
``` | ||
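
The example can be reproduced with a short Python sketch. The nested dictionary mirrors the YAML-like message above; the function name `flatten_message` is hypothetical, and a real OTLP message would be a protobuf object rather than a dictionary.

```python
from typing import Any, Dict, List

# The example message from above, written as nested dictionaries.
message: Dict[str, Any] = {
    "resource": {"attributes": {
        "attribute1": "resource-attribute-1",
        "attribute2": "resource-attribute-2",
        "attribute3": "resource-attribute-3",
        "service.name": "my-service",
    }},
    "scope_metrics": {
        "scope": {"attributes": {
            "attribute1": "scope-attribute-1",
            "attribute2": "scope-attribute-2",
            "attribute4": "scope-attribute-4",
        }},
        "metrics": {"data/0": {"data_points": {
            "data_point/1": {"attributes": {
                "attribute1": "data-point-1-attribute-1",
            }},
            "data_point/2": {"attributes": {
                "attribute1": "data-point-2-attribute-1",
                "attribute4": "data-point-2-attribute-4",
            }},
        }}},
    },
}


def flatten_message(msg: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return one flat attribute dict per data point in the message."""
    resource_attrs = msg["resource"]["attributes"]
    scope_metrics = msg["scope_metrics"]
    scope_attrs = scope_metrics["scope"]["attributes"]
    flattened = []
    for metric in scope_metrics["metrics"].values():
        for point in metric["data_points"].values():
            # Merge from most general to most specialized; later keys win.
            flattened.append({**resource_attrs, **scope_attrs, **point["attributes"]})
    return flattened


flattened = flatten_message(message)
```

`flattened` then holds two dictionaries, one per data point, with the same key-value pairs as listed above.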
Review comments on the example above:

> Asking just for my understanding. Can you list a real-life example of an attribute that could be present both at the resource and at the level of a span/metric/log_record?

> One thing that comes to mind is a

> Thanks, I used to think we would never have the same attribute in both the resource and the signals since they serve a different purpose, but it looks like there can be a few valid cases. I want to question and understand your example though. The team responsible for running my service pod could be devops (specified at resource level), but the team responsible for the application is an engineering/dev team (specified in the span). Don't you want to capture both?

> I agree that they should never overlap, but since it's possible to put whatever you want on the attributes it might happen. For the example, I thought about it more like the team that developed the code. The team running the code is separate in my opinion. I was thinking of an attribute value of

> @scheler does this answer your question?
||
|
||
## Useful links | ||
|
||
* [Trace Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto) | ||
* [Metrics Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto) | ||
* [Logs Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/logs/v1/logs.proto) | ||
* [Resource Proto](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/resource/v1/resource.proto) |
Review comment:

> As @jmacd points out here, there is ongoing work to solve this in a different way specifically for Prometheus/OpenMetrics in #2703. It would be better to use Zipkin as an example here.