From 4940ad3dcd9eb531ce81eddb9ad31c567b301fb6 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 23 Oct 2023 17:13:39 -0700 Subject: [PATCH 1/3] Update --- docs/otel/README.md | 10 ++- docs/otel/export-metrics.md | 155 +++++++++++++++++++++++++++++++----- model/metrics/otel.yaml | 39 +-------- 3 files changed, 143 insertions(+), 61 deletions(-) diff --git a/docs/otel/README.md b/docs/otel/README.md index 1814a1b101..a20de70e7c 100644 --- a/docs/otel/README.md +++ b/docs/otel/README.md @@ -2,10 +2,16 @@ **Status**: [Experimental][DocumentStatus] -This document defines semantic conventions for OTel components (such as processors, exporters, etc). +This document defines semantic conventions for OpenTelemetry +data-reporting components such as processors and exporters. These +components are generally specified in an OpenTelemetry SDK specification, +for example [Span +Processor](https://opentelemetry.io/docs/specs/otel/trace/sdk/#span-processor) +and [Span +Exporter](https://opentelemetry.io/docs/specs/otel/trace/sdk/#span-exporter). OTel Component semantic conventions are defined for the following metrics: -* [Export](export-metrics.md): For export level metrics. +* [Export](export-metrics.md): For export-level metrics. [DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.22.0/specification/document-status.md diff --git a/docs/otel/export-metrics.md b/docs/otel/export-metrics.md index 332b83ebc2..b97ac05e88 100644 --- a/docs/otel/export-metrics.md +++ b/docs/otel/export-metrics.md @@ -1,13 +1,13 @@ -# Semantic Conventions for OTel Export Metrics +# Semantic Conventions for OpenTelemetry Export Metrics **Status**: [Experimental][DocumentStatus] -This document describes instruments and attributes for OTel -Export level metrics. Consider the [general metric semantic +This document describes instruments and attributes for OpenTelemetry +collection-level metrics. 
Consider the [general metric semantic conventions](README.md#general-metric-semantic-conventions) when creating instruments not explicitly defined in the specification. @@ -16,49 +16,162 @@ instruments not explicitly defined in the specification. - [Metric Instruments](#metric-instruments) - * [Metric: `otel.processor.spans`](#metric-otelprocessorspans) - * [Metric: `otel.exporter.spans`](#metric-otelexporterspans) + * [Metric: `otel.processor.items`](#metric-otelprocessoritems) + * [Metric: `otel.exporter.items`](#metric-otelexporteritems) +## Principles used + +This specification defines three levels of detail possible, allowing +for components to be used with `basic`, `normal`, and `detailed` +levels that determine which attributes are kept or removed, as by a +[Metric +view](https://opentelemetry.io/docs/specs/otel/metrics/sdk/#view). We +rely on a conservation principle for pipelines, which states generally +that what goes in, comes out. + +### Signal-independent metric names + +OpenTelemetry currently has 3 signal types, but it may add more. +Instead of using the signal name in the metric names, we opt for a +general-purpose noun that usefully describes any signal. The +signal-agnostic term used here is `items`, referring to spans, log +records, and metric data points. An attribute to distinguish the +`signal` is used, with names designated by the project `traces`, +`logs`, and `metrics`. + +Users are expected to understand that the data item for traces is a +span, for logs is a record, and for metrics is a point. + +### Distinguishing Collectors from SDKs + +SDKs and Collectors process the same data in a pipeline, and both +OpenTelemetry Collector and SDKs are recommended to use the metric +names specified here. An attribute to distinguish the `domain` is +used, with values like `sdk`, `collector`. + +In a multi-level collection pipeline, each layer is expected to use a +unique domain. 
This enables calculating aggregates at each level in +the collection pipeline and comparing them, as a measure of aggregate +leakage. Multi-level collector topologies should allow configuration +of distinct domains (e.g., `agent` and `gateway`). + +### Basic level of detail + +At the basic level of detail, we only need to know what goes into a +component, because we are able to infer a lot about the component by +comparing its metrics with the next component in the pipeline. By +conservation, for example, any items that are received by an SDK +processor component and are not received by the SDK exporter component +must have been dropped. + +Therefore at the basic level of detail, all items of data are counted +when they are done, regardless of the outcome. No additional +attributes are used at this level of detail. + +### Normal level of detail + +At the normal level of detail, an attribute is introduced that +distinguishes whether the item was or was not successful. A boolean +attribute `success` is introduced at this level. + +It is understood that components have limited information about the +success or failure of subsequent pipeline components. While the +veracity of `success=true` may be subject to reasonable doubt, the +`success=false` attribute should be accepted as fact. In an SDK +configuration, the processor's `success=false` may be compared with +the exporter's `success=false` to determine the number of items +dropped by the processor, for example. + +### Detailed metrics + +At the detailed level of metrics, the component includes an additional +`reason` attribute to explain its outcomes. These should be interpreted +relative to the value of the `success` attribute, which is always +present when detailed metrics are in use. For the `success=true` +case, components are recommended to use `reason=ok`. + +Components should use short, descriptive names to explain failure +outcomes. 
For example, an SDK span processor could use +`reason=queue_full` to annotate dropped spans and +`reason=export_failed` to indicate when the exporter failed. + +Exporter components are encouraged to use system-specific details as +the reason. For example, a gRPC-based exporter would naturally use the +string form of the gRPC status code as the reason (e.g., +`deadline_exceeded`, `resource_exhausted`, `unimplemented`). + +### Component types and optional names + +Components are uniquely identified using a descriptive `name` +attribute which encompasses at least a short name describing the type +of component being used (e.g., `batch` for the SDK BatchSpanProcessor +or the Collector batch processor). + +When there is more than one component of a given type active in a +pipeline having the same `domain` and `signal` attributes, the `name` +should include additional information to disambiguate the multiple +instances using the syntax `/`. For example, if there +were two `batch` processors in a collection pipeline (e.g., one for +error spans and one for non-error spans) they might use the names +`batch/error` and `batch/noerror`. + +### Use of scope attributes + +The `domain`, `signal`, and `name` attributes described here are +considered scope attributes. When these metrics are encoded using an +OTLP data representation, the `domain`, `signal`, and `name` +attributes SHOULD be encoded using ther OTLP Scope attributes field. + + +| Attribute | Type | Description | Examples | Requirement Level | Detail level | +|---|---|---|---|---| +| `.domain` | string | Domain of the pipeline with this component | `sdk`, `collector` | Required | Basic | +| `.name` | string | Type and optional name of this component. | `batch`, `batch/errors` | Required | Basic | +| `.signal` | string | Type of signal being described. 
| `trace`, `logs`, `metrics` | Required | Basic | + ## Metric Instruments -### Metric: `otel.processor.spans` +### Metric: `otel.processor.items` This metric is [required][MetricRequired]. - + | Name | Instrument Type | Unit (UCUM) | Description | | -------- | --------------- | ----------- | -------------- | -| `otel.processor.spans` | Counter | `{span}` | Measures the number of processed Spans. | +| `otel.processor.items` | Counter | `{items}` | Measures the number of processed items (signal specific). | - -| Attribute | Type | Description | Examples | Requirement Level | + + +| Attribute | Type | Description | Examples | Requirement Level | Detail Level | |---|---|---|---|---| -| `processor.dropped` | boolean | Whether the Span was dropped or not. [1] | | Required | -| `processor.type` | string | Type of processor being used. | `BatchSpanProcessor` | Recommended | +| `processor.name` | string | Type and optional name of processor being used. | `batch` | Required | Basic | +| `processor.success` | boolean | Whether the item was successful or not. [1] | true, false | Recommended | Normal | +| `processor.reason` | string | Short string explaining category of success and failure. | `ok`, `queue_full`, `timeout`, `permission_denied` | Recommended | Detailed | -**[1]:** Spans may be dropped if the internal buffer is full. +**[1]:** Consider `success=false` a stronger signal than `success=true` -### Metric: `otel.exporter.spans` +### Metric: `otel.exporter.items` This metric is [required][MetricRequired]. - + | Name | Instrument Type | Unit (UCUM) | Description | | -------- | --------------- | ----------- | -------------- | -| `otel.exporter.spans` | Counter | `{span}` | Measures the number of exported Spans. | +| `otel.exporter.items` | Counter | `{items}` | Measures the number of exported items (signal specific). 
| - + | Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `exporter.dropped` | boolean | Whether the Span was dropped or not. [1] | | Required | -| `exporter.type` | string | Type of exporter being used. | `OtlpGrpcSpanExporter` | Recommended | +| `exporter.name` | string | Type and optional name of exporter being used. | `otlp/grpc` | Required | Basic | +| `exporter.success` | boolean | Whether the item was successful or not. [1] | true, false | Recommended | Normal | +| `exporter.reason` | string | Short string explaining category of success and failure. | `ok`, `queue_full`, `timeout`, `permission_denied` | Recommended | Detailed | -**[1]:** Spans may be dropped in case of failed ingestion, e.g. network problem or the exported endpoint being down. +**[1]:** Items may be dropped in case of failed ingestion, e.g. network problem or the exported endpoint being down. Consult transport-specific instrumentation for more information about the export requests themselves, including retry attempts. [MetricRequired]: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.22.0/specification/metrics/metric-requirement-level.md#required diff --git a/model/metrics/otel.yaml b/model/metrics/otel.yaml index 624a1ca748..e31ae72d05 100644 --- a/model/metrics/otel.yaml +++ b/model/metrics/otel.yaml @@ -1,38 +1 @@ -groups: - - id: metric.otel.exporter.spans - type: metric - metric_name: otel.exporter.spans - brief: "Measures the number of exported Spans." - instrument: counter - unit: "{span}" - attributes: - - id: exporter.dropped - type: boolean - requirement_level: required - brief: "Whether the Span was dropped or not." - note: > - Spans may be dropped in case of failed ingestion, e.g. network problem - or the exported endpoint being down. - - id: exporter.type - type: string - requirement_level: recommended - brief: "Type of exporter being used." 
- examples: ["OtlpGrpcSpanExporter"] - - id: metric.otel.processor.spans - type: metric - metric_name: otel.processor.spans - brief: "Measures the number of processed Spans." - instrument: counter - unit: "{span}" - attributes: - - id: processor.dropped - type: boolean - requirement_level: required - brief: "Whether the Span was dropped or not." - note: > - Spans may be dropped if the internal buffer is full. - - id: processor.type - type: string - requirement_level: recommended - brief: "Type of processor being used." - examples: ["BatchSpanProcessor"] +DO NOT REVIEW From d5142b1c567b09d2417fcdceb9425d16ad1f99a8 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 23 Oct 2023 17:25:38 -0700 Subject: [PATCH 2/3] remove scope --- docs/otel/export-metrics.md | 24 +++++++----------------- 1 file changed, 7 insertions(+), 17 deletions(-) diff --git a/docs/otel/export-metrics.md b/docs/otel/export-metrics.md index b97ac05e88..43b5c5086e 100644 --- a/docs/otel/export-metrics.md +++ b/docs/otel/export-metrics.md @@ -7,7 +7,7 @@ linkTitle: OpenTelemetry Export **Status**: [Experimental][DocumentStatus] This document describes instruments and attributes for OpenTelemetry -collection-level metrics. Consider the [general metric semantic +export-level metrics. Consider the [general metric semantic conventions](README.md#general-metric-semantic-conventions) when creating instruments not explicitly defined in the specification. @@ -117,20 +117,6 @@ were two `batch` processors in a collection pipeline (e.g., one for error spans and one for non-error spans) they might use the names `batch/error` and `batch/noerror`. -### Use of scope attributes - -The `domain`, `signal`, and `name` attributes described here are -considered scope attributes. When these metrics are encoded using an -OTLP data representation, the `domain`, `signal`, and `name` -attributes SHOULD be encoded using ther OTLP Scope attributes field. 
- - -| Attribute | Type | Description | Examples | Requirement Level | Detail level | -|---|---|---|---|---| -| `.domain` | string | Domain of the pipeline with this component | `sdk`, `collector` | Required | Basic | -| `.name` | string | Type and optional name of this component. | `batch`, `batch/errors` | Required | Basic | -| `.signal` | string | Type of signal being described. | `trace`, `logs`, `metrics` | Required | Basic | - ## Metric Instruments ### Metric: `otel.processor.items` @@ -147,7 +133,9 @@ This metric is [required][MetricRequired]. | Attribute | Type | Description | Examples | Requirement Level | Detail Level | |---|---|---|---|---| -| `processor.name` | string | Type and optional name of processor being used. | `batch` | Required | Basic | +| `processor.domain` | string | Domain of the pipeline with this exporter | `sdk`, `collector` | Required | Basic | +| `processor.name` | string | Type and optional name of this exporter. | `batch`, `batch/errors` | Required | Basic | +| `processor.signal` | string | Type of signal being described. | `trace`, `logs`, `metrics` | Required | Basic | | `processor.success` | boolean | Whether the item was successful or not. [1] | true, false | Recommended | Normal | | `processor.reason` | string | Short string explaining category of success and failure. | `ok`, `queue_full`, `timeout`, `permission_denied` | Recommended | Detailed | @@ -167,7 +155,9 @@ This metric is [required][MetricRequired]. | Attribute | Type | Description | Examples | Requirement Level | |---|---|---|---|---| -| `exporter.name` | string | Type and optional name of exporter being used. | `otlp/grpc` | Required | Basic | +| `exporter.domain` | string | Domain of the pipeline with this exporter | `sdk`, `collector` | Required | Basic | +| `exporter.name` | string | Type and optional name of this exporter. | `otlp/grpc`, `otlp/errors` | Required | Basic | +| `exporter.signal` | string | Type of signal being described. 
| `trace`, `logs`, `metrics` | Required | Basic | | `exporter.success` | boolean | Whether the item was successful or not. [1] | true, false | Recommended | Normal | | `exporter.reason` | string | Short string explaining category of success and failure. | `ok`, `queue_full`, `timeout`, `permission_denied` | Recommended | Detailed | From c0c3c55f8f4853ae1c9fc30aa39f55a804f75029 Mon Sep 17 00:00:00 2001 From: Joshua MacDonald Date: Mon, 23 Oct 2023 17:31:32 -0700 Subject: [PATCH 3/3] format --- docs/otel/export-metrics.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/docs/otel/export-metrics.md b/docs/otel/export-metrics.md index 43b5c5086e..7223aaf6c1 100644 --- a/docs/otel/export-metrics.md +++ b/docs/otel/export-metrics.md @@ -131,13 +131,13 @@ This metric is [required][MetricRequired]. -| Attribute | Type | Description | Examples | Requirement Level | Detail Level | -|---|---|---|---|---| -| `processor.domain` | string | Domain of the pipeline with this exporter | `sdk`, `collector` | Required | Basic | -| `processor.name` | string | Type and optional name of this exporter. | `batch`, `batch/errors` | Required | Basic | -| `processor.signal` | string | Type of signal being described. | `trace`, `logs`, `metrics` | Required | Basic | -| `processor.success` | boolean | Whether the item was successful or not. [1] | true, false | Recommended | Normal | -| `processor.reason` | string | Short string explaining category of success and failure. 
| `ok`, `queue_full`, `timeout`, `permission_denied` | Recommended | Detailed | +| Attribute | Type | Description | Examples | Requirement Level | +|---------------------|---------|----------------------------------------------------------|----------------------------------------------------|-------------------| +| `processor.domain` | string | Domain of the pipeline with this processor | `sdk`, `collector` | Required | +| `processor.name` | string | Type and optional name of this processor. | `batch`, `batch/errors` | Required | +| `processor.signal` | string | Type of signal being described. | `trace`, `logs`, `metrics` | Required | +| `processor.success` | boolean | Whether the item was successful or not. [1] | true, false | Recommended | +| `processor.reason` | string | Short string explaining category of success and failure. | `ok`, `queue_full`, `timeout`, `permission_denied` | Recommended | **[1]:** Consider `success=false` a stronger signal than `success=true` @@ -153,13 +153,13 @@ This metric is [required][MetricRequired]. -| Attribute | Type | Description | Examples | Requirement Level | -|---|---|---|---|---| -| `exporter.domain` | string | Domain of the pipeline with this exporter | `sdk`, `collector` | Required | Basic | -| `exporter.name` | string | Type and optional name of this exporter. | `otlp/grpc`, `otlp/errors` | Required | Basic | -| `exporter.signal` | string | Type of signal being described. | `trace`, `logs`, `metrics` | Required | Basic | -| `exporter.success` | boolean | Whether the item was successful or not. [1] | true, false | Recommended | Normal | -| `exporter.reason` | string | Short string explaining category of success and failure. 
| `ok`, `queue_full`, `timeout`, `permission_denied` | Recommended | Detailed | +| Attribute | Type | Description | Examples | Requirement Level | +|--------------------|---------|----------------------------------------------------------|----------------------------------------------------|-------------------| +| `exporter.domain` | string | Domain of the pipeline with this exporter | `sdk`, `collector` | Required | +| `exporter.name` | string | Type and optional name of this exporter. | `otlp/http`, `otlp/grpc` | Required | +| `exporter.signal` | string | Type of signal being described. | `trace`, `logs`, `metrics` | Required | +| `exporter.success` | boolean | Whether the item was successful or not. [1] | true, false | Recommended | +| `exporter.reason` | string | Short string explaining category of success and failure. | `ok`, `queue_full`, `timeout`, `permission_denied` | Recommended | **[1]:** Items may be dropped in case of failed ingestion, e.g. network problem or the exported endpoint being down. Consult transport-specific instrumentation for more information about the export requests themselves, including retry attempts.
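
Reviewer sketch (not part of the patch series): the conservation principle and the three levels of detail described in PATCH 1/3 can be illustrated with plain Python counters. This is a minimal sketch under stated assumptions, not an OpenTelemetry SDK implementation. The helpers `record` and `total`, the `LEVEL_KEYS` table, and all item counts are hypothetical; the attribute keys are written without the `processor.` / `exporter.` prefixes for brevity, and the signal value `traces` follows the prose in the "Signal-independent metric names" section.

```python
from collections import Counter

# Attribute keys kept at each level of detail (hypothetical table,
# derived from the attribute tables in the patch).
LEVEL_KEYS = {
    "basic": ("domain", "name", "signal"),
    "normal": ("domain", "name", "signal", "success"),
    "detailed": ("domain", "name", "signal", "success", "reason"),
}


def record(counter, level, count, **attrs):
    """Add `count` items to `counter`, keeping only the attributes for `level`."""
    key = tuple((k, attrs[k]) for k in LEVEL_KEYS[level] if k in attrs)
    counter[key] += count


def total(counter, **match):
    """Sum every series in `counter` whose attributes include `match`."""
    return sum(
        value
        for key, value in counter.items()
        if all(dict(key).get(k) == v for k, v in match.items())
    )


processor_items = Counter()  # stands in for otel.processor.items
exporter_items = Counter()   # stands in for otel.exporter.items
common = dict(domain="sdk", signal="traces")

# 100 spans reach the processor: 10 are dropped on a full queue and
# 5 more fail later in the exporter (made-up numbers).
record(processor_items, "detailed", 85, name="batch", success=True, reason="ok", **common)
record(processor_items, "detailed", 10, name="batch", success=False, reason="queue_full", **common)
record(processor_items, "detailed", 5, name="batch", success=False, reason="export_failed", **common)

# Only the 90 spans that were not dropped reach the exporter.
record(exporter_items, "detailed", 85, name="otlp/grpc", success=True, reason="ok", **common)
record(exporter_items, "detailed", 5, name="otlp/grpc", success=False, reason="deadline_exceeded", **common)

# Basic level: compare totals of adjacent components.
dropped_basic = total(processor_items) - total(exporter_items)

# Normal level: compare the success=false counts of adjacent components.
dropped_normal = total(processor_items, success=False) - total(exporter_items, success=False)

print(dropped_basic, dropped_normal)
```

Both calculations attribute the same number of items to the processor, which is the property the patch relies on: the basic level needs only totals from adjacent components, while the normal level reaches the same figure from the `success=false` series alone.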