From 9ae2ce5cba12939d4ed824b98b1311804fd82055 Mon Sep 17 00:00:00 2001 From: Carson Ip Date: Fri, 10 Jan 2025 19:45:31 +0000 Subject: [PATCH 1/5] [chore][exporter/elasticsearch] Add more detail to version_conflict_engine_exception known issue --- exporter/elasticsearchexporter/README.md | 31 ++++++++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/exporter/elasticsearchexporter/README.md b/exporter/elasticsearchexporter/README.md index 13ecfa53507d..fbd186afb847 100644 --- a/exporter/elasticsearchexporter/README.md +++ b/exporter/elasticsearchexporter/README.md @@ -357,8 +357,35 @@ In case the record contains `timestamp`, this value is used. Otherwise, the `obs ### version_conflict_engine_exception -When sending high traffic of metrics to a TSDB metrics data stream, e.g. using OTel mapping mode to a 8.16 Elasticsearch, it is possible to get error logs "failed to index document" with `error.type` "version_conflict_engine_exception" and `error.reason` containing "version conflict, document already exists". It is due to Elasticsearch grouping metrics with the same dimensions, whether it is the same or different metric name, using `@timestamp` in milliseconds precision as opposed to nanoseconds in elasticsearchexporter. +Symptom: elasticsearchexporter logs an error "failed to index document" with `error.type` "version_conflict_engine_exception" and `error.reason` containing "version conflict, document already exists". + +This happens when the target data stream is a TSDB metrics data stream (e.g. using OTel mapping mode to a 8.16 Elasticsearch). See the following scenarios. + +1. When sending different metrics with the same dimension (mostly made up of resource attributes, scope attributes, attributes), +a `version_conflict_engine_exception` is returned by Elasticsearch when these metrics are not grouped into the same document. +It also means that they have to be in the same batch in the exporter, as metric grouping is done per-batch in elasticsearchexporter. +To work around the issue, use a transform processor to ensure different metrics to never share the same set of dimensions. + +```yaml +processors: + transform/unique_dimensions: + metric_statements: + - context: datapoint + statements: + - set(attributes["metric_name"], metric.name) +``` + +2. If the problem persists, the issue may be caused by metrics with data points in the same millisecond but not the same nanosecond, as metric grouping is done in nanoseconds but Elasticsearch checks for duplicates in milliseconds. This will be fixed in a future version of Elasticsearch. A possible workaround would be to use a transform processor to truncate the timestamp, but this will cause duplicate data to be dropped silently. -However, if `@timestamp` precision is not the problem, check your metrics pipeline setup for misconfiguration that causes an actual violation of the [single writer principle](https://opentelemetry.io/docs/specs/otel/metrics/data-model/#single-writer). \ No newline at end of file +```yaml +processors: + transform/truncate_timestamp: + metric_statements: + - context: datapoint + statements: + - set(time, TruncateTime(time, Duration("1ms"))) +``` + +3. If all of the above do not apply, check your metrics pipeline setup for misconfiguration that causes an actual violation of the [single writer principle](https://opentelemetry.io/docs/specs/otel/metrics/data-model/#single-writer). 
\ No newline at end of file From 840d343ae263eeb9023db72c570e85b2bc9b23ff Mon Sep 17 00:00:00 2001 From: Carson Ip Date: Fri, 10 Jan 2025 19:49:31 +0000 Subject: [PATCH 2/5] Update description --- exporter/elasticsearchexporter/README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/exporter/elasticsearchexporter/README.md b/exporter/elasticsearchexporter/README.md index fbd186afb847..4eb6d670f256 100644 --- a/exporter/elasticsearchexporter/README.md +++ b/exporter/elasticsearchexporter/README.md @@ -359,12 +359,12 @@ In case the record contains `timestamp`, this value is used. Otherwise, the `obs Symptom: elasticsearchexporter logs an error "failed to index document" with `error.type` "version_conflict_engine_exception" and `error.reason` containing "version conflict, document already exists". -This happens when the target data stream is a TSDB metrics data stream (e.g. using OTel mapping mode to a 8.16 Elasticsearch). See the following scenarios. +This happens when the target data stream is a TSDB metrics data stream (e.g. using OTel mapping mode sending to a 8.16+ Elasticsearch). See the following scenarios. 1. When sending different metrics with the same dimension (mostly made up of resource attributes, scope attributes, attributes), a `version_conflict_engine_exception` is returned by Elasticsearch when these metrics are not grouped into the same document. It also means that they have to be in the same batch in the exporter, as metric grouping is done per-batch in elasticsearchexporter. -To work around the issue, use a transform processor to ensure different metrics to never share the same set of dimensions. +To work around the issue, use a transform processor to ensure different metrics to never share the same set of dimensions. This is done at the expense of storage efficiency. ```yaml processors: @@ -375,9 +375,9 @@ processors: - set(attributes["metric_name"], metric.name) ``` -2. If the problem persists, the issue may be caused by metrics with data points in the same millisecond but not the same nanosecond, as metric grouping is done in nanoseconds but Elasticsearch checks for duplicates in milliseconds. +2. If the problem persists, the error may be caused by metrics with data points in the same millisecond but not the same nanosecond, as metric grouping is done in nanoseconds but Elasticsearch checks for duplicates in milliseconds. -This will be fixed in a future version of Elasticsearch. A possible workaround would be to use a transform processor to truncate the timestamp, but this will cause duplicate data to be dropped silently. +This will be fixed in a future version of Elasticsearch. To work around the issue, use a transform processor to truncate the timestamp, but this will cause duplicate data in the same millisecond to be dropped silently. ```yaml processors: From 42317afded9a635eaec393bc8cdeebd0f1daef4e Mon Sep 17 00:00:00 2001 From: Carson Ip Date: Fri, 10 Jan 2025 19:56:34 +0000 Subject: [PATCH 3/5] Format --- exporter/elasticsearchexporter/README.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/exporter/elasticsearchexporter/README.md b/exporter/elasticsearchexporter/README.md index 4eb6d670f256..ca503591e484 100644 --- a/exporter/elasticsearchexporter/README.md +++ b/exporter/elasticsearchexporter/README.md @@ -362,9 +362,9 @@ Symptom: elasticsearchexporter logs an error "failed to index document" with `er This happens when the target data stream is a TSDB metrics data stream (e.g. 
using OTel mapping mode sending to a 8.16+ Elasticsearch). See the following scenarios.

1. When sending different metrics with the same dimension (mostly made up of resource attributes, scope attributes, attributes),
-a `version_conflict_engine_exception` is returned by Elasticsearch when these metrics are not grouped into the same document.
+`version_conflict_engine_exception` is returned by Elasticsearch when these metrics are not grouped into the same document.
It also means that they have to be in the same batch in the exporter, as metric grouping is done per-batch in elasticsearchexporter.
-To work around the issue, use a transform processor to ensure different metrics to never share the same set of dimensions. This is done at the expense of storage efficiency.
+To work around the issue, use a [transform processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md) to ensure different metrics to never share the same set of dimensions. This is done at the expense of storage efficiency.

```yaml
processors:
@@ -375,9 +375,9 @@ processors:
          - set(attributes["metric_name"], metric.name)
```

-2. If the problem persists, the error may be caused by metrics with data points in the same millisecond but not the same nanosecond, as metric grouping is done in nanoseconds but Elasticsearch checks for duplicates in milliseconds.
-
-This will be fixed in a future version of Elasticsearch. To work around the issue, use a transform processor to truncate the timestamp, but this will cause duplicate data in the same millisecond to be dropped silently.
+2. If the problem persists, the error may be caused by metrics with data points that differ only by nanoseconds, as metric grouping is done in nanoseconds while Elasticsearch checks for duplicates in milliseconds.
+This will be fixed in a future version of Elasticsearch. To work around the issue, use a [transform processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md) to truncate the timestamp,
+but this will cause duplicate data in the same millisecond to be dropped silently.

```yaml
processors:
From 373f6311553d1c7408fcd1554c7632165662fb41 Mon Sep 17 00:00:00 2001
From: Carson Ip
Date: Wed, 15 Jan 2025 10:20:28 +0000
Subject: [PATCH 4/5] Remove suggestion to truncate timestamp
---
 exporter/elasticsearchexporter/README.md | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/exporter/elasticsearchexporter/README.md b/exporter/elasticsearchexporter/README.md
index 5e10e9ea0d91..d2ce4e618a94 100644
--- a/exporter/elasticsearchexporter/README.md
+++ b/exporter/elasticsearchexporter/README.md
@@ -372,17 +372,4 @@ processors:
        - set(attributes["metric_name"], metric.name)
```

-2. If the problem persists, the error may be caused by metrics with data points that differ only by nanoseconds, as metric grouping is done in nanoseconds while Elasticsearch checks for duplicates in milliseconds.
-This will be fixed in a future version of Elasticsearch. To work around the issue, use a [transform processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md) to truncate the timestamp,
-but this will cause duplicate data in the same millisecond to be dropped silently.
- -```yaml -processors: - transform/truncate_timestamp: - metric_statements: - - context: datapoint - statements: - - set(time, TruncateTime(time, Duration("1ms"))) -``` - -3. If all of the above do not apply, check your metrics pipeline setup for misconfiguration that causes an actual violation of the [single writer principle](https://opentelemetry.io/docs/specs/otel/metrics/data-model/#single-writer). +2. Otherwise, check your metrics pipeline setup for misconfiguration that causes an actual violation of the [single writer principle](https://opentelemetry.io/docs/specs/otel/metrics/data-model/#single-writer). From a5665f72b78c73cd4d58241e022b4608aee90081 Mon Sep 17 00:00:00 2001 From: Carson Ip Date: Wed, 15 Jan 2025 13:41:35 +0000 Subject: [PATCH 5/5] Mention known issue --- exporter/elasticsearchexporter/README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/exporter/elasticsearchexporter/README.md b/exporter/elasticsearchexporter/README.md index d2ce4e618a94..f073cea60cf1 100644 --- a/exporter/elasticsearchexporter/README.md +++ b/exporter/elasticsearchexporter/README.md @@ -362,6 +362,7 @@ This happens when the target data stream is a TSDB metrics data stream (e.g. usi `version_conflict_engine_exception` is returned by Elasticsearch when these metrics are not grouped into the same document. It also means that they have to be in the same batch in the exporter, as metric grouping is done per-batch in elasticsearchexporter. To work around the issue, use a [transform processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md) to ensure different metrics to never share the same set of dimensions. This is done at the expense of storage efficiency. +This workaround will no longer be necessary once the limitation is lifted in Elasticsearch (see [issue](https://github.com/elastic/elasticsearch/issues/99123)). ```yaml processors:
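For reference, a minimal sketch of how the `transform/unique_dimensions` workaround from the patches above could be wired into a complete collector configuration. The `otlp` receiver, the Elasticsearch endpoint, and the pipeline layout below are illustrative assumptions, not part of the patches:

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  # Copy the metric name into a data point attribute so that different metrics
  # never share the same set of dimensions (the workaround described above).
  transform/unique_dimensions:
    metric_statements:
      - context: datapoint
        statements:
          - set(attributes["metric_name"], metric.name)

exporters:
  elasticsearch:
    endpoint: https://elasticsearch.example.com:9200
    mapping:
      mode: otel

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [transform/unique_dimensions]
      exporters: [elasticsearch]
```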