[exporter/elasticsearch] Ability to specify the document ID for logs #36882

Closed
mauri870 opened this issue Dec 18, 2024 · 2 comments · Fixed by #37065

Comments


mauri870 commented Dec 18, 2024

Component(s)

exporter/elasticsearch

Is your feature request related to a problem? Please describe.

In Beats we use the bodymap mapping mode to control the final document structure for logs. There is one thing we are unable to control with this approach: the final document ID. We rely on controlling this ID both to deduplicate messages and to set the ID from a log field. For context, see https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-deduplication.html.

Describe the solution you'd like

We would like to have a way to tell the exporter to use a specific ID as the document identifier. The current implementation does not allow this and defaults to an autogenerated ID here. This might be a broader question for the collector in general, as we would like such functionality to be available from all the exporters.

Describe alternatives you've considered

One alternative is to treat a specific field in the log message body as the document ID. For example, in Beats we use @metadata._id as the ID field. The @metadata field is then stripped out at our output layer (equivalent to an exporter in OTel), so we can use this information to set the ID properly in the final document.

Looking at the Logs model for OTel, this would be equivalent to an Attribute. I imagine deduplication and the ability to set an identifier for the exported data is interesting functionality that perhaps shouldn't be constrained to the elasticsearchexporter, so I'm open to more general suggestions.
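
To make the Beats-style alternative concrete, here is a minimal sketch in Go. It is not the Beats or exporter code: `splitMetadataID` is a hypothetical helper, and the event is modelled as a plain `map[string]any` purely for illustration. It reads `@metadata._id` and strips `@metadata` so that it never reaches the stored document.

```go
package main

import "fmt"

// splitMetadataID is a hypothetical helper (not part of Beats or the
// collector). It returns the value of @metadata._id, if present and
// non-empty, and removes the @metadata field from the event in place.
func splitMetadataID(event map[string]any) (string, bool) {
	meta, ok := event["@metadata"].(map[string]any)
	// Always strip @metadata so it is never written to the final document.
	delete(event, "@metadata")
	if !ok {
		return "", false
	}
	id, ok := meta["_id"].(string)
	return id, ok && id != ""
}

func main() {
	event := map[string]any{
		"message":   "user logged in",
		"@metadata": map[string]any{"_id": "evt-123"},
	}
	id, ok := splitMetadataID(event)
	// Prints: evt-123 true 1
	fmt.Println(id, ok, len(event))
}
```

When no ID is found the helper returns `false`, and the caller would fall back to letting Elasticsearch generate one.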

Additional context

No response

@mauri870 added the enhancement and needs triage labels Dec 18, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@mauri870 changed the title from "[exporter/elasticsearch] Ability to specify the document ID" to "[exporter/elasticsearch] Ability to specify the document ID for logs" Dec 18, 2024

VihasMakwana commented Dec 18, 2024

This sounds like a valid enhancement to me.
We can probably have something like logs_dynamic_index, to allow setting _id from attributes.

Let's see what the code owners have to say about this.

@VihasMakwana added the waiting-for-code-owners label and removed the needs triage label Dec 18, 2024
chengchuanpeng pushed a commit to chengchuanpeng/opentelemetry-collector-contrib that referenced this issue Jan 26, 2025
…en-telemetry#37065)

#### Description

This PR adds a new config option `logs_dynamic_id` that when set to true
reads the `elasticsearch.document_id` attribute from each log record and
uses it as the final document id in Elasticsearch. This is only
implemented for logs but I can open subsequent PRs supporting metrics
and traces akin to the `*_dynamic_index` options.
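
As a rough illustration of the behavior described above (not the exporter's actual code), the sketch below builds the action line of an Elasticsearch Bulk API request. `actionLine` and `bulkMeta` are hypothetical names, the attributes are modelled as a plain `map[string]string` rather than the collector's own attribute type, and the `create` action is an assumption. Only the attribute name `elasticsearch.document_id` and the on/off switch come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// documentIDAttr is the attribute name given in the PR description.
const documentIDAttr = "elasticsearch.document_id"

// bulkMeta models the metadata of one Bulk API action. The _id field is
// omitted when empty, in which case Elasticsearch generates an ID itself.
type bulkMeta struct {
	Index string `json:"_index"`
	ID    string `json:"_id,omitempty"`
}

// actionLine is a hypothetical helper: when dynamicID is true it copies the
// document ID attribute (if any) into the bulk action metadata.
func actionLine(index string, attrs map[string]string, dynamicID bool) (string, error) {
	meta := bulkMeta{Index: index}
	if dynamicID {
		meta.ID = attrs[documentIDAttr] // empty string when the attribute is absent
	}
	// Assumption: the "create" action is used; the real exporter may differ.
	b, err := json.Marshal(map[string]bulkMeta{"create": meta})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	withID, _ := actionLine("logs", map[string]string{documentIDAttr: "abc-1"}, true)
	withoutID, _ := actionLine("logs", map[string]string{}, true)
	fmt.Println(withID)    // {"create":{"_index":"logs","_id":"abc-1"}}
	fmt.Println(withoutID) // {"create":{"_index":"logs"}}
}
```

With the option disabled, or with no attribute on the record, the `_id` key is left out entirely, which matches the "current behavior is retained" case mentioned under Testing.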

Fixes open-telemetry#36882

#### Testing

Added tests to verify that the document ID attribute can be read from
the log record and that the _id is properly forwarded to Elasticsearch.
Also asserted that when there is no doc id attribute the current
behavior is retained.

#### Documentation

Updated the readme to mention the new `logs_dynamic_id` config option.

---------

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Christos Markou <chrismarkou92@gmail.com>
andrzej-stencel pushed a commit that referenced this issue Jan 30, 2025
#### Description

Adds docs for setting the document id dynamically.

#### Link to tracking issue

Related
#36882