The headers_setter extension supports values from attributes #34257

Open
danielnegreiros opened this issue Jul 25, 2024 · 8 comments
@danielnegreiros

Is your feature request related to a problem? Please describe.

No, this is not a technical problem; it is more about business value.

Describe the solution you'd like

I would like the Headers Setter extension to be able to read header values not only from HTTP headers but also from attributes. Currently there are two ways to set a header's content: value or from_context.

extensions:
  headers_setter:
    headers:
      - action: insert
        key: X-Scope-OrgID
        from_context: tenant_id

According to the documentation:
from_context: The header value is looked up from the request metadata, such as HTTP headers.

However, with fluentforward as the receiver there are no HTTP headers, only log attributes.

receivers:
  fluentforward:
    endpoint: 0.0.0.0:8006
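
For contrast, here is a minimal sketch of how from_context is usually wired when the receiver does speak HTTP. It assumes an otlp receiver and an otlphttp exporter, neither of which is part of the setup described in this request, and the endpoint is a placeholder. The sketch relies on the receiver propagating client metadata (include_metadata), which is what makes the tenant_id request header visible to the extension. A non-HTTP receiver such as fluentforward has no equivalent source for that value.

```yaml
receivers:
  otlp:
    protocols:
      http:
        # Propagate incoming request headers into the client context,
        # so that from_context has something to look up.
        include_metadata: true

extensions:
  headers_setter:
    headers:
      - action: insert
        key: X-Scope-OrgID
        # Looked up from the incoming request metadata (e.g. an HTTP header).
        from_context: tenant_id

exporters:
  otlphttp:
    endpoint: "https://1.2.3.4:1234"  # placeholder endpoint
    auth:
      # Let the extension set headers on outgoing requests.
      authenticator: headers_setter

service:
  extensions: [headers_setter]
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp]
```

If a batch processor sits in this pipeline, it would likely also need to be told to keep batches separated by that metadata key; otherwise the per-request context may not survive batching.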

It would be great to have an option like this to get the value from attributes.

extensions:
  headers_setter:
    headers:
      - action: insert
        key: X-Scope-OrgID
        from_attribute: tenant_id

Otherwise, it is not possible to use fluentforward or any other non-HTTP receiver in a multi-tenant setup.

Describe alternatives you've considered

I have considered the otlphttp exporter's headers configuration. It works with hardcoded values, but it also cannot read values from attributes.

otlphttp:
  endpoint: "https://1.2.3.4:1234"
  headers:
    X-Scope-OrgID: "somevalue"
@TylerHelmuth TylerHelmuth transferred this issue from open-telemetry/opentelemetry-collector Jul 25, 2024
Contributor

Pinging code owners for extension/headerssetter: @jpkrohling. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@jpkrohling
Member

The mismatch here is that there is one connection (and one context) that potentially contains multiple data points. So, while one log entry might have tenant_id set to "acme", another log entry could have "ecorp". Which one would be used for the single header that is set on the single outgoing connection? Should they be sent as different requests? That would then fall outside the scope of the headers setter. Or should we just assume that there's only one tenant_id for the whole batch? In that case, we risk leaking data from one tenant to another when the base assumption stops holding.

@danielnegreiros
Author

That is the point: to achieve something similar to from_context and set headers dynamically.
So whether the log attribute is acme or ecorp, it would be set in the header accordingly.

from_context makes it possible to set X-Scope-OrgID from an HTTP header value.
Something like from_attribute would give the same capability to set X-Scope-OrgID (or any other header) from a log attribute.

The reason this would complement from_context is that some receivers, such as fluentforward, are not HTTP-based, so they have no headers and we cannot use from_context with them.

@jpkrohling
Member

I think we might be looking at the problem from different perspectives: the incoming request can only have one tenant for the whole payload if the tenant information is part of the request (context). However, each request can contain multiple ResourceLogs, and each ResourceLog can have its own resource attribute with the tenant information. So, each incoming request can have data belonging to multiple tenants.

When preparing an outgoing request, we can only set one tenant header like X-Scope-OrgID.

What we could do as a solution is to use the groupbyattrs processor to ensure that one incoming request results in multiple outgoing requests, one per tenant. Then, we'd need a processor that takes pipeline data (the tenant information from the log record) and places it in the context, so that it can be used by the headers setter.

Note that extensions do not have access to the pipeline data.
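
A rough sketch of what that pipeline could look like is below. The groupbyattrs processor and its keys setting exist in contrib today; the context-setting step does not, so it appears only as a commented-out placeholder with a made-up name and made-up fields. The otlphttp exporter and its endpoint are assumptions for illustration only.

```yaml
receivers:
  fluentforward:
    endpoint: 0.0.0.0:8006

processors:
  # Regroup log records so that records sharing the same tenant_id
  # end up under the same resource.
  groupbyattrs:
    keys:
      - tenant_id

  # HYPOTHETICAL: a processor that copies tenant_id from the pipeline data
  # into the request context. The name and fields are invented for this sketch.
  # context_from_attribute:
  #   attribute: tenant_id
  #   context_key: tenant_id

extensions:
  headers_setter:
    headers:
      - action: insert
        key: X-Scope-OrgID
        # Would only resolve once something has placed tenant_id in the context.
        from_context: tenant_id

exporters:
  otlphttp:
    endpoint: "https://1.2.3.4:1234"  # placeholder endpoint
    auth:
      authenticator: headers_setter

service:
  extensions: [headers_setter]
  pipelines:
    logs:
      receivers: [fluentforward]
      # The hypothetical context-setting processor would follow groupbyattrs.
      processors: [groupbyattrs]
      exporters: [otlphttp]
```

As written, this config groups records by tenant but still sends no tenant header, because nothing fills the context. That gap is exactly the missing piece described above.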

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.


@github-actions github-actions bot added the Stale label Oct 15, 2024
@siegenthalerroger

I've run into this issue as well: I need a header to be set from a value that is extracted by the k8sattributes processor and stored as an attribute. I think what's missing is either a from_attribute option on the headers_setter extension or a processor capable of modifying the context at all. I think this is what you meant, @jpkrohling, right?

A quick search led me to a project that has implemented a contextprocessor for exactly this use case: https://github.com/springernature/o11y-otel-contextprocessor

I think it would be great if this could be upstreamed into contrib and bundled.

(or am I missing something and there's a different way to get the value of a k8s label into a header 😅 )

@github-actions github-actions bot removed the Stale label Nov 6, 2024
Contributor

github-actions bot commented Jan 6, 2025

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Jan 6, 2025
@siegenthalerroger

This is still relevant. I actually found a relevant issue regarding the contextprocessor changes and Springer Nature's previous work: #34649

As recommended by the bot, pinging the code owners of the headerssetter extension: @open-telemetry/collector-contrib-approvers

@github-actions github-actions bot removed the Stale label Jan 7, 2025