
JaegerExporter caches sender, which locks process service.name for all Spans #2606

Closed
bennbollay opened this issue Nov 9, 2021 · 13 comments

@bennbollay

The call to _getSender caches the result of constructing the sender, but the sender itself also caches the service.name:

  private _getSender(span: jaegerTypes.ThriftSpan): typeof jaegerTypes.UDPSender {
    if (this._sender) {
      return this._sender;
    }

    const sender = this._localConfig.endpoint
      ? new jaegerTypes.HTTPSender(this._localConfig)
      : new jaegerTypes.UDPSender(this._localConfig);

    if (sender._client instanceof Socket) {
      // unref socket to prevent it from keeping the process running
      sender._client.unref();
    }

    // Note: service.name is read from whichever span happens to arrive first; because
    // the sender is cached on this._sender below, that value is reused for every
    // subsequent span exported through this instance.
    const serviceNameTag = span.tags.find(t => t.key === SemanticResourceAttributes.SERVICE_NAME);
    const serviceName = serviceNameTag?.vStr || 'unknown_service';

    sender.setProcess({
      serviceName,
      tags: jaegerTypes.ThriftUtils.getThriftTags(this._localConfig.tags || []),
    });

    this._sender = sender;
    return sender;
  }

This produces surprising behavior when spans with different SERVICE_NAME tags are supplied to the same exporter instance: every span after the first is reported under the first span's service.name.

This can be worked around by routing each span to an exporter instance that matches its service.name (caching one instance per service name), or by creating a new exporter instance for each export call.
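
For illustration, a minimal sketch of the first workaround: keep one JaegerExporter per service.name and route each span to the matching instance, so each instance's cached sender only ever sees a single service. This assumes direct use of exporter.export() outside the SDK; package names are the standard OTel JS ones, and exporterFor/exportSpan are hypothetical helpers invented for this sketch, not library APIs.

  import { ExportResult } from '@opentelemetry/core';
  import { ReadableSpan } from '@opentelemetry/sdk-trace-base';
  import { JaegerExporter } from '@opentelemetry/exporter-jaeger';
  import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';

  // One exporter per service.name, so each cached sender is initialized exactly once
  // with the name it will keep for its lifetime.
  const exportersByService = new Map<string, JaegerExporter>();

  function exporterFor(serviceName: string): JaegerExporter {
    let exporter = exportersByService.get(serviceName);
    if (!exporter) {
      exporter = new JaegerExporter({ endpoint: 'http://localhost:14268/api/traces' });
      exportersByService.set(serviceName, exporter);
    }
    return exporter;
  }

  function exportSpan(span: ReadableSpan, done: (result: ExportResult) => void): void {
    const serviceName =
      (span.resource.attributes[SemanticResourceAttributes.SERVICE_NAME] as string | undefined) ??
      'unknown_service';
    // Route the span to the exporter whose cached sender already carries this service.name.
    exporterFor(serviceName).export([span], done);
  }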

@bennbollay bennbollay added the bug Something isn't working label Nov 9, 2021
@bennbollay bennbollay changed the title JaegerExporter caches HTTPSender, which locks process service.name for duration JaegerExporter caches sender, which locks process service.name for all Spans Nov 9, 2021
@aabmass
Member

aabmass commented Nov 9, 2021

@bennbollay the exporter is operating under the assumption that resource is immutable. It assumes the SemanticResourceAttributes.SERVICE_NAME it sees here is not going to change for the lifetime of the SDK and the exporter instance, which is a fair assumption. Based on our slack discussion I think you are trying to use the exporter directly without the SDK?
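
For reference, here is a minimal sketch of that intended usage: service.name is fixed once on the provider's Resource at construction, so the exporter only ever sees that one value. Package and class names are the standard OTel JS 1.x ones; the endpoint and service name are placeholders.

  import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
  import { SimpleSpanProcessor } from '@opentelemetry/sdk-trace-base';
  import { Resource } from '@opentelemetry/resources';
  import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';
  import { JaegerExporter } from '@opentelemetry/exporter-jaeger';

  // service.name lives on the provider's Resource and never changes for the SDK's
  // lifetime, which is the immutability assumption the exporter relies on.
  const provider = new NodeTracerProvider({
    resource: new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: 'my-service',
    }),
  });

  provider.addSpanProcessor(
    new SimpleSpanProcessor(new JaegerExporter({ endpoint: 'http://localhost:14268/api/traces' }))
  );
  provider.register();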

@dyladan
Member

dyladan commented Nov 9, 2021

Resource is immutable according to the specification.

@bennbollay
Author

@aabmass I'm calling the exporter.export directly, yes.

As a developer, my expectation was that parameters specified on a span are ephemeral; parameters that are constant would have been supplied during construction, rather than inferred from the first span and then persisted forevermore.

This seems like a micro-optimization, and it cost several days of hunting before I understood the system deeply enough to determine where and why the parameters I was explicitly setting on each span were being ignored.

@dyladan
Member

dyladan commented Nov 9, 2021

It's not meant to be a performance optimization for the exporter. It's meant to ensure the identity doesn't change between spans, in order to not confuse backends.

That said, the immutability should be enforced per span processor, not per exporter, so it should be fine to remove the caching from the exporter itself. Would you like to make a PR for that, or should I mark this up for grabs?

@dyladan dyladan removed the bug Something isn't working label Nov 9, 2021
@dyladan
Member

dyladan commented Nov 9, 2021

Removing the bug label since this is working as designed; it's just a design we don't want :)

@bennbollay
Author

> ensure identity doesn't change between spans in order to not confuse backends.

Are there backends that expect all of the spans in a batched request to have the same identity? Is that part of the protocol specification? It seems like that would break aggregators or forwarders, if it were the case...

I'm absolutely not informed enough about this space to have a strong opinion, but this sounds like it might be a "good idea to be defensive" that ended up being an "unnecessarily opinionated" library decision.

I'm happy to throw together a PR for this, but will need some hand-holding (here or on Slack is fine) so that I'm making the right change in the right place.

@dyladan
Member

dyladan commented Nov 9, 2021

Some exporters (like OTLP) treat the resource as a separate object from the span, and spans are attached to the resource. Backends can then determine that two spans came from the same tracer if they share the same resource. If the resource were ever mutated, the backend would not know the spans came from the same tracer. As for all of the spans in a batch having the same resource, I don't think that's a requirement.

The change that should be made is just that the exporter shouldn't cache the resource, but should instead use whatever resource is attached to the span. The immutability is already enforced elsewhere in the SDK.
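
For concreteness, one possible shape of that change, sketched against the _getSender snippet quoted at the top of the issue: keep the (potentially expensive) sender cached, but read service.name from each span and re-apply it when it changes. _lastServiceName is a hypothetical field added for the sketch, and whether the underlying sender tolerates repeated setProcess calls is an assumption (see the next comment).

  private _lastServiceName?: string;

  private _getSender(span: jaegerTypes.ThriftSpan): typeof jaegerTypes.UDPSender {
    if (!this._sender) {
      const sender = this._localConfig.endpoint
        ? new jaegerTypes.HTTPSender(this._localConfig)
        : new jaegerTypes.UDPSender(this._localConfig);

      if (sender._client instanceof Socket) {
        // unref socket to prevent it from keeping the process running
        sender._client.unref();
      }

      this._sender = sender;
    }

    // Read service.name from the span being exported instead of freezing the value
    // seen on the first span.
    const serviceNameTag = span.tags.find(t => t.key === SemanticResourceAttributes.SERVICE_NAME);
    const serviceName = serviceNameTag?.vStr || 'unknown_service';

    if (serviceName !== this._lastServiceName) {
      this._sender.setProcess({
        serviceName,
        tags: jaegerTypes.ThriftUtils.getThriftTags(this._localConfig.tags || []),
      });
      this._lastServiceName = serviceName;
    }

    return this._sender;
  }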

@aabmass
Member

aabmass commented Nov 16, 2021

Correct me if I'm wrong, but the sender looks like a "client" resource which keeps a Thrift connection open. You would only want to create one for a given exporter, and not close the client after every export, right? It isn't clear to me whether you can update the service name after creating a client, and creating a new client for every request could be expensive.

Based on our discussion of your use case, you may run into more surprises. Keep in mind these exporters are meant for use inside the OTel SDK and can make certain assumptions based on that; for example, they are not safe for concurrent use.

@github-actions

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Jan 17, 2022
@github-actions

github-actions bot commented Feb 7, 2022

This issue was closed because it has been stale for 14 days with no activity.

@github-actions github-actions bot closed this as completed Feb 7, 2022
@dyladan
Member

dyladan commented Feb 7, 2022

I think this is still an issue

@dyladan dyladan reopened this Feb 7, 2022
@github-actions github-actions bot removed the stale label Feb 14, 2022
@github-actions

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Apr 18, 2022
@github-actions

github-actions bot commented May 2, 2022

This issue was closed because it has been stale for 14 days with no activity.

@github-actions github-actions bot closed this as completed May 2, 2022