Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Context leak if several TracerProvider instances are used #1932

Closed
2 tasks done
Flarna opened this issue Feb 16, 2021 · 8 comments
Closed
2 tasks done

Context leak if several TracerProvider instances are used #1932

Flarna opened this issue Feb 16, 2021 · 8 comments
Labels
Discussion Issue or PR that needs/is extended discussion. stale

Comments

@Flarna
Copy link
Member

Flarna commented Feb 16, 2021

There are several places in our codebase where context.active() is used to fetch the current context from global ContextManager.

I think this may be problematic as this easily results in "leaking" spans from one export chain into another. Following code illustrates this:

import * as assert from 'assert';

import { NodeTracerProvider } from '@opentelemetry/node';
import { InMemorySpanExporter, SimpleSpanProcessor } from '@opentelemetry/tracing';
import { context, setSpan } from '@opentelemetry/api';

const globalExporter = new InMemorySpanExporter();
const globalTracerProvider = new NodeTracerProvider();
globalTracerProvider.addSpanProcessor(new SimpleSpanProcessor(globalExporter));
globalTracerProvider.register();
const globalTracer = globalTracerProvider.getTracer('global');

const otherExporter = new InMemorySpanExporter();
const otherTracerProvider = new NodeTracerProvider();
otherTracerProvider.addSpanProcessor(new SimpleSpanProcessor(otherExporter));
const otherTracer = otherTracerProvider.getTracer('other');

describe('context lead', () => {
  beforeEach(() => {
    otherExporter.reset();
    globalExporter.reset();
  });

  it('should create independent trace if global root span is present', () => {
    const rootSpan = globalTracer.startSpan('global');
    context.with(setSpan(context.active(), rootSpan), () => {
      // assume this is in some other code part but on same callstack
      // startSpan here uses context.active() => using global context manager and therefore rootSpan as parent
      otherTracer.startSpan('other').end();
    });
    rootSpan.end();

    assert.strictEqual(globalExporter.getFinishedSpans().length, 1);
    assert.strictEqual(otherExporter.getFinishedSpans().length, 1);

    assert.strictEqual(globalExporter.getFinishedSpans()[0].parentSpanId, undefined);
    // asserts => other exports a span with parent which will be never exported via other export chain
    assert.strictEqual(otherExporter.getFinishedSpans()[0].parentSpanId, undefined);
  });
});

There two TracerProvider instances are used resulting in two independent export chains. But actually the span exported via "other" export chain refers to the "global" span as parent.
Reason is that startSpan() uses context.active() therefore the global span becomes visible there.

Only if other creates its own context manager and uses it everywhere they are independent.

Using it everywhere is actually hard as e.g. BatchSpanProcessor use global context internally to suppress their exports.

There are several places in our samples/docs/API which illustrate that using more and non global TracerProvider is possible (e.g. InstrumentationAbstract.setTracerProvider(),...)
But to me it looks like this will result in unexpected behavior if this is actually done by users.

Propagator API seems to be not effected as it requires to pass in a context and global context is not used automatically.

I'm not really sure how to tackle this.

  • This only affects the JavaScript OpenTelemetry library
  • This may affect other libraries, but I would like to get opinions here first
@Flarna Flarna added the Discussion Issue or PR that needs/is extended discussion. label Feb 16, 2021
@vmarchaud
Copy link
Member

I'm not sure this is an actual problem end user will face, having different TracerProvider is theorically possible and is required by the spec (i believe), however there were questions at the time if this was an actual use case or just "just-in-case-we-might-need-that-later".

@obecny
Copy link
Member

obecny commented Feb 16, 2021

I'm wondering if the same would happen with web tracer provider and zone context, what context provider are you using in node (we have 2), does it happen for both ?
If so perhaps removing parent from tracer wasn't the best option as then it would not have problem with parent in general.

@Flarna
Copy link
Member Author

Flarna commented Feb 16, 2021

That's independent of Web/Node and which context manager is used.
The root cause is that every Tracer uses the context seen by global context manager on default.

So in short a user of a non global TracerProvider has to pass a context to it's Tracers to avoid this. This can be done either manual by passing around context objects (like in go) or by always using it's own ContextManager.

Maybe it's even worse for ZoneContextManager. I'm not sure but it seems it uses a string key so it could be that one instance of ZoneContextManager leaks into another.
The AsyncHooks/AsyncLocal based context managers should be isolated.

@dyladan
Copy link
Member

dyladan commented Feb 16, 2021

that would also make all the context management methods be on the tracerprovider instead of global correct? after 1.0 it will not be possible to make changes like this so we will have to either document this as a known issue or fix it now.

@Flarna
Copy link
Member Author

Flarna commented Feb 16, 2021

that would also make all the context management methods be on the tracerprovider instead of global correct?

Not really as a TracerProvider has no ContextManager. The ContextManager is optionally created/added during TracerProvider.register().
As far as I know Context should be also usable for Metrics - independent of Trace.

I guess it would be needed to remove any defaulting to global context manager.

I think the majority of users will be not effected as they will use one TracerProvider and that one will be installed global.

For the remaining users doc should help. Clearly state that they are responsible to pass context manually or use their ContextManager everywhere. There might be some gaps left related to suppressInstrumentation() withing SpanProcessor.
Besides that I assume that just using some plugin/instrumentation available in the wild and configure a non global TracerProvider may result in conflicts.

@dyladan
Copy link
Member

dyladan commented Feb 16, 2021

I think it might be better to drop the multiple tracerprovider idea entirely. it is not required by the spec anymore and nothing in our docs says its supported. we can explicitly document it as unsupported

Copy link

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Oct 21, 2024
Copy link

github-actions bot commented Jan 6, 2025

This issue was closed because it has been stale for 14 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Discussion Issue or PR that needs/is extended discussion. stale
Projects
None yet
Development

No branches or pull requests

4 participants