Skip to content
This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Injecting traceparent header even if no listener registered #18

Open
pardahlman opened this issue Aug 15, 2022 · 9 comments
Open

Injecting traceparent header even if no listener registered #18

pardahlman opened this issue Aug 15, 2022 · 9 comments

Comments

@pardahlman
Copy link

We're collecting trace data using OpenTelemetry in our system that uses NServiceBus as a messaging platform. Only a small percentage of all activities are sampled, basically just enough for us to get an understanding of messages flows and bottlenecks in the system.

All log entries produced by the distributed system are enriched with trace id and span id. It is a very powerful tool to filter out logs based on this.

When trying out this package, I noticed that no header information is propagated over NServiceBus if the request isn't sampled. I believe this is due to the call to ActivitySource.StartActivity that returns null if the the sampling result is ActivitySamplingResult.None. This, in turn, is decided by OpenTelemetry (we use ParentBasedSampler, fallbacking to AlwaysOffSamplerto make sure that we take the sampling decision at the edge). Since the activity stored in ICurrentActivityis null, no header information is injected in OutgoingPhysicalMessageDiagnostics.

So, here's the problem: Even though we only sample a small percentage of all calls in open-telemetry, it is still desirable for us to propagate tracing headers over NServiceBus to enrich all log entries. As things work now, the trace "chain" (typically coming over HTTP or other transports) is broken on NServiceBus publish/send.

Looking forward hearing your thoughts on this!

@jbogard
Copy link
Owner

jbogard commented Aug 15, 2022

Interesting, I was just copying the behavior I found in .NET Core. Here's the code for HttpClient:

https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Http/src/System/Net/Http/DiagnosticsHandler.cs#L56

Do you see any major differences in why theirs would always inject the header? From the code it looks like it only includes the header if an activity is created/started.

@pardahlman
Copy link
Author

Without being familiar with changes of DiagnosticsHandler reflected in the main branch, I believe that this piece of code from latest release (tag 6.0.8) is the reason why the headers are propagated over HTTP requests using HttpClient. That particular code was removed in dotnet/runtime@d8c0170 as they started using the ActivitySource API, however it would surprise me if the propagation behavior has been changed.

Anyways, looking closer at the link you provided I see a fallback check based on Activity.Current that I don't see in OutgoingLogicalMessageDiagnostics: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Http/src/System/Net/Http/DiagnosticsHandler.cs#L64-L70

In this package, the fallback to Activity.Current happens in OutgoingPhysicalMessageDiagnostics https://github.com/jbogard/NServiceBus.Extensions.Diagnostics/blob/master/src/NServiceBus.Extensions.Diagnostics/OutgoingPhysicalMessageDiagnostics.cs#L29-L31... I'm not too familiar with threading an control flow in NServiceBus, but we're persisting outgoing messages in the outbox and perhaps they are processed in a background thread with no current activity? (Note this is a loose theory that I haven't had the time to verify)

@jbogard
Copy link
Owner

jbogard commented Aug 17, 2022

Reviewing their code again, there's a LOT that changed in those classes - both for incoming and outgoing requests - for .NET 6. I'll need to revisit the logic inside in the handlers.

@pardahlman
Copy link
Author

I've dug some more in this and made another observations:

There is a similar issue with the creation of activity in IncomingPhysicalMessageDiagnostics. Even with parentId present, no activity is created if no listener is registered to the ActivitySource. It looks to me that this class lacks the checks that ASP.NET Core does for incoming requests. (Note StartActivity calls CreateActivity so it is completely analogue to your code).

My suggestions are:

  1. Instead of having a fallback to Activity.Current in OutgoingPhysicalMessageDiagnostics, we should take inspiration of what DiagnosticsHandler does when the activity is null and implement something similar in OutgoingLogicalMessageDiagnostics.
  2. Update the logic in IncomingPhysicalMessageDiagnostics to mimic the behavior if ASP.NET Core.

I hope that I'll find the time to create a PR with suggested changes so that we can discuss this issue based on those.

@pardahlman
Copy link
Author

Let me know if there is anything I can do clarify my argumentation or help out in assessing how reasonable the proposed change is!

@pardahlman
Copy link
Author

Have you given this any though? I understand if addressing this issue is not a priority, but it would be helpful to get an indication if it is going to be addressed at all or if we should find alternative approaches to propagating activities over NServiceBus.

I believe that the changes made in #20 is still relevant and I can invest some more time in fixing the failing test if you give me some points as to how you wish to solve it.

@jbogard
Copy link
Owner

jbogard commented Oct 25, 2022

The next version of NServiceBus will have OTel and Activity etc out-of-the-box, so I'm going to target my next release of this package to remove any features that they have built-in. I would suggest reviewing their implementation to see if there's any similar issues.

@pardahlman
Copy link
Author

Thanks for getting back to me. I didn't realize that support for open telemetry and activity propagation was coming to NServiceBus.

@jbogard
Copy link
Owner

jbogard commented Oct 26, 2022

Ah yeah, I met with them a few times to help out. Not sure if it's exactly public but docs:

https://docs.particular.net/samples/open-telemetry/

And here's the code:

https://github.com/Particular/NServiceBus/tree/master/src/NServiceBus.Core/OpenTelemetry

I haven't reviewed the code yet but any time a bug comes up here I ping their team in case they'll hit it too.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants