-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Evaluating how to support tracing and OpenTelemetry #776
Comments
I also suggest looking into the new Tracing API on Activity which should be part of .NET 5 release. |
Yeah, that looks good. A good first step would be to add basic Activity support, ETW and event counters. I'm planning to look at that soon-ish. This would be a logical next step after that. |
Hi @stebet, just wanted to follow-up on your previous note and see if you had any sorta update or a product roadmap you could share around Activity support? |
@jaycdave88 right now the plan is to investigate System.Diagnostics.DiagnosticSource. This may or may not happen by 7.0 but can go into a |
@michaelklishin Depending on the timeline, definitely check out the new ActivitySource API. Very similar to DiagnosticSource, but it supports sampling and you can write tags/attributes on the events instead of writing raw objects (which people usually need to consume reflectively). It releases with .NET 5 (~November), but you don't need .NET 5 to use it. The NuGet package targets all actively supported .NET versions. The main benefit to using ActivitySource is it means instrumentation libraries like OpenTelemetry will work without needing to build an adapter or anything. Also happy to help with this if you are open to it. |
If ActivitySource will support all then/currently supported versions of .NET, we should consider it. A proof-of-concept PR would be appreciated (I assume there are preview releases of some kind that developers can try?) |
ActivitySource should be the correct one to look at. I spiked this myself a few months back, time to revisit :) What would people consider good activities/counters to start with? Here are a few rough ideas:
|
|
I think something like |
Just to clarify - ActivitySource/Activity API is for tracing , and not for metrics. Things like BytesSent, etc are metric, which require a metric API. .NET has |
@cijothomas Yep. I'm aware of the differences. We're looking to integrate both the new dotnet counters (for metrics) as well as support for Activity Tracing. Tracing will come later since the OpenTelemetry is currently being worked on but I'm watching it closely :) |
@stebet Thanks! Please also check the Metric plans in .NET 6 as well. |
Thanks for the info :) |
Also consider adopting the recommended tags and naming conventions from OpenTelemetry. And if you need an example of dealing with ActivitySource, I added support for the Trace Context standard and activity source for NServiceBus: |
I wanted to play a bit with it and found that the PollingCounters are only NS2.1 / .net core 3.0. |
Let's make a distinction between activities and counters since they aren't the same thing (although related). There are Event counters in NS2.0 targets, but it's a lot less capable. It'd be fine by me if we only had counters for NS2.1+ and skip them for NS2.0, which is on the way out eventually anyway. Activities can work for both targets. |
Shall I create a new one for the counters, so we don't mix them up in here?
Are there? I wasn't able to find any useable classes?
Works for me 👍 |
When using OpenTelemetry for tracing across services, the span context needs to be channeled over the wire. Will this issue contain the encoding (while publishing) and decoding (while receiving) from and to the .NET Activity framework be included here or is this expected to be done somewhere else? To my limited insights, I do not see a hookup infrastructure where a third party could include a middleware into the bare RabbitMQ.Client when sending/receiving messages. I am currently building a middleware layer on top of RabbitMQ and wonder, whether I should write a "default" middleware doing the OpenTelemetry decoding. The code how this OpenTelemetry span context propagation works without support from the |
We similarly have a wrapper around RabbitMQ.Client we've been using for a few years (that exposes consumers/exchanges as We currently extract/inject trace context into message headers on consume/publish but it would be nice if the base client library handled that as well as the Where I'm uncertain is how to handle the length of the Activity (we only use Consumer based flows, but manual Get would likely need to be different?). There is the period from Prefetch to the Consumer implementation finishing the event/callback. The "processing" of the message could either be inside that span, resulting in an ack/nack before returning from the Consumer event/callback or, as in our framework, the I suppose that the client side queue could be modeled in OpenTelemetry as another Consumer-type span with a parent as the Consumer event/callback span that would be completed as soon as the event/callback span has placed the message in the in-process queue. |
Is this issue well understood? Are there any design blockers getting this through? |
@davidfowl it has been over two years since it was last discussed. So no, it is not well defined. |
I've been following this so far https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/examples/MicroserviceExample/Utils/Messaging It's an example on the open telemetry dotnet repo that uses RabbitMQ.Client and adds the appropriate activities to the producer and consumer. If we start with this as the proposal, what issues would we run into? |
I think it may be worth to share with you the OpenTelemetry semantic conventions (they are experimental, but there is nothing better) for telemetry produced for RabbitMQ: |
That question can only be answered by trying it but this looks like an example, not something that can or should be taken straight into production. This example has some curious decisions around how it uses this client. Instead of inheriting from/extending I assume that everything around how they use activities is reasonable. The question for library maintains use how do you integrate that in a way that's not too restrictive and does not push OpenTelemetry down everyone's throat. |
I'd look for clients for other popular data services (PostgreSQL, Redis, ElasticSearch, etcd, et cetera) to learn from, these examples only demonstrate some key OpenTelemetry API elements (in C#). |
@pellared that's helpful, thanks. Now someone who needs this feature should be able to come up with an API update proposal for this client, which won't be trivial. Team RabbitMQ likely won't get to work on this any time soon but this is open source software. |
This can be contributed, it’s extremely important to get our core ecosystem libraries instrumented. Let’s get on the same page about how to accomplish this. The way .NET integrates with open telemetry allows library authors to build support without direct dependencies on otel. Even though the semantic conventions are not final, you want to track and use them. This is what we are doing in .NET and making changes as they change . |
The other alternative would be to build it an instrumentation library on top, but, that’s not ideal and you’d need the appropriate hooks in order to do it from the outside of the library. This would be a point in time solution. This isn’t the way we want libraries to integrate with our telemetry systems. |
This is already in progress for the most part here: #1261 Had some comments from @lmolkova as well that I need to address. I'll take a look at this again soon. Would love more feedback as well. Some attributes need to be updated to be in line with the standards (it's been awhile since I last synced), but this is working for the most part. CC: @davidfowl @michaelklishin |
Also, implementing it like this via. ActivitySource shouldn't require any API changes. A good PR after the initial implentation would be to add OTel Metrics/Logs support as well. |
That looks promising, is there anything blocking this change? |
Mostly cleanup and responding to comments. I'll get to it today :) |
This is something that I have been wondering about for some time on what would be the best approach for observability.
Since debugging a library like this can be hard, I wonder what we'd want to do to make this easier.
I'd really like to suggest using System.Diagnostics.DiagnosticSource to provide parity with a lot of other .NET libraries and frameworks (both Core and Framework).
Using this library it becomes a lot easier to plug in logging frameworks and profiling tools to monitor and measure operations within the library. It also has the added benefit to have next to no overhead when no-one is listening for the events, since it is exposing ETW events (and in .NET Core, can expose event counters).
The text was updated successfully, but these errors were encountered: