Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to trace gRPC calls and log them? #1313

Closed
asarkar opened this issue Oct 2, 2020 · 43 comments
Closed

How to trace gRPC calls and log them? #1313

asarkar opened this issue Oct 2, 2020 · 43 comments

Comments

@asarkar
Copy link

asarkar commented Oct 2, 2020

I'm looking for a way to trace gRPC calls in a Kotlin + Spring Boot application, and for time being, only interested in finding the logs by a trace id. I started with Spring Cloud Sleuth, but immediately hit the bug spring-cloud/spring-cloud-sleuth#1749.

I've this application with a gRPC server (A) that calls a gRPC client (B) that calls an external gRPC service. It all starts with a ServerInterceptor injecting a user id in the Context that is required by the service A; since A doesn't have access to the headers, context is the only way to propagate metadata from the client.

gRPC has OOTB support for OpenCensus, so, for a new project, I'm inclined to use OpenTelemetry if possible. However, my Google searches have turned up absolutely nothing related to my simple question: "How to trace gRPC calls and have them show up in the logs?" At this time, I'm not interested in exporting them to another system like Jaeger.

I'd like to avoid the magic of a javaagent, and instead have the instrumentation supported in the code first hand, like @mabdinur did in #661 for HTTP. This is a preference, and not a must.

Part of the problem might be that I've user-level knowledge of distributed tracing, but by no means I'm an expert. @trask and @anuraaga, from scouring gRPC related issues in this repo, you two seem to be the experts in this domain, so, I'll really appreciate if you can throw me a bone or two to get started. If something reusable comes up in my quest, like the PR 661, I'll be happy to contribute it back to this project.

Using grpc-kotlin v0.1.5, Spring Boot 2.3.4.RELEASE, and Logback.

@anuraaga
Copy link
Contributor

anuraaga commented Oct 3, 2020

Hi @asarkar - unfortunately our current instrumentation for gRPC relies on using the javaagent. We are in the process of splitting out library instrumentation, which can be used without the agent, and gRPC is next on the list. We would essentially provide you a library with a ServerInterceptor and ClientInterceptor implementation you register on a ServerBuilder or ManagedChannelBuilder.

In the meantime, if avoiding the agent isn't a must, maybe you can try playing with it. If you set OTEL_EXPORTER environment variable to logging, it will just log spans.

https://github.com/open-telemetry/opentelemetry-java-instrumentation#logging-exporter

How does that sound? We'll also get to extracting gRPC instrumentation soon, thanks for pushing for it.

@asarkar
Copy link
Author

asarkar commented Oct 3, 2020

@anuraaga
I can’t really wait for the changes you described, so meanwhile, I’ll extract the gRPC instrumentation in a library myself, and submit a PR. I’m very familiar with Spring Boot, so it should not be a problem for me to create a gRPC autoconfiguration by copy-pasting some of the javaagent code. I’ll ask questions if I get stuck in the process, but otherwise, expect to see something in the next two weeks.

It’s not clear to me how all of this ties to Logback; I saw that you made a related PR, can you help me understand how to print the trace/span in console logs?

@anuraaga
Copy link
Contributor

anuraaga commented Oct 3, 2020

@asarkar Thanks for interest in helping! The first step isn't related to Spring actually, we first need to extract out just gRPC library instrumentation without any autoconfiguration, from the agent code here

https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/master/instrumentation/grpc-1.5/src/main/java/io/opentelemetry/instrumentation/auto/grpc

It's splitting out the Interceptor classes, along with helpers, to become a similar structure as our Armeria instrumentation here https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/master/instrumentation/armeria-1.0/library/src/main/java/io/opentelemetry/instrumentation/armeria/v1_0/server/OpenTelemetryService.java

Since it sounds like you need this soon, I was thinking of doing this early next week - I'd rather split out from the start instead of copy-pasting the instrumentation or it'll get hard for us to manage the instrumentation bits. Toughing the auto instrumentation can be a bit tricky though, mostly since we lack good docs for it yet, so if that sounds tricky I'm happy to knock this out.

It’s not clear to me how all of this ties to Logback; I saw that you made a related PR, can you help me understand how to print the trace/span in console logs?

I think you just want to see span information in the logs. We have an exporter that just dumps the spans to stdout. So if that's configured you would see spans

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/master/javaagent-exporters/logging/src/main/java/io/opentelemetry/javaagent/exporters/logging/LoggingExporter.java#L36

@trask It does make me wonder if we really should migrate to slf4j for that exporter. stdout is really only helpful in the most basic of experimenting, but if slf4j it could be used in production too.

@asarkar
Copy link
Author

asarkar commented Oct 3, 2020

Early next week is fine, given my current knowledge and time, I’ll not be able to make it any sooner. If you start something, I’ll be happy to help.
As for logging, what I really meant is any Logback Appender. If you’re looking at SLF4J, you could support any underlying implementation.

@asarkar
Copy link
Author

asarkar commented Oct 3, 2020

BTW, after looking at the armeria project, IMO, the auto and the library could be renamed to spring-boot-autoconfiguration and core, respectively. auto means nothing, really, and all OSS is either a library or a framework, so library is vague too.

@iNikem
Copy link
Contributor

iNikem commented Oct 5, 2020

@asarkar What is a connection between Armeria and Spring Boot?

@asarkar
Copy link
Author

asarkar commented Oct 5, 2020

@iNikem

What is a connection between Armeria and Spring Boot?

I had assumed auto is spring boot autoconfiguration as there is such support in some cases (#661). However, after reviewing the code, it seems that my assumption was incorrect, auto isn’t related to spring boot.

That said, my talking point stands, which is, the name auto doesn’t tell me or anyone what that directory is supposed to contain. Does it have code for automobiles? No one will know unless they dig in.

@iNikem
Copy link
Contributor

iNikem commented Oct 5, 2020

There is contributing doc, which references "Writing instrumentation" doc, which explains the difference between manual and auto instrumentations and explains the folder structure. Have you seen that?

@asarkar
Copy link
Author

asarkar commented Oct 5, 2020

@iNikem I didn't have a reason to read the contrib doc so far, since I hadn't submitted a PR, but now that you referred to it, I'm going to read it.

@trask
Copy link
Member

trask commented Oct 5, 2020

@trask It does make me wonder if we really should migrate to slf4j for that exporter. stdout is really only helpful in the most basic of experimenting, but if slf4j it could be used in production too.

@anuraaga makes sense to me, logging at INFO level, which we enable by default

@trask
Copy link
Member

trask commented Oct 5, 2020

@trask It does make me wonder if we really should migrate to slf4j for that exporter. stdout is really only helpful in the most basic of experimenting, but if slf4j it could be used in production too.

@anuraaga makes sense to me, logging at INFO level, which we enable by default

created #1328 to not forget 😁

@trask
Copy link
Member

trask commented Oct 5, 2020

That said, my talking point stands, which is, the name auto doesn’t tell me or anyone what that directory is supposed to contain. Does it have code for automobiles?

😂 we are planning to rename the auto directory to javaagent, partly due to this issue of auto being too generic #1186 (comment)

@asarkar
Copy link
Author

asarkar commented Oct 5, 2020

So, I started by copy-pasting the code under instrumentation/grpc-1.5, but some classes refer to the instrumentation-api which doesn't seem to be available as an artifact either in Maven Central or Jcenter. The Beta v0.8.0 release notes claim:

Java agent binary and modules are now published at io.opentelemetry.instrumentation.auto:opentelemetry-javaagent/opentelemetry-javaagent-bootstrap/opentelemetry-javaagent-tooling

Question is, "now published at", where?

It appears that the various modules in this repo are so tightly coupled that they can't exist outside the javaagent, which is a kitchen sink for everything. Perhaps this is what @anuraaga referred to earlier when he said "unfortunately our current instrumentation for gRPC relies on using the javaagent."

@anuraaga
Copy link
Contributor

anuraaga commented Oct 6, 2020

@asarkar #1329 will separate out a library to be able to trace gRPC without the java agent.

@asarkar
Copy link
Author

asarkar commented Oct 7, 2020

@anuraaga Now that #1329 is merged, can I give it a shot with 0.9.0-SNAPSHOT, or are there more changes you're working on (I checked the open PRs, and didn't see any related ones, just asking)?
Also, I asked this a few times before but unfortunately didn't get an answer, is there a logback.xml that'd pull the trace/span from the MDC?

@trask
Copy link
Member

trask commented Oct 7, 2020

@anuraaga
Copy link
Contributor

anuraaga commented Oct 8, 2020

@asarkar I sent one more PR to fix the context issue, then that snapshot should be good to give it a try. Thanks for the patience.

@asarkar
Copy link
Author

asarkar commented Oct 8, 2020

OK, but I'm not seeing anything related to metrics in the PR. Metrics is instrumentation too, but not directly related to tracing, so, do I need to create a new ticket for that?

@trask
Copy link
Member

trask commented Oct 9, 2020

hey @asarkar, the OpenTelemetry metric story is still being defined.

I think this is probably the spec issue for what you are looking for: open-telemetry/opentelemetry-specification#654

@asarkar
Copy link
Author

asarkar commented Oct 13, 2020

@iNikem I'm not seeing the snapshot version for whatever @anuraaga has done here https://bintray.com/open-telemetry/maven/opentelemetry-java-instrumentation.

@anuraaga
Copy link
Contributor

@asarkar Sorry for the wrong link that is the bintray we use for released versions, snapshots are here https://oss.jfrog.org/libs-snapshot/io/opentelemetry/instrumentation/opentelemetry-grpc-1.5/

@asarkar
Copy link
Author

asarkar commented Oct 14, 2020

@anuraaga
Debug:

TracingContextUtils.getCurrentSpan().context.spanIdAsHexString = "0000000000000000"
TracingContextUtils.getCurrentSpan().context.traceIdAsHexString = "00000000000000000000000000000000"

logback:

<pattern>
{
...
"trace": "%X{traceId:-}",
"span": "%X{spanId:-}",
...
}
</pattern>

But trace and span are empty.

{"@timestamp":"2020-10-14T05:12:00.175Z","severity":"DEBUG","trace":"","span":"","thread":"grpc-default-executor-2","class":"MyServerInterceptor","msg":"Copying headers to context ","exception":""}

@anuraaga
Copy link
Contributor

Can you check if it's ok in the gRPC handler method itself and only that the span isn't available in the interceptor? If so will try to create a unit test to repro and fix.

@anuraaga
Copy link
Contributor

@asarkar By the way, one unintuitive aspect of gRPC is that interceptors are run in reverse order. So you'll need to make sure you register TracingServerInterceptor last so that it runs first, filling in the span before your interceptor is executed.

@asarkar
Copy link
Author

asarkar commented Oct 14, 2020

I’m aware of the order of interceptors, and I’ve ensured the trace interceptor runs before mine.
What do you mean by the gRPC handler method? If you mean the gRPC service method, the span and trace are 0 there too.

@anuraaga
Copy link
Contributor

@asarkar Ah also have you added a dependency on any exporter package or the OpenTelemetry SDK? Without it on the classpath, tracing will be disabled.

We had some discussion on how this may be confusing here
#846

@asarkar
Copy link
Author

asarkar commented Oct 14, 2020

No, I hadn’t added any exporters, because I don’t need anything exported. As I mentioned before, for now, all I’m looking for is to get the trace and span ids show up in the logs.
What do you recommend adding? Based on the discussion in #846, it seems the sdk isn’t enough.
Perhaps instead of having to keep reminding people, there could be a warning in the logs like SLF4J shows “No SLF4J providers were found.”

@anuraaga
Copy link
Contributor

If you don't need any exporters right now, then just opentelemetry-sdk should be fine it will initialize tracing such that the gRPC and logback instrumentation can use it.

@asarkar
Copy link
Author

asarkar commented Oct 14, 2020

OK, after adding the sdk, I do see valid trace and span ids by putting a breakpoint in the service method.

Interceptor order:

Registering GlobalServerInterceptor: myServerInterceptor (com.mycompany.MyServerInterceptor@4887ad0c) 
Registering GlobalServerInterceptor: tracingServerInterceptor (io.opentelemetry.instrumentation.grpc.v1_5.server.TracingServerInterceptor@4e518070)
Registering GlobalServerInterceptor: metricCollectingServerInterceptor (net.devh.boot.grpc.server.metric.MetricCollectingServerInterceptor@5ff81f39)
Registering GlobalServerInterceptor: grpcRequestScope (net.devh.boot.grpc.server.scope.GrpcRequestScope@598c3ac0)

But the logs show blank trace and span ids, as shown above, which is the last remaining issue. @anuraaga what may I be missing, any idea?

I can't post the code here since it's proprietary, but in the extreme case, I can do a Zoom call and walk through the problem.

@anuraaga
Copy link
Contributor

Nice to hear that @asarkar. Now I guess you just need to add logging instrumentation too.

If you're using logback, the instructions here
https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/master/instrumentation/logback/logback-1.0.0/library

Or if using log4j2, the instructions here
https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/master/instrumentation/log4j/log4j-2.13.2/library

@asarkar
Copy link
Author

asarkar commented Oct 14, 2020

OK, adding the logback custom appender is working! I'll do some more tests using server and client interceptors, and report back. I really appreciate the help so far.

One other thing: apart from gRPC, I also need to instrument Ktor HTTP Client, which you don't have support for. I'll start by using the API and the SDK, and if I find anything missing (like gRPC instrumentation lib was missing), I'll open a new ticket.

@asarkar
Copy link
Author

asarkar commented Oct 17, 2020

This appears to be working, and surprisingly, the Ktor tracing is also working without me doing anything. Closing the ticket.

@asarkar asarkar closed this as completed Oct 17, 2020
@asarkar
Copy link
Author

asarkar commented Oct 17, 2020

@anuraaga I see that 0.9.0 was released, but I don't see in Jcenter or Maven central.

@trask
Copy link
Member

trask commented Oct 17, 2020

hey @asarkar, we aren't publishing to maven central yet, for now you can use

https://bintray.com/open-telemetry/maven/opentelemetry-java-instrumentation

@asarkar
Copy link
Author

asarkar commented Oct 17, 2020

@trask

There are couple of hoops that I needed to jump through I wish were documented in the release notes.

  1. io.opentelemetry and io.opentelemetry.instrumentation group artifacts are not in sync. The former has no 0.9.0 published version, but 0.9.1 (how does that make any sense???), while the latter has no 0.9.1 published version, but 0.9.0. See https://oss.jfrog.org/artifactory/libs-release/.
  2. To be able to pull in io.opentelemetry.instrumentation release JARs, one needs to add a Maven repository https://bintray.com/open-telemetry/maven in their build file. Unlike Maven Central, browsing this URL doesn't show the JARs, but they seem to be there, hidden. Adding jcenter() works.

@EricBuist
Copy link

Hi, requiring this OpenTelemetry agent is problematic for me as well. I'm working on a library integrated into a microservice and controlling the Java command line arguments in order to add the agent, while possible, is not easy. Moreover, I'm worried that the agent could introduce some other changes in the bytecode that won't be expected. Now I'm upset that I'll find other scenarios where the agent is the only way to solve a given problem. I was asked at my job to migrate a library from OpenTracing to OpenTelemetry and feel more and more going backward. OpenTracing has a contrib package with a client/server interceptors. OpenTracing has some semantic conventions; OpenTelemetry semconv is alpha. Using Micrometer seems better than OpenTelemetry Meter+Prometheus alpha exporter. If I was working on a personal project, I would just stop complaining and remain on OpenTracing+Jaeger for the time being, but I prefer to push forward to OpenTelemetry than spending my time and energy justifying why we need to stay with Jaeger.

@trask
Copy link
Member

trask commented Feb 7, 2023

@EricBuist
Copy link

hi @EricBuist! have you seen the grpc library instrumentation? this doesn't require the Java agent: https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/grpc-1.6/library#library-instrumentation-for-grpc-160

This seems to be alpha once again. https://search.maven.org/artifact/io.opentelemetry.instrumentation/opentelemetry-grpc-1.6.
Why are most OpenTelemetry related packages still alpha? semconv, Prometheus exporter, and now this one.

@trask
Copy link
Member

trask commented Feb 9, 2023

the OpenTelemetry instrumentations have always been alpha. the OpenTelemetry project decided that "telemetry stability" is a required part of instrumentation stability, which means the gRPC instrumentation cannot be declared stable until the associated semantic conventions have been declared stable: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/rpc.md#semantic-conventions-for-rpc-spans

@trask
Copy link
Member

trask commented Feb 9, 2023

the prometheus exporter is similarly not stable because the underlying specification hasn't been marked stable yet: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk_exporters/prometheus.md#opentelemetry-metrics-exporter---prometheus

@EricBuist
Copy link

Any reason why the grpc instrumentation library is not declared in the opentelemetry-bom or openelemetry-bom-alpha? Repeating these version numbers for many many many many different dependencies starts to make Maven really annoying to use. I was happy to see there is opentelemetry-bom, but it is a bit pointless if OpenTelemetry dependencies may or may not be in the BOM.

@trask
Copy link
Member

trask commented Feb 20, 2023

hi @EricBuist! check out opentelemetry-instrumentation-bom and opentelemetry-instrumentation-bom-alpha

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants