
Missing context propagation using Istio #2890

Closed
wtcampos opened this issue Apr 29, 2021 · 14 comments
Labels: bug (Something isn't working)

@wtcampos

Describe the bug
Context propagation is missing when using the Istio service mesh.

Steps to reproduce
Using the auto-instrumentation agent, context propagation is missing when I make a call through the Istio Gateway acting as Ingress; it looks very similar to #2229. I'm using the latest version of the agent, v1.1.0, and the application uses WebClient/Netty. I also tried RestTemplate and couldn't reproduce the problem there; it works fine with RestTemplate.
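For reference, a rough sketch of the two client styles being compared (the service URL, path, and class name are placeholders, not the real application):

import org.springframework.web.client.RestTemplate;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

class ReproSketch {
    // Reactive call via Spring WebClient (Reactor Netty) - the path where propagation breaks behind Istio.
    Mono<String> viaWebClient() {
        return WebClient.create("http://downstream-service")   // placeholder URL
                .get()
                .uri("/some/path")
                .retrieve()
                .bodyToMono(String.class);
    }

    // Blocking call via RestTemplate - context propagation works fine here.
    String viaRestTemplate() {
        return new RestTemplate()
                .getForObject("http://downstream-service/some/path", String.class);
    }
}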

In the logs, there's an error about an invalid TraceId in the B3 headers:

2021-04-29 09:04:00.617  INFO 1 --- [           main] o.s.b.web.embedded.netty.NettyWebServer  : Netty started on port(s): 8080
2021-04-29 09:04:00.658  INFO 1 --- [           main] c.t.p.s.SatelliteImageProxyApplication   : Started SatelliteImageProxyApplication in 21.854 seconds (JVM running for 26.58)
[opentelemetry.auto.trace 2021-04-29 09:04:09:780 +0000] [reactor-http-epoll-1] DEBUG io.opentelemetry.javaagent.tooling.InstrumentationModule - Applying instrumentation: netty -- io.opentelemetry.javaagent.instrumentation.netty.v4_1.NettyInstrumentationModule on org.springframework.boot.loader.LaunchedURLClassLoader@49ede9c7
[opentelemetry.auto.trace 2021-04-29 09:04:09:784 +0000] [reactor-http-epoll-1] DEBUG io.opentelemetry.javaagent.tooling.AgentInstaller$TransformLoggingListener - Transformed io.netty.bootstrap.ServerBootstrap$ServerBootstrapAcceptor$2 -- org.springframework.boot.loader.LaunchedURLClassLoader@49ede9c7
[opentelemetry.auto.trace 2021-04-29 09:04:09:909 +0000] [reactor-http-epoll-2] DEBUG io.opentelemetry.javaagent.tooling.InstrumentationModule - Applying instrumentation: netty -- io.opentelemetry.javaagent.instrumentation.netty.v4_1.NettyInstrumentationModule on org.springframework.boot.loader.LaunchedURLClassLoader@49ede9c7
[opentelemetry.auto.trace 2021-04-29 09:04:09:914 +0000] [reactor-http-epoll-2] DEBUG io.opentelemetry.javaagent.tooling.AgentInstaller$TransformLoggingListener - Transformed io.netty.handler.codec.http2.Http2ControlFrameLimitEncoder$1 -- org.springframework.boot.loader.LaunchedURLClassLoader@49ede9c7
[opentelemetry.auto.trace 2021-04-29 09:04:09:978 +0000] [reactor-http-epoll-3] DEBUG io.opentelemetry.javaagent.tooling.InstrumentationModule - Applying instrumentation: executor -- io.opentelemetry.javaagent.instrumentation.javaconcurrent.ExecutorInstrumentationModule on org.springframework.boot.loader.LaunchedURLClassLoader@49ede9c7
[opentelemetry.auto.trace 2021-04-29 09:04:09:978 +0000] [reactor-http-epoll-3] DEBUG io.opentelemetry.javaagent.tooling.InstrumentationModule - Applying instrumentation: netty -- io.opentelemetry.javaagent.instrumentation.netty.v4_1.NettyInstrumentationModule on org.springframework.boot.loader.LaunchedURLClassLoader@49ede9c7
[opentelemetry.auto.trace 2021-04-29 09:04:09:990 +0000] [reactor-http-epoll-3] DEBUG io.opentelemetry.javaagent.tooling.AgentInstaller$TransformLoggingListener - Transformed reactor.netty.http.server.HttpTrafficHandler -- org.springframework.boot.loader.LaunchedURLClassLoader@49ede9c7
[opentelemetry.auto.trace 2021-04-29 09:04:10:012 +0000] [reactor-http-epoll-2] DEBUG io.opentelemetry.extension.trace.propagation.B3PropagatorExtractorMultipleHeaders - Invalid TraceId in B3 header: null'. Returning INVALID span context.
[opentelemetry.auto.trace 2021-04-29 09:04:10:012 +0000] [reactor-http-epoll-3] DEBUG io.opentelemetry.extension.trace.propagation.B3PropagatorExtractorMultipleHeaders - Invalid TraceId in B3 header: null'. Returning INVALID span context.

And also some invalid trace IDs:

[opentelemetry.auto.trace 2021-04-29 09:04:10:202 +0000] [reactor-http-epoll-3] DEBUG io.opentelemetry.javaagent.instrumentation.javaconcurrent.AbstractExecutorInstrumentation - Skipping executor instrumentation for reactor.core.scheduler.BoundedElasticScheduler$BoundedScheduledExecutorService
[opentelemetry.auto.trace 2021-04-29 09:04:10:205 +0000] [reactor-http-epoll-3] DEBUG io.opentelemetry.javaagent.instrumentation.api.concurrent.State - Failed to set parent context because another parent context is already set io.opentelemetry.javaagent.instrumentation.api.concurrent.State@770d7219: new: {opentelemetry-trace-span-key=RecordEventsReadableSpan{traceId=8d264684e11f2c8c21b38a49511064ab, spanId=ff36ce890f1f968a, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/prometheus, kind=SERVER, attributes=AttributesMap{data={http.user_agent=Prometheus/2.24.0, http.flavor=1.1, thread.id=18, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-3, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/prometheus, net.peer.port=32986}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013002700, endEpochNanos=0}, opentelemetry-traces-server-span-key=RecordEventsReadableSpan{traceId=8d264684e11f2c8c21b38a49511064ab, spanId=ff36ce890f1f968a, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/prometheus, kind=SERVER, attributes=AttributesMap{data={http.user_agent=Prometheus/2.24.0, http.flavor=1.1, thread.id=18, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-3, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/prometheus, net.peer.port=32986}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013002700, endEpochNanos=0}, thread-propagation-locations=[[Ljava.lang.StackTraceElement;@1c814538]}, old: {opentelemetry-trace-span-key=RecordEventsReadableSpan{traceId=8d264684e11f2c8c21b38a49511064ab, spanId=ff36ce890f1f968a, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/prometheus, kind=SERVER, attributes=AttributesMap{data={http.user_agent=Prometheus/2.24.0, http.flavor=1.1, thread.id=18, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-3, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/prometheus, net.peer.port=32986}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013002700, endEpochNanos=0}, opentelemetry-traces-server-span-key=RecordEventsReadableSpan{traceId=8d264684e11f2c8c21b38a49511064ab, spanId=ff36ce890f1f968a, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/prometheus, kind=SERVER, attributes=AttributesMap{data={http.user_agent=Prometheus/2.24.0, http.flavor=1.1, thread.id=18, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-3, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/prometheus, net.peer.port=32986}, capacity=128, totalAddedValues=9}, 
status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013002700, endEpochNanos=0}, thread-propagation-locations=[[Ljava.lang.StackTraceElement;@45a03270]}
[opentelemetry.auto.trace 2021-04-29 09:04:10:207 +0000] [reactor-http-epoll-2] DEBUG io.opentelemetry.javaagent.instrumentation.api.concurrent.State - Failed to set parent context because another parent context is already set io.opentelemetry.javaagent.instrumentation.api.concurrent.State@e1b9f8c: new: {opentelemetry-trace-span-key=RecordEventsReadableSpan{traceId=c5fd2f1b70095d6dd47d4b48dd7de0a1, spanId=8080c3af52114e2f, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/health/{*path}, kind=SERVER, attributes=AttributesMap{data={http.user_agent=kube-probe/1.18, http.flavor=1.1, thread.id=17, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-2, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/health/readiness, net.peer.port=32982}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013003200, endEpochNanos=0}, opentelemetry-traces-server-span-key=RecordEventsReadableSpan{traceId=c5fd2f1b70095d6dd47d4b48dd7de0a1, spanId=8080c3af52114e2f, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/health/{*path}, kind=SERVER, attributes=AttributesMap{data={http.user_agent=kube-probe/1.18, http.flavor=1.1, thread.id=17, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-2, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/health/readiness, net.peer.port=32982}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013003200, endEpochNanos=0}, thread-propagation-locations=[[Ljava.lang.StackTraceElement;@22132978]}, old: {opentelemetry-trace-span-key=RecordEventsReadableSpan{traceId=c5fd2f1b70095d6dd47d4b48dd7de0a1, spanId=8080c3af52114e2f, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/health/{*path}, kind=SERVER, attributes=AttributesMap{data={http.user_agent=kube-probe/1.18, http.flavor=1.1, thread.id=17, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-2, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/health/readiness, net.peer.port=32982}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013003200, endEpochNanos=0}, opentelemetry-traces-server-span-key=RecordEventsReadableSpan{traceId=c5fd2f1b70095d6dd47d4b48dd7de0a1, spanId=8080c3af52114e2f, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/health/{*path}, kind=SERVER, attributes=AttributesMap{data={http.user_agent=kube-probe/1.18, http.flavor=1.1, thread.id=17, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-2, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/health/readiness, net.peer.port=32982}, capacity=128, 
totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013003200, endEpochNanos=0}, thread-propagation-locations=[[Ljava.lang.StackTraceElement;@191e3d31]}
[opentelemetry.auto.trace 2021-04-29 09:04:10:262 +0000] [boundedElastic-1] DEBUG io.opentelemetry.javaagent.instrumentation.api.concurrent.State - Failed to set parent context because another parent context is already set io.opentelemetry.javaagent.instrumentation.api.concurrent.State@18183d7c: new: {opentelemetry-trace-span-key=RecordEventsReadableSpan{traceId=8d264684e11f2c8c21b38a49511064ab, spanId=ff36ce890f1f968a, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/prometheus, kind=SERVER, attributes=AttributesMap{data={http.user_agent=Prometheus/2.24.0, http.flavor=1.1, thread.id=18, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-3, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/prometheus, net.peer.port=32986}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013002700, endEpochNanos=0}, opentelemetry-traces-server-span-key=RecordEventsReadableSpan{traceId=8d264684e11f2c8c21b38a49511064ab, spanId=ff36ce890f1f968a, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/prometheus, kind=SERVER, attributes=AttributesMap{data={http.user_agent=Prometheus/2.24.0, http.flavor=1.1, thread.id=18, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-3, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/prometheus, net.peer.port=32986}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013002700, endEpochNanos=0}, thread-propagation-locations=[[Ljava.lang.StackTraceElement;@1b47e633]}, old: {opentelemetry-trace-span-key=RecordEventsReadableSpan{traceId=8d264684e11f2c8c21b38a49511064ab, spanId=ff36ce890f1f968a, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/prometheus, kind=SERVER, attributes=AttributesMap{data={http.user_agent=Prometheus/2.24.0, http.flavor=1.1, thread.id=18, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-3, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/prometheus, net.peer.port=32986}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013002700, endEpochNanos=0}, opentelemetry-traces-server-span-key=RecordEventsReadableSpan{traceId=8d264684e11f2c8c21b38a49511064ab, spanId=ff36ce890f1f968a, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/prometheus, kind=SERVER, attributes=AttributesMap{data={http.user_agent=Prometheus/2.24.0, http.flavor=1.1, thread.id=18, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-3, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/prometheus, net.peer.port=32986}, capacity=128, totalAddedValues=9}, 
status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013002700, endEpochNanos=0}, thread-propagation-locations=[[Ljava.lang.StackTraceElement;@1cde4c63]}
[opentelemetry.auto.trace 2021-04-29 09:04:10:267 +0000] [boundedElastic-2] DEBUG io.opentelemetry.javaagent.instrumentation.api.concurrent.State - Failed to set parent context because another parent context is already set io.opentelemetry.javaagent.instrumentation.api.concurrent.State@7428b956: new: {opentelemetry-trace-span-key=RecordEventsReadableSpan{traceId=c5fd2f1b70095d6dd47d4b48dd7de0a1, spanId=8080c3af52114e2f, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/health/{*path}, kind=SERVER, attributes=AttributesMap{data={http.user_agent=kube-probe/1.18, http.flavor=1.1, thread.id=17, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-2, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/health/readiness, net.peer.port=32982}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013003200, endEpochNanos=0}, opentelemetry-traces-server-span-key=RecordEventsReadableSpan{traceId=c5fd2f1b70095d6dd47d4b48dd7de0a1, spanId=8080c3af52114e2f, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/health/{*path}, kind=SERVER, attributes=AttributesMap{data={http.user_agent=kube-probe/1.18, http.flavor=1.1, thread.id=17, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-2, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/health/readiness, net.peer.port=32982}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013003200, endEpochNanos=0}, thread-propagation-locations=[[Ljava.lang.StackTraceElement;@2a1a47d]}, old: {opentelemetry-trace-span-key=RecordEventsReadableSpan{traceId=c5fd2f1b70095d6dd47d4b48dd7de0a1, spanId=8080c3af52114e2f, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/health/{*path}, kind=SERVER, attributes=AttributesMap{data={http.user_agent=kube-probe/1.18, http.flavor=1.1, thread.id=17, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-2, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/health/readiness, net.peer.port=32982}, capacity=128, totalAddedValues=9}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013003200, endEpochNanos=0}, opentelemetry-traces-server-span-key=RecordEventsReadableSpan{traceId=c5fd2f1b70095d6dd47d4b48dd7de0a1, spanId=8080c3af52114e2f, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, name=/actuator/health/{*path}, kind=SERVER, attributes=AttributesMap{data={http.user_agent=kube-probe/1.18, http.flavor=1.1, thread.id=17, net.peer.ip=127.0.0.1, thread.name=reactor-http-epoll-2, http.method=GET, http.client_ip=127.0.0.1, http.url=http://localhost:8080/actuator/health/readiness, net.peer.port=32982}, capacity=128, totalAddedValues=9}, 
status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1619687050013003200, endEpochNanos=0}, thread-propagation-locations=[[Ljava.lang.StackTraceElement;@78fb933a]}

If I do a GET directly against the running pod using kubectl port-forward, the tracing looks good, but if I go through the service mesh and do the same GET via Istio, the traces are broken and not correctly propagated.
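For reference, the direct check bypassing the mesh was along these lines (pod name and path are placeholders):

kubectl port-forward pod/my-app-pod 8080:8080
curl -v http://localhost:8080/some/path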

I'm using b3multi as the propagation format to work with Istio.
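For completeness, this is roughly how b3multi is selected on the agent (a sketch; the exact wiring in my deployment may differ):

# as a JVM system property
java -javaagent:opentelemetry-javaagent.jar -Dotel.propagators=b3multi -jar app.jar

# or as an environment variable on the container
OTEL_PROPAGATORS=b3multi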

What did you expect to see?
The traces correctly propagated from the Ingress all the way to the application.

What did you see instead?
Broken traces: the context is not propagated from the Ingress through to the application.

What version are you using?
v1.1.0

Environment
Compiler: OpenJDK 64-Bit Server VM 18.9 (build 11.0.11+9, mixed mode, sharing)
K8s: v1.19
OS: Debian GNU/Linux 10 (buster)
Docker image: openjdk:11
Runtime: OpenJDK Runtime Environment 18.9 (build 11.0.11+9)

Additional context
This also happened with v1.0.1. It looks like the issue is in the io.opentelemetry.javaagent.netty-4.1 instrumentation, since I don't have any issue with io.opentelemetry.javaagent.apache-httpclient-4.0.

Please let me know if I can provide further information.

wtcampos added the bug (Something isn't working) label on Apr 29, 2021
trask (Member) commented Apr 29, 2021

hi @wtcampos!

can you try with the latest SNAPSHOT build? https://oss.sonatype.org/content/repositories/snapshots/io/opentelemetry/javaagent/opentelemetry-javaagent/1.2.0-SNAPSHOT/opentelemetry-javaagent-1.2.0-20210428.035324-7.jar

if it's still an issue in the latest SNAPSHOT build, it would be most helpful if you could create a GitHub repository with a repro of the issue

wtcampos (Author) commented:

Hello @trask, thanks for the quick reply. I will try the latest version, and if it does not solve the problem I will reproduce it in a separate GH repo.

wtcampos (Author) commented May 3, 2021

Hello @trask, after some debugging we found out the problem occurs when Istio selects http2 for the service via protocol selection (https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/), and it has no relation to the library I mentioned before, sorry for that. I tested with the latest agent build you sent above, but with no luck. Do you know if there's a compatibility issue between HTTP1.1 and HTTP2? There's no problem when the spans come from Istio/Envoy's own instrumentation in Zipkin format; there I can see propagation working across HTTP 1.1 and HTTP2. But when the context needs to be propagated by the auto-instrumentation agent, I get the invalid span context. I can create a repo with a small application to reproduce the problem. What do you think?
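For illustration, a hypothetical Service showing the protocol-selection difference I mean (names are placeholders); Istio infers the protocol from the port name prefix:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - name: http2-web   # Istio treats this port as HTTP/2 -> broken propagation
    # - name: http-web  # plain HTTP -> propagation works
      port: 8080
      targetPort: 8080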

trask (Member) commented May 5, 2021

oh, sorry, I just saw this. Should we consolidate this discussion with #2909?

wtcampos (Author) commented May 5, 2021

Yes, please do it.

trask (Member) commented May 5, 2021

sure, let's discuss on this issue, since it has the most context so far.

can you clarify what Java framework is being used on the client and server side where context is not being propagated correctly across http2?

and whether you are seeing context propagation always failing in this scenario, or sporadically failing?

thx!

wtcampos (Author) commented May 5, 2021

It's not working at all. I'm using the instrumentation built into Istio, and I set b3multi on the auto-instrumentation agent, so both sides use the same context propagation headers. I can see the problem when Envoy is talking HTTP/2; the framework involved is io.opentelemetry.javaagent.netty-4.1. Please check the screenshots:

Using HTTP2 for the communication on the sidecar, the traces are broken:

[screenshot: broken trace in the tracing UI]

And every span shows up as a new trace:

[screenshot: each span appearing as a separate trace]

If I change the service port name on K8s from HTTP2 to HTTP, changing the protocol selection (https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/), it works fine:

[screenshot: correctly propagated trace]

I was checking the headers in the logs, and the HTTP/2 requests differ in a few headers:

x-http2-stream-id: "1"
x-http2-scheme: "HTTP"
and there's no content-length: "0" on the HTTP/2 request.

[screenshot: request headers in the logs]

Please let me know if I can provide further information

trask (Member) commented May 5, 2021

It may be related to #2496. Are you using a higher-level framework on top of netty (e.g. spring webflux)? Are you using netty on the client side also? Any higher-level framework on top of netty on the client side (e.g. spring webclient)?

wtcampos (Author) commented May 5, 2021

Yes, I'm using io.opentelemetry.javaagent.spring-webflux-5.0. But do you know of any limitations regarding HTTP2? Because with the same framework and libraries it works fine when I'm using HTTP1.1.

trask (Member) commented May 5, 2021

But do you know any limitations re HTTP2

Ya, see #2496, in particular the comment about request pipelining.

wtcampos (Author) commented May 5, 2021

But do you know any limitations re HTTP2

Ya, see #2496, in particular the comment about request pipelining.

I'm not sure if it's related, because in my case it works with HTTP1.1 and I never see that concurrency problem. I did a quick test using Tomcat as well and got the same problem with HTTP2, and that was a synchronous connection, so maybe it's related to how Envoy handles the service connection. I will look into it further. Thanks for your help.

danhngo-lx commented:

Hi @wtcampos, did you figure out the problem? I'm having a very similar issue (#4840), but I'm already using HTTP1.1.

trask (Member) commented Jun 26, 2022

hi @wtcampos @danhngo-lx, are you still having this issue? Curious whether it's maybe resolved in a recent javaagent version, and/or if you could provide a repro that we could use to investigate.

danhngo-lx commented:

@trask It's solved for me in a newer version of the javaagent since my comment.
