-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trace context propagation docs and enable distributed traces by default #1557
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1557 +/- ##
==========================================
- Coverage 71.17% 71.13% -0.04%
==========================================
Files 197 197
Lines 19668 19671 +3
==========================================
- Hits 13998 13993 -5
- Misses 4993 4997 +4
- Partials 677 681 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Will leave it for the @grafsean to approve
docs/sources/distributed-traces.md
Outdated
|
||
Beyla will read any incoming trace context header values, track the Go program execution flow and propagate the trace context by automatically adding the `traceparent` field in outgoing HTTP/gRPC requests. If an application already adds the `taceparent` field in outgoing requests, Beyla will use that value for tracing instead its own generated trace context. If Beyla cannot find an incoming `traceparent` context value, it will generate one according to the W3C specification. | ||
Beyla will read any incoming trace context header values, track the program execution flow and propagate the trace context by automatically adding the `traceparent` field in outgoing HTTP/gRPC requests. If an application already adds the `taceparent` field in outgoing requests, Beyla will use that value for tracing instead its own generated trace context. If Beyla cannot find an incoming `traceparent` context value, it will generate one according to the W3C specification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo taceparent
docs/sources/distributed-traces.md
Outdated
|
||
For TLS encrypted traffic (HTTPS), Beyla is unable to inject the trace information in the outgoing HTTP headers and instead it injects the information | ||
at TCP/IP packet level. Because of this limitation, Beyla is only able to send the trace information to other Beyla instrumented services. L7 proxies | ||
and load balancers will also disrupt the TCP/IP context propagation, becuase the original packets are discarded and replayed downstream. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: becuase
I made some code changes to fix things, do you mind looking over the code changes? I found a bug that our TC code was overriding a manually set TP in an outgoing request. NodeClient tests for this. |
| ---------------------------- | -------------------------------------- | ------- | ------- | | ||
| `enable_context_propagation` | `BEYLA_BPF_ENABLE_CONTEXT_PROPAGATION` | boolean | (true) | | ||
|
||
Enables injecting of the `Traceparent` header value for outgoing HTTP requests, allowing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to link somewhere here to docs/sources/distributed-traces.md
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we forgetting to mention limitations with Java? or we already overcame those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the docs. I'll copy edit when I do a pass over these files.
I think we need to separately document those in limitations.md, since they are not related to the ability to write outgoing header values, but more related to internal thread handoff. |
@@ -134,6 +137,11 @@ static __always_inline void http_get_or_create_trace_info(http_connection_metada | |||
if (meta) { | |||
u32 type = trace_type_from_meta(meta); | |||
set_trace_info_for_connection(conn, type, tp_p); | |||
// If the user code setup traceparent manually, don't interfere and add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Non-rhetorical question) why does this matter? For (outbound) client calls, wouldn't we want to update the outgoing_trace_map
(done in server_or_client_trace()
) so that the tc egress program can refer back to it and write the traceparent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think we shouldn't mess with the user application code. For example, imagine a user has done this in their code:
const options = {
headers: {
"traceparent": ...
}
}
http.get(url, options, (res) => {
... do something
});
They expect that their downstream service will get this traceparent, I don't know why would they do this, but they can. If we don't check for this condition and setup the outgoing trace map, then Beyla will overwrite the span_id of the outgoing request. The trace_id component will be respected, but the span_id will get generated and updated. This span_id manipulation happens because we use the TCP seq/ack for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the same time, this is the user sending the traceparent on their own, all the TC code would do in this case is just duplicate the header, which will have no effect.
@@ -138,11 +138,16 @@ func (p *Tracer) startTC(ctx context.Context) { | |||
return | |||
} | |||
|
|||
if p.cfg.EBPF.UseTCForL7CP { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if instead of trying to enforce this at the level where the tracers are created, we should enforce this at the config validation level, not letting it propagate....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's true, we'll need to change the tests to reflect that, because we now have to explicitly disable ContextPropagation if TCL7 is used. Do you think it's worth the effort since we don't actually support this option?
ah sorry, I thought this is the same document |
Add documentation for the new context propagation support and enable it by default.