-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
detect: refactor the detect package #4481
Conversation
This is a draft because there's more changes. The existing detect package ends up causing a conflict so I need to reimplement some of the functionality such as the delegated tracer and the trace recorder. The big focus from this is mentioned in the description. OpenTelemetry lets the tracer and meter providers have multiple outputs and it would be helpful for some of the applications, like buildx, that use this to be able to register additional outputs that aren't part of the detection system. The detection system is mostly meant for user-facing tracing like otel, jaeger, and prometheus while some of the internal systems that need to exist always, like the delegated tracer, can be configured separately. This should fix an issue with buildx where configuring a tracer will cause traces not to be sent to buildkit since the delegated tracer has a lower priority than the otel and jaeger ones. It'll also clean up some of that code to remove some of the global state that's a bit hard to follow. In particular, applications are expected to use I'm going to work on the part of buildx that's meant to use this before I complete this PR just so it's a bit more obvious why this change is helpful. |
39f66df
to
97291d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I'll keep an eye out for the buildx PR as well, I think we'll be able to drop some code in Compose and standardize some of the end-user CLI OTel bootstrap
cmd/buildkitd/main.go
Outdated
if err := c.Shutdown(ctx); err != nil { | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this should always shutdown all closers
[even if one fails] and then return an aggregate error
util/tracing/detect/detect_v2.go
Outdated
} | ||
|
||
func Resource(ctx context.Context) (*resource.Resource, error) { | ||
res, err := resource.Detect(ctx, serviceNameDetector{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think serviceNameDetector
can go away entirely as part of this refactor and resource.Default()
can be used directly (discovered this with @thaJeztah @ moby/moby#46830 (comment))
It looks like
resource.Default()
is sufficient:
- It uses
fromEnv{}
detector internally: https://github.com/open-telemetry/opentelemetry-go/blob/v1.19.0/sdk/resource/resource.go#L208fromEnv{}
detector readsOTEL_SERVICE_NAME
: https://github.com/open-telemetry/opentelemetry-go/blob/v1.19.0/sdk/resource/env.go#L53
(serviceNameDetector
is 2+ years old, so I'm guessing it was written before the SDK supported the standard env var, but maybe there's additional subtlety here I'm unaware of)
See also:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The primary difference from what I understand is there's a slight difference in the logic for the default service name. For the environment variable, both will return the same thing. For the default name, you'll get this difference:
buildkit/vendor/go.opentelemetry.io/otel/sdk/resource/builtin.go
Lines 101 to 105 in 5ecfec1
executable, err := os.Executable() | |
if err != nil { | |
return "unknown_service:go", nil | |
} | |
return "unknown_service:" + filepath.Base(executable), nil |
Will cause the default service name to be the executable path.
buildkit/util/tracing/detect/detect.go
Line 228 in f84cfe3
return filepath.Base(os.Args[0]), nil |
This one uses the base path. There doesn't seem to be a good way to modify how service.name
gets set. If you merge it into the default resource, you overwrite the auto-discovered name. If you merge the default resource into it, this default seems to get populated. Although we might be able to use Default -> custom one -> Environment
and that would likely work. I'll see if I can rework it to remove the serviceNameDetector
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely possible to change this.
In particular, resource.New
and then using the detector options like resource.WithFromEnv()
and resource.WithTelemetrySDK()
.
util/tracing/detect/otlp.go
Outdated
} | ||
|
||
func (o otlpMetricFactory) New() (sdkmetric.Reader, error) { | ||
proto := getEnv("grpc", "OTEL_EXPORTER_OTLP_METRICS_PROTOCOL", "OTEL_EXPORTER_OTLP_PROTOCOL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe OTel changed the default from gRPC -> http/protobuf, so it might make sense to prefer that here since metrics are a new addition?
(BuildKit tracing support was added at a time when gRPC was the default)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea looks like this is the case. We likely will need to be explicit with our own.
https://opentelemetry.io/docs/specs/otel/protocol/exporter/
I'm going to see if the semconv
package has the defaults so we aren't using so many magic strings for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no constant for the default in semconv (seems to be just for attribute names) and I don't see anything in the otlptrace
package. I'm also a bit concerned this might qualify as a breaking change. I do think we should change it to keep the defaults correct according to the spec although we'll need to point it out in the release notes.
97291d3
to
f356b51
Compare
76973cb
to
be13989
Compare
be13989
to
8a1e0d2
Compare
This refactors the detect package with the goal of making it more similar to otel's `autoexport` package and splitting out the additional functionality used by buildkit, like the trace recorder and delegated tracer, to more explicit processors rather than implicit through `autoexport`. This removes the global variables for the trace provider and meter provider along with the global variable for the exporters. This is replaced with functions that create the exporters. The delegated tracer has been removed from detect and moved into the normal tracing util package. This is still used by the command line to send delegated traces, but it's an explicit exporter that's added rather than implicit. Some functions have been renamed mostly to force dependent packages to change their usage rather than have a chance at incorrect usage because the semantics changed. Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
8a1e0d2
to
228e250
Compare
This refactors the detect package with the goal of making it more
similar to otel's
autoexport
package and splitting out the additionalfunctionality used by buildkit, like the trace recorder and delegated
tracer, to more explicit processors rather than implicit through
autoexport
.This removes the global variables for the trace provider and meter
provider along with the global variable for the exporters. This is
replaced with functions that create the exporters. The delegated tracer
has been removed from detect and moved into the normal tracing util
package. This is still used by the command line to send delegated
traces, but it's an explicit exporter that's added rather than implicit.
Some functions have been renamed mostly to force dependent packages to
change their usage rather than have a chance at incorrect usage because
the semantics changed.