detect: refactor the detect package #4481

Merged (1 commit) on May 7, 2024

Collaborator

@jsternberg jsternberg commented Dec 12, 2023

This refactors the detect package with the goal of making it more
similar to otel's autoexport package and splitting out the additional
functionality used by buildkit, like the trace recorder and delegated
tracer, to more explicit processors rather than implicit through
autoexport.

This removes the global variables for the trace provider and meter
provider along with the global variable for the exporters. This is
replaced with functions that create the exporters. The delegated tracer
has been removed from detect and moved into the normal tracing util
package. This is still used by the command line to send delegated
traces, but it's an explicit exporter that's added rather than implicit.

Some functions have been renamed mostly to force dependent packages to
change their usage rather than have a chance at incorrect usage because
the semantics changed.

@jsternberg
Collaborator Author

This is a draft because there are more changes to make. The existing detect package ends up causing a conflict, so I need to reimplement some of its functionality, such as the delegated tracer and the trace recorder.

The main focus is described in the PR description. OpenTelemetry lets the tracer and meter providers have multiple outputs, and it would be helpful for applications that use this package, like buildx, to be able to register additional outputs that aren't part of the detection system. The detection system is mostly meant for user-facing tracing backends like otel, jaeger, and prometheus, while internal systems that always need to exist, like the delegated tracer, can be configured separately.

This should fix an issue with buildx where configuring a tracer will cause traces not to be sent to buildkit since the delegated tracer has a lower priority than the otel and jaeger ones. It'll also clean up some of that code to remove some of the global state that's a bit hard to follow. In particular, applications are expected to use detect.NewTracerProvider() instead of a global access function.

I'm going to work on the part of buildx that's meant to use this before I complete this PR just so it's a bit more obvious why this change is helpful.
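
To make the priority problem described in this comment concrete, here is a minimal, stdlib-only sketch. Everything in it is hypothetical: `exporter`, `recorder`, `detectOne`, and `exportAll` are illustration names and are not part of the detect package or the OpenTelemetry SDK. It only models the idea that priority-based detection keeps a single exporter, while an explicitly added exporter always receives spans.

```go
package main

import "fmt"

// exporter is a hypothetical stand-in for a span exporter.
// It is NOT the OpenTelemetry interface.
type exporter interface {
	Export(span string)
}

// recorder remembers every span it receives.
type recorder struct {
	spans []string
}

func (r *recorder) Export(span string) { r.spans = append(r.spans, span) }

// detectOne models priority-based detection: the first candidate wins
// and every lower-priority candidate is dropped.
func detectOne(candidates []exporter) []exporter {
	if len(candidates) == 0 {
		return nil
	}
	return []exporter{candidates[0]}
}

// exportAll sends a span to every configured output.
func exportAll(span string, outs []exporter) {
	for _, o := range outs {
		o.Export(span)
	}
}

func main() {
	otlp := &recorder{}
	delegated := &recorder{}

	// Implicit: the delegated exporter competes in detection and loses.
	exportAll("build", detectOne([]exporter{otlp, delegated}))
	fmt.Println(len(otlp.spans), len(delegated.spans)) // 1 0

	// Explicit: detection picks the user-facing exporter and the
	// delegated exporter is registered separately.
	outs := append(detectOne([]exporter{otlp}), delegated)
	exportAll("solve", outs)
	fmt.Println(len(otlp.spans), len(delegated.spans)) // 2 1
}
```

In the second half the delegated output no longer depends on what the user configured, which is the behavior the description asks for.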

Contributor

@milas milas left a comment

LGTM. I'll keep an eye out for the buildx PR as well; I think we'll be able to drop some code in Compose and standardize some of the end-user CLI OTel bootstrap.

Comment on lines 947 to 950
	if err := c.Shutdown(ctx); err != nil {
		return err
	}
Contributor

I feel like this should always shut down all closers [even if one fails] and then return an aggregate error.

}

func Resource(ctx context.Context) (*resource.Resource, error) {
	res, err := resource.Detect(ctx, serviceNameDetector{})
Contributor

@milas milas Dec 12, 2023

I think serviceNameDetector can go away entirely as part of this refactor and resource.Default() can be used directly (discovered this with @thaJeztah @ moby/moby#46830 (comment))

It looks like resource.Default() is sufficient:

(serviceNameDetector is 2+ years old, so I'm guessing it was written before the SDK supported the standard env var, but maybe there's additional subtlety here I'm unaware of)

See also:

Collaborator Author

The primary difference, from what I understand, is a slight difference in the logic for the default service name. When the environment variable is set, both return the same thing. For the default name, you get this difference:

	executable, err := os.Executable()
	if err != nil {
		return "unknown_service:go", nil
	}
	return "unknown_service:" + filepath.Base(executable), nil

This makes the default service name unknown_service: plus the base name of the executable.

	return filepath.Base(os.Args[0]), nil

This one uses the base name of os.Args[0] on its own. There doesn't seem to be a good way to modify how service.name gets set. If you merge it into the default resource, you overwrite the auto-discovered name. If you merge the default resource into it, this default seems to get populated. That said, we might be able to use Default -> custom one -> Environment, and that would likely work. I'll see if I can rework it to remove the serviceNameDetector.

Collaborator Author

It's definitely possible to change this, in particular by calling resource.New with detector options like resource.WithFromEnv() and resource.WithTelemetrySDK().

}

func (o otlpMetricFactory) New() (sdkmetric.Reader, error) {
	proto := getEnv("grpc", "OTEL_EXPORTER_OTLP_METRICS_PROTOCOL", "OTEL_EXPORTER_OTLP_PROTOCOL")
Contributor

I believe OTel changed the default from gRPC -> http/protobuf, so it might make sense to prefer that here since metrics are a new addition?

(BuildKit tracing support was added at a time when gRPC was the default)

Collaborator Author

Yeah, it looks like this is the case. We'll likely need to be explicit about our own default.

https://opentelemetry.io/docs/specs/otel/protocol/exporter/

I'm going to see if the semconv package has the defaults so we aren't using so many magic strings for this.

Collaborator Author

There's no constant for the default in semconv (it seems to cover only attribute names), and I don't see anything in the otlptrace package. I'm also a bit concerned this might qualify as a breaking change. I do think we should change it to keep the defaults correct according to the spec, although we'll need to point it out in the release notes.

@jsternberg jsternberg changed the title detect: refactor detect package detect: refactor the detect package Mar 27, 2024
@jsternberg jsternberg force-pushed the detect-refactor branch 5 times, most recently from 76973cb to be13989 Compare March 29, 2024 19:41
@jsternberg jsternberg marked this pull request as ready for review April 1, 2024 20:19
@jsternberg jsternberg requested review from milas, tonistiigi and thaJeztah and removed request for milas April 1, 2024 20:20
Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
@tonistiigi tonistiigi merged commit 1ffc790 into moby:master May 7, 2024
73 checks passed
@jsternberg jsternberg deleted the detect-refactor branch May 8, 2024 14:10