Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracing: refactor the trace recorder #4756

Merged
merged 1 commit into from
Mar 29, 2024

Conversation

jsternberg
Copy link
Collaborator

The trace recorder is now a separate entity and does not wrap another span exporter. It is added as another span processor to the underlying tracer provider so the two can be disconnected from each other and this removes the need to use detect.Exporter() to find the TraceRecorder along with the need to invoke flush manually.

The trace recorder is wrapped with a simple span processor instead of the batch one. That makes the flushing irrelevant for this purpose.

The trace recorder itself is also modified to avoid leaking goroutines and respecting context cancellations.

This is part of a general refactor of the detect package to separate the buildkit-specific functionality from the external exporter detection.

@jsternberg jsternberg force-pushed the trace-recorder-refactor branch 2 times, most recently from 602bb11 to 3cf5e76 Compare March 13, 2024 20:36
@jsternberg jsternberg marked this pull request as ready for review March 26, 2024 16:26
type TraceRecorder struct {
sdktrace.SpanExporter
// mu is a channel that acts as a mutex for this struct.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mu means mutex type. Maybe you are looking for https://pkg.go.dev/golang.org/x/sync/semaphore#Weighted.Acquire ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea this will work fine.


spanCtx := trace.SpanContextFromContext(ctx)
if !spanCtx.IsValid() {
return emptyTraceSpans
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this stub instead of just returning nil like above?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original idea was because then this wouldn't trigger a "no trace recorder" error. I looked through the original path and it's probably just easier if we return nil everywhere than the stub empty function so I've removed this.


if err := r.lock(ctx); err != nil {
return emptyTraceSpans
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can exit out because of a context error, we should return that error.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

func (r *TraceRecorder) Shutdown(ctx context.Context) error {
if r.SpanExporter == nil {
// Initiate the shutdown of the gc loop.
r.shutdown(context.Canceled)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add errors.WithStack()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

func (r *TraceRecorder) Shutdown(ctx context.Context) error {
if r.SpanExporter == nil {
// Initiate the shutdown of the gc loop.
r.shutdown(context.Canceled)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add errors.WithStack()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deja vu github? If there's an area where I've missed adding errors.WithStack please tell me, but github seems to be showing me the same comment twice?

select {
case <-ctx.Done():
return
case now := <-ticker.C:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a difference in here from old code that under load this ticker will not start to slow down.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean. I believe ticker automatically drops ticks or slows down for slow receivers: https://pkg.go.dev/time#Ticker

Looking at the old code, it does appear that the old one would have 1 minute as the initial timer and then 50 seconds for every subsequent interval. I'm not sure if that was intentional or not.

mu chan struct{}

// shutdown function for the gc.
shutdown func(err error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name it shutdownGC

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@jsternberg jsternberg force-pushed the trace-recorder-refactor branch 2 times, most recently from 0c466e1 to a0eb0d1 Compare March 27, 2024 15:23
@jsternberg jsternberg requested a review from tonistiigi March 27, 2024 15:34
// lock is a binary semaphore for this struct.
// This is used instead of sync.Mutex because it allows
// for context cancellation to work properly.
lock *semaphore.Weighted
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would use a sem or sema name in here just so that there isn't any confusion about the Locker interface patterns.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

r.mu.Lock()
defer r.mu.Unlock()
func (r *TraceRecorder) Record(ctx context.Context) (func() []tracetest.SpanStub, error) {
// Just a simplification for usage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Just a simplification for usage."

That looks more like a review-help comment than code comment. As reader of code does not know the history.

"If the trace recorder was never initialized, return no function."

This sentence can be left in. Or it can be turned into function comment `// Record starts to record all the spans from the trace associated with the current span context and returns a function that returns all such spans when called. If the trace recorder was never initialized, return no function."

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@jsternberg jsternberg force-pushed the trace-recorder-refactor branch from a0eb0d1 to 22033ba Compare March 27, 2024 16:17
The trace recorder is now a separate entity and does not wrap another
span exporter. It is added as another span processor to the underlying
tracer provider so the two can be disconnected from each other and this
removes the need to use `detect.Exporter()` to find the TraceRecorder
along with the need to invoke flush manually.

The trace recorder is wrapped with a simple span processor instead of
the batch one. That makes the flushing irrelevant for this purpose.

The trace recorder itself is also modified to avoid leaking goroutines
and respecting context cancellations.

This is part of a general refactor of the detect package to separate the
buildkit-specific functionality from the external exporter detection.

Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
@jsternberg jsternberg force-pushed the trace-recorder-refactor branch from 22033ba to 0e1cf1c Compare March 27, 2024 21:00
@jsternberg jsternberg requested a review from tonistiigi March 27, 2024 21:00
@tonistiigi tonistiigi merged commit 69ad98e into moby:master Mar 29, 2024
72 checks passed
@jsternberg jsternberg deleted the trace-recorder-refactor branch March 29, 2024 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants