Memory leak caused by otelhttp when no Meter Provider is configured #5190

Comments
We also ran into this issue at Arista.
see open-telemetry/opentelemetry-go-contrib#5190. Signed-off-by: Kenny Leung <kleung@chainguard.dev>
Sorry I didn't see this the first time around. Taking a look now.
I am able to reproduce the issue with these benchmarks:

```go
package bench

import (
	"testing"

	"go.opentelemetry.io/otel"
)

const (
	meterName  = "foo"
	metricName = "bar"
)

func BenchmarkSimple(b *testing.B) {
	b.ReportAllocs()
	meter := otel.Meter(meterName)
	for i := 0; i < b.N; i++ {
		meter.Int64Counter(metricName)
	}
}
```

It has 1 allocation per iteration. Since we are repeatedly creating the same instrument, it should have 0 allocations. If I replace …

We should instead maintain a map of instruments, and ensure that we only keep unique instruments, similar to what we did for the SDK.
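As an illustration of that idea, here is a hedged sketch of instrument deduplication. The names instKey and cachingMeter are invented for this example, and the actual fix in the otel global package is implemented differently:

```go
package main

import (
	"fmt"
	"sync"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/metric"
)

// instKey identifies an instrument; repeated requests for the same key
// should return the cached instrument instead of retaining a new one.
type instKey struct {
	name, description, unit string
}

// cachingMeter is an illustrative wrapper around any metric.Meter.
type cachingMeter struct {
	metric.Meter // underlying meter (here, the global meter)

	mu       sync.Mutex
	counters map[instKey]metric.Int64Counter
}

// Int64Counter returns a previously created counter when the same
// name/description/unit is requested again.
func (m *cachingMeter) Int64Counter(name string, opts ...metric.Int64CounterOption) (metric.Int64Counter, error) {
	cfg := metric.NewInt64CounterConfig(opts...)
	key := instKey{name: name, description: cfg.Description(), unit: cfg.Unit()}

	m.mu.Lock()
	defer m.mu.Unlock()
	if c, ok := m.counters[key]; ok {
		return c, nil // duplicate request: reuse the cached instrument
	}
	c, err := m.Meter.Int64Counter(name, opts...)
	if err != nil {
		return nil, err
	}
	if m.counters == nil {
		m.counters = make(map[instKey]metric.Int64Counter)
	}
	m.counters[key] = c
	return c, nil
}

func main() {
	m := &cachingMeter{Meter: otel.Meter("foo")}
	c1, _ := m.Int64Counter("bar")
	c2, _ := m.Int64Counter("bar")
	fmt.Println(c1 == c2) // true: only one instrument is retained
}
```

The point is only that a second request for the same name/description/unit returns the cached instrument instead of appending another one to an internal list.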
The workaround provided in the issue description is correct.
Looking back at my notes from when I wrote this ticket, I came to the same conclusion, so I think you're on the right lines here. Thank you for confirming the issue and looking into it 🙏
Fixed by open-telemetry/opentelemetry-go#5754
I am pretty sure this is still an issue, or something else in the Go OTel ecosystem. I will get a pprof setup possibly tomorrow, but here's some anecdotal evidence I have: it is pretty easy to see from our memory usage graph when tracing was implemented. And yes, I have removed our tracing implementation and memory usage is back to normal. Here is a rough draft of our setup. Please let me know if I am doing anything egregiously dumb, but for the most part it's all pretty standard stuff taken from various docs.

We are wrapping otelhttp.NewHandler around the top-level muxer, so everything is traced. Yes, I know this is expensive, but it shouldn't leak memory. Eventually we will change this to include/exclude/drop stuff, just so we aren't taking in so much volume (ping routes, health checks, etc.), and do more aggressive downsampling.

```go
func NewApi(c config.Config) *Api {
	return &Api{
		c:           &c,
		controllers: []Controller{newControllers(c)},
		server: &http.Server{
			ReadTimeout:  c.ReadTimeout,
			WriteTimeout: c.WriteTimeout,
			IdleTimeout:  c.IdleTimeout,
			Addr:         fmt.Sprintf(":%d", c.Port),
			Handler:      otelhttp.NewHandler(chi.NewMux(), "INGRESS", otelhttp.WithFilter(traceFilter)),
		},
		done:       make(chan bool),
		sigChannel: make(chan os.Signal, 1024),
	}
}
```

Here is how we are initializing our trace and metrics providers once on boot:

```go
// TracerProvider creates an OTLP trace exporter and configures the corresponding tracer provider.
func TracerProvider(ctx context.Context, res *resource.Resource) (func(context.Context) error, error) {
	// Set up a trace exporter.
	traceExporter, err := otlptrace.New(ctx, otlptracegrpc.NewClient())
	if err != nil {
		return nil, errors.Wrap(err, "failed to create trace exporter")
	}

	// Register the trace exporter with a TracerProvider, using a batch
	// span processor to aggregate spans before export.
	tracerProvider := sdktrace.NewTracerProvider(
		sdktrace.WithSampler(sdktrace.AlwaysSample()),
		sdktrace.WithResource(res),
		sdktrace.WithBatcher(traceExporter),
	)
	otel.SetTracerProvider(tracerProvider)

	otel.SetTextMapPropagator(
		propagation.NewCompositeTextMapPropagator(
			propagation.TraceContext{},
			propagation.Baggage{},
		))

	// Shutdown will flush any remaining spans and shut down the exporter.
	return tracerProvider.Shutdown, nil
}

// MeterProvider creates an OTLP metric exporter and configures the corresponding meter provider.
func MeterProvider(ctx context.Context, res *resource.Resource) (func(context.Context) error, error) {
	metricExporter, err := otlpmetricgrpc.New(ctx)
	if err != nil {
		return nil, errors.Wrap(err, "failed to create metric exporter")
	}

	meterProvider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(metricExporter)),
		sdkmetric.WithResource(res),
	)
	otel.SetMeterProvider(meterProvider)
	return meterProvider.Shutdown, nil
}
```

These are then called, and shut down, in main:

```go
shutDownTracer, err := traceinstrument.TracerProvider(ctx, traceRes)
if err != nil {
	log.Logger.Fatal("failed to create trace provider", zap.Error(err))
}
defer func(onShutdown func(ctx context.Context) error) {
	if errr := onShutdown(ctx); errr != nil {
		log.Logger.Error("error shutting down trace provider", zap.Error(errr))
	}
}(shutDownTracer)

shutdownTraceMetrics, err := traceinstrument.MeterProvider(ctx, traceRes)
if err != nil {
	log.Logger.Fatal("failed to create meter provider", zap.Error(err))
}
defer func(onShutdown func(ctx context.Context) error) {
	if errr := onShutdown(ctx); errr != nil {
		log.Logger.Error("error shutting down metrics provider", zap.Error(errr))
	}
}(shutdownTraceMetrics)
```

Note: we are also using otelhttp.NewTransport to wrap the default logging transport:

```go
http.DefaultTransport = otelhttp.NewTransport(http.DefaultTransport)
```

If we remove the tracing setup, memory usage goes back to normal, so the leak is definitely in our tracing setup.
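For the pprof setup mentioned above, here is a minimal sketch using the standard library net/http/pprof package; the separate listener and port 6060 are illustrative choices, not part of the setup shown above:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Expose pprof on a separate, non-instrumented listener so profiling
	// requests are not wrapped by the otelhttp handler.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... start the instrumented API server here ...
	select {}
}
```

A heap profile can then be pulled with go tool pprof http://localhost:6060/debug/pprof/heap and compared before and after the tracing setup is enabled.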
You should open a new issue for this, as this isn't the same issue since you're configuring a meter provider.
Description
When calling otelhttp.NewTransport() multiple times, memory is leaked. Supposedly this was resolved by open-telemetry/opentelemetry-go#4820, but I do not think this fixes the issue when no meter provider has been configured and the default global delegating meter provider is in place.
Related: #4226
Introduced by: #4707
Environment
Steps To Reproduce
Run each of these with test.memprofile and analyze the in-use heap objects/space difference between them. Alternatively, write an application which calls otelhttp.NewTransport and analyze heap usage.
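For example, a benchmark along these lines (a sketch; the benchmarks referenced above are not included in this issue text) exercises the leaking path when no meter provider has been configured:

```go
package bench

import (
	"net/http"
	"testing"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

// No MeterProvider is configured, so instrument creation goes through the
// global delegating provider, which holds on to what it creates.
func BenchmarkNewTransport(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = otelhttp.NewTransport(http.DefaultTransport)
	}
}
```

Running it with go test -bench BenchmarkNewTransport -memprofile mem.out and opening the profile with go tool pprof should show the retained instrument objects growing with b.N.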
Expected behavior
Calling otelhttp.NewTransport() repeatedly doesn't leak memory.
Workaround
Call otel.SetMeterProvider(noop.MeterProvider{}) at the start of your application.
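A minimal sketch of the workaround; noop.NewMeterProvider() from go.opentelemetry.io/otel/metric/noop returns the same noop.MeterProvider{} value named above, and the rest of the program is illustrative:

```go
package main

import (
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/metric/noop"
)

func main() {
	// Install a no-op meter provider before any otelhttp constructors run,
	// so instruments are created against it rather than accumulating in the
	// global delegating provider.
	otel.SetMeterProvider(noop.NewMeterProvider())

	http.DefaultTransport = otelhttp.NewTransport(http.DefaultTransport)
}
```

Note that this disables otelhttp metrics entirely, which is the trade-off of the workaround until a real SDK meter provider is configured instead.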