otelgrpc: Add metrics support to NewServerHandler and NewClientHandler #4356

Merged

fatsheep9146
Contributor

fix #4316

codecov bot commented Oct 1, 2023

Codecov Report

Merging #4356 (da2f358) into main (23181f7) will decrease coverage by 0.2%.
The diff coverage is 70.9%.

@@           Coverage Diff           @@
##            main   #4356     +/-   ##
=======================================
- Coverage   80.9%   80.8%   -0.2%     
=======================================
  Files        150     150             
  Lines      10313   10366     +53     
=======================================
+ Hits        8345    8376     +31     
- Misses      1827    1844     +17     
- Partials     141     146      +5     
Files                                                   Coverage Δ
...ion/google.golang.org/grpc/otelgrpc/interceptor.go   86.2% <100.0%> (ø)
...ogle.golang.org/grpc/otelgrpc/metadata_supplier.go   51.1% <0.0%> (ø)
...n/google.golang.org/grpc/otelgrpc/stats_handler.go   88.4% <76.5%> (-6.9%) ⬇️
...entation/google.golang.org/grpc/otelgrpc/config.go   69.6% <62.5%> (-5.0%) ⬇️

@fatsheep9146 fatsheep9146 force-pushed the otelgrpc-statshandler-support-metric branch from 2218a89 to 43226cb Compare October 7, 2023 05:31
@fatsheep9146 fatsheep9146 changed the title otelgrpc stats.Handler support metric otelgrpc grpc.StatsHandler support metric Oct 7, 2023
@fatsheep9146 fatsheep9146 marked this pull request as ready for review October 7, 2023 05:32
@fatsheep9146 fatsheep9146 requested a review from a team October 7, 2023 05:32
@fatsheep9146 fatsheep9146 force-pushed the otelgrpc-statshandler-support-metric branch 2 times, most recently from b149f52 to 57d33a1 Compare October 7, 2023 07:09
span.End()

metricAttrs = append(metricAttrs, rpcStatusAttr)
r.rpcDuration.Record(context.TODO(), int64(rs.EndTime.Sub(rs.BeginTime)), metric.WithAttributes(metricAttrs...))
Member

Shouldn't we use the ctx passed to handleRPC?

Contributor Author

I tried using ctx, but it doesn't work. I'm still digging into the reason.

Contributor Author

@fatsheep9146 fatsheep9146 Oct 16, 2023

I found the cause.

The cause is that when we record instruments in the HandleRPC function and rs (a stats.RPCStats) is of type *stats.End or *stats.OutTrailer, the context we receive has already been cancelled by grpc-go:

https://github.com/grpc/grpc-go/blob/6e9c88b0acf15de2256c3d3a257869e0669572d9/internal/transport/http2_server.go#L1277

func (t *http2Server) finishStream(s *Stream, rst bool, rstCode http2.ErrCode, hdr *headerFrame, eosReceived bool) {
	// In case stream sending and receiving are invoked in separate
	// goroutines (e.g., bi-directional streaming), cancel needs to be
	// called to interrupt the potential blocking on other goroutines.
	s.cancel()

	oldState := s.swapState(streamDone)

And the metric SDK checks whether ctx.Err() is nil when recording a measurement:
https://github.com/open-telemetry/opentelemetry-go/blob/c047088605b454f172765621d53107f80b1e6417/sdk/metric/instrument.go#L205

func (i *int64Inst) aggregate(ctx context.Context, val int64, s attribute.Set) { // nolint:revive  // okay to shadow pkg with method.
	if err := ctx.Err(); err != nil {
		return
	}
	for _, in := range i.measures {
		in(ctx, val, s)
	}
}

So the aggregate function returns early and the measurement is dropped.

So we should either (1) use a new context object, or (2) record the measurements before *stats.End and *stats.OutTrailer.

But some metrics can only be measured once *stats.End or *stats.OutTrailer arrives, for example duration, requests per RPC, and responses per RPC (in the streaming model, the final request and response counts are only known at *stats.End). So I vote for solution 1.

WDYT? @dashpole @pellared @Sovietaced

Member

And the metric SDK checks whether ctx.Err() is nil when recording a measurement:
https://github.com/open-telemetry/opentelemetry-go/blob/c047088605b454f172765621d53107f80b1e6417/sdk/metric/instrument.go#L205

I am not convinced that this is what the SDK should be doing. @MrAlias WDYT?

Member

@pellared pellared Oct 27, 2023

I created open-telemetry/opentelemetry-go#4671.

For now it can be a context.TODO().

Member

This would be a better solution: open-telemetry/opentelemetry-go#4671 (comment)

@pellared
Member

@fatsheep9146, will you still be able to work on this PR?

@pellared pellared changed the title otelgrpc grpc.StatsHandler support metric otelgrpc: Add metrics support to NewServerHandler and NewClientHandler Oct 27, 2023
@fatsheep9146
Contributor Author

fatsheep9146 commented Oct 29, 2023

@fatsheep9146, will you still be able to work on this PR?

Sorry, I was busy with my wedding last week.
I will continue to work on this issue today. @pellared

Member

@pellared pellared left a comment

I think we are very close 😉

Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
@fatsheep9146 fatsheep9146 force-pushed the otelgrpc-statshandler-support-metric branch from ad22f06 to e455fd2 Compare November 3, 2023 00:50
Member

@pellared pellared left a comment

Thank you again for your contribution 🎉

@pellared pellared merged commit 4c540e0 into open-telemetry:main Nov 6, 2023
20 of 21 checks passed
@Sovietaced
Contributor

@pellared when do you think a release will be cut?

I've been waiting on this or #4356 for a couple months now

@pellared
Member

pellared commented Nov 6, 2023

@pellared when do you think a release will be cut?

I've been waiting on this or #4356 for a couple months now

I hope that it is a matter of at most two weeks. See

Development

Successfully merging this pull request may close these issues.

otelgrpc: Add metrics support for stats handler
5 participants