Implemented a "group by trace" processor. #1362
Codecov Report
@@            Coverage Diff             @@
##           master    #1362      +/-   ##
==========================================
+ Coverage   91.35%   91.49%   +0.14%
==========================================
  Files         240      245       +5
  Lines       16744    17045     +301
==========================================
+ Hits        15297    15596     +299
- Misses       1044     1045       +1
- Partials      403      404       +1
@pjanotti, @bogdandrutu, @tigrannajaryan, would one of you please review this one? I marked it as a draft for now, as there might be things to discuss, but from my perspective, this is ready for review.
@owais, would you be able to look into this one, please?
Can you expand on the use case in the README? From the linked issue, it looks like it's for tail-based sampling.
Is this a common enough use case to go into core, or should it go in contrib?
0258c4f to 18f6be3
My idea is that once this processor gets stable enough, the tail-based sampler starts using it instead of having its own similar logic.
I just realized that I left a change to |
29b9c15 to bd80239
A few initial comments on the design.
Using the pdata model without conversion to OTLP might be better, and that will allow you to use TraceID.String() instead of hashing.
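To illustrate the suggestion, here is a minimal sketch of grouping spans with the string form of the trace ID as the map key, so no separate hashing step is needed. The `TraceID`, `Span`, and `groupByTrace` names are hypothetical stand-ins for illustration only; they are not the collector's pdata types or the code in this PR.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// TraceID is a hypothetical stand-in for a 16-byte trace ID type.
type TraceID [16]byte

// String returns the hex form of the ID, which can be used directly as a map key.
func (t TraceID) String() string { return hex.EncodeToString(t[:]) }

// Span is a hypothetical, simplified span carrying only what the sketch needs.
type Span struct {
	TraceID TraceID
	Name    string
}

// groupByTrace buckets spans by the string form of their trace ID.
func groupByTrace(spans []Span) map[string][]Span {
	groups := make(map[string][]Span)
	for _, s := range spans {
		key := s.TraceID.String()
		groups[key] = append(groups[key], s)
	}
	return groups
}

func main() {
	a := TraceID{1}
	b := TraceID{2}
	groups := groupByTrace([]Span{{a, "root"}, {b, "other"}, {a, "child"}})
	for id, spans := range groups {
		fmt.Println(id, len(spans))
	}
}
```

The string key trades a small allocation per lookup for readability; whether that matters would need to be confirmed with the processor's own benchmarks.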
Based on your suggestion, the last commit changes the PR to use the internal pdata model instead of OTLP. The performance gain seems to be consistent and relevant: Before the last commit (f81445e), 3 benchmark runs:
After:
I think we should handle the graceful shutdown properly, see relevant comment thread.
40e0c8e to 74f8846
The overall idea and high-level implementation make sense to me.
However, I personally find the code hard to read, with 4 channels and 3 data structures manipulated in more than one place in different ways, but that might be just me.
I'd probably prefer something like an "event queue" with a few simple steps for each event type.
I would like one of the maintainers (Tigran or Bogdan) to do a review of this code as well.
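For reference, a rough sketch of what an "event queue" design could look like: a single goroutine drains one channel of typed events, and it is the only place that touches the shared state. The event types, the `processor` struct, and its fields are all hypothetical names chosen for this sketch; they do not reflect the code in this PR.

```go
package main

import "fmt"

// eventType enumerates the hypothetical event kinds handled by the queue.
type eventType int

const (
	traceReceived eventType = iota // spans arrived for a trace
	traceExpired                   // the wait period for a trace ended
)

// event is a single unit of work placed on the queue.
type event struct {
	typ     eventType
	traceID string
	spans   []string
}

// processor owns all state; only the run loop reads or writes it.
type processor struct {
	events   chan event
	buffer   map[string][]string // spans held per trace ID
	released [][]string          // traces handed to the next stage
}

func newProcessor() *processor {
	return &processor{
		events: make(chan event, 16),
		buffer: make(map[string][]string),
	}
}

// handle applies one simple step per event type.
func (p *processor) handle(e event) {
	switch e.typ {
	case traceReceived:
		p.buffer[e.traceID] = append(p.buffer[e.traceID], e.spans...)
	case traceExpired:
		if spans, ok := p.buffer[e.traceID]; ok {
			delete(p.buffer, e.traceID)
			p.released = append(p.released, spans)
		}
	}
}

// run drains the queue until it is closed, then signals completion.
func (p *processor) run(done chan<- struct{}) {
	for e := range p.events {
		p.handle(e)
	}
	close(done)
}

func main() {
	p := newProcessor()
	done := make(chan struct{})
	go p.run(done)

	p.events <- event{traceReceived, "t1", []string{"a"}}
	p.events <- event{traceReceived, "t1", []string{"b"}}
	p.events <- event{traceExpired, "t1", nil}
	close(p.events)
	<-done // after this, reading p.released is safe

	fmt.Println(p.released)
}
```

Because every mutation goes through `handle`, no locks are needed in this sketch, and closing the queue gives one obvious place to hook in shutdown behavior.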
Need to properly handle spans with different traces
a1e5229 to 6c2ac09
The last commit addresses the scenario where resource spans contain spans from multiple traces. Only two points remain open: shutdown behavior and the nil next consumer.
The code looks good to me now, but we need to check the remaining open questions with the maintainers of the repo.
This has been asked before by @jrcamp, and here's my answer:
The last commit also removed the nil check on the next consumer, as I agree that it doesn't really make sense for a processor to not have a next consumer. The only remaining point is about the shutdown then. @tigrannajaryan, @bogdandrutu, would you please help us on making a decision here? Basically:
@nilebox would it be OK if I create an issue to discuss the shutdown behavior and implement a fix in a follow-up PR?
@jpkrohling sure, I will approve this PR once you create an issue. It still requires a maintainer to look at it and merge, though :)
Created #1465
b557243 to a44919a
Closes open-telemetry#1309. Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
a44919a to 03995c6
I addressed @bogdandrutu's comment and squashed all the commits, since this was probably the last change in this PR.
@nilebox, @bogdandrutu is there anything extra to be done with this PR? If not, would you mind merging it?
Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
Description: This change adds a new 'groupbytrace' processor, as defined in #1309. This addresses the first item in the action plan.
Link to tracking Issue: #1309
Testing: extensive unit tests have been added
Documentation: sample configuration + readme added