
Shutdown behavior for groupbyprocessor #1465

Closed
jpkrohling opened this issue Jul 31, 2020 · 4 comments · Fixed by #1842
Assignees
Labels
bug (Something isn't working) · help wanted (Good issue for contributors to OpenTelemetry Service to pick up)

Comments

@jpkrohling
Member

While working on #1362, there was a discussion about what the processor's shutdown behavior should be with regard to in-flight traces:

what should happen with in-flight traces when the processor is shutting down? Should we just discard them? Should we flush them immediately, blocking until they are consumed, possibly honoring the context's deadline?

To unblock that PR, this issue was created so that the appropriate solution can be agreed on and implemented in a follow-up PR.

@jpkrohling jpkrohling added the bug (Something isn't working) label Jul 31, 2020
@tigrannajaryan tigrannajaryan added this to the GA 1.0 milestone Aug 12, 2020
@tigrannajaryan tigrannajaryan added the help wanted (Good issue for contributors to OpenTelemetry Service to pick up) label Sep 2, 2020
@jpkrohling
Member Author

This should also be assigned to me.

@jpkrohling
Member Author

cc @nilebox, as you reviewed the original PR.
cc @pjanotti, @bogdandrutu and @tigrannajaryan for your opinions.

The groupbyprocessor's premise is to release traces to the next processor only once a trace has been completed. It's unclear what the behavior during shutdown should be: should in-memory traces be discarded, as they are potentially incomplete? Or should they all be flushed, giving them a chance of being persisted, even if only partially?
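For illustration, here is a minimal Go sketch of that premise: spans are buffered by trace ID and a group is only released to the next consumer after a wait period. All names here (TraceID, Span, Consumer, groupByTrace, waitDuration) are hypothetical placeholders, not the collector's actual API.

```go
package groupbytrace

import (
	"sync"
	"time"
)

// Hypothetical placeholder types; not the collector's API.
type TraceID [16]byte

type Span struct {
	TraceID TraceID
	Name    string
}

type Consumer interface {
	ConsumeSpans(spans []Span)
}

type groupByTrace struct {
	mu           sync.Mutex
	buffer       map[TraceID][]Span
	waitDuration time.Duration
	next         Consumer
}

// onSpan buffers the span under its trace ID; the first span seen for a
// trace schedules the release of the whole group after waitDuration.
func (p *groupByTrace) onSpan(s Span) {
	p.mu.Lock()
	_, known := p.buffer[s.TraceID]
	p.buffer[s.TraceID] = append(p.buffer[s.TraceID], s)
	p.mu.Unlock()

	if !known {
		time.AfterFunc(p.waitDuration, func() { p.release(s.TraceID) })
	}
}

// release hands the accumulated group to the next consumer and forgets it.
func (p *groupByTrace) release(id TraceID) {
	p.mu.Lock()
	spans := p.buffer[id]
	delete(p.buffer, id)
	p.mu.Unlock()

	if len(spans) > 0 {
		p.next.ConsumeSpans(spans)
	}
}
```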

@tigrannajaryan
Member

tigrannajaryan commented Sep 3, 2020

I believe the desirable shutdown behavior is to stop receiving new data, drain all in-memory data from the pipeline, flush the exporters and exit the process. For groupbyprocessor this would mean flushing the accumulated (even if incomplete) traces to the next consumer. I do not think there is an expectation to wait for a trace to be complete before shutdown is complete.
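A hedged sketch of that behavior, reusing the hypothetical groupByTrace type from the earlier sketch: on shutdown, flush every accumulated group, even if incomplete, to the next consumer, and give up once the shutdown context's deadline is reached. This illustrates the idea only; it is not the processor's actual implementation.

```go
import "context"

// Shutdown flushes all buffered (possibly incomplete) traces to the next
// consumer. It honors the deadline carried by ctx: once ctx is done,
// whatever is still buffered is dropped and the context error is returned.
func (p *groupByTrace) Shutdown(ctx context.Context) error {
	p.mu.Lock()
	defer p.mu.Unlock()

	for id, spans := range p.buffer {
		select {
		case <-ctx.Done():
			return ctx.Err() // time limit reached; remaining traces are dropped
		default:
		}
		if len(spans) > 0 {
			p.next.ConsumeSpans(spans)
		}
		delete(p.buffer, id)
	}
	return nil
}
```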

@jpkrohling
Member Author

I think that's in line with @nilebox's opinion. I'll work on this, then.

jpkrohling added a commit to jpkrohling/opentelemetry-collector that referenced this issue Sep 25, 2020
* Drain the queue upon shutdown, with a time limit. Fixes open-telemetry#1465.
* Added metrics to the groupbyprocessor, making it easier to understand what's going on in case of problems. See open-telemetry#1811.
* Changes the in-memory storage to unlock its RLock when the method returns. Fixes open-telemetry#1811.

Link to tracking Issue: open-telemetry#1465 and open-telemetry#1811
Testing: unit + manual tests
Documentation: see README.md

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
tigrannajaryan pushed a commit that referenced this issue Oct 1, 2020
…ring shutdown. (#1842)

Fixed deadlock in groupbytrace processor.

* Drain the queue upon shutdown, with a time limit. Fixes #1465.
* Added metrics to the groupbyprocessor, making it easier to understand what's going on in case of problems. See #1811.
* Changes the in-memory storage to unlock its RLock when the method returns. Fixes #1811.

Link to tracking Issue: #1465 and #1811
Testing: unit + manual tests
Documentation: see README.md

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
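The commit's last item concerns the read lock held by the in-memory storage; the deadlock it mentions is the familiar pattern of an RLock that is not released on every return path before a writer tries to acquire the lock. Below is a minimal, hypothetical illustration of the fixed pattern; the storage type and method names are placeholders, not the actual code.

```go
import "sync"

// memoryStorage is a hypothetical stand-in for the processor's in-memory
// storage; the real type and methods differ.
type memoryStorage struct {
	mu     sync.RWMutex
	traces map[string][]byte
}

// get releases the read lock when the method returns, on every code path,
// so a later writer (e.g. a delete during shutdown) is not left waiting
// forever on a read lock that was never unlocked.
func (s *memoryStorage) get(key string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()

	v, ok := s.traces[key]
	return v, ok
}
```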