Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/groupbytrace] Leaked goroutine in doWithTimeout #32572

Closed
crobert-1 opened this issue Apr 19, 2024 · 8 comments
Closed

[processor/groupbytrace] Leaked goroutine in doWithTimeout #32572

crobert-1 opened this issue Apr 19, 2024 · 8 comments
Assignees
Labels
bug Something isn't working closed as inactive processor/groupbytrace Group By Trace processor Stale

Comments

@crobert-1
Copy link
Member

Component(s)

processor/groupbytrace

Describe the issue you're reporting

The method doWithTimeout is used to run a method with a timeout. However, if the method being called takes longer than the timeout, there's no way to cancel it if it's blocking or running longer than the collector itself. This results in a leaked goroutine.

Ideally there would be some way to shutdown down the called function. This would probably require some kind of enforced function format where the function could be shutdown by a context cancel or something along those lines. I don't know if this is possible or not in the processor.

goleak output:

=== RUN   TestDoWithTimeout_TimeoutTrigger
--- PASS: TestDoWithTimeout_TimeoutTrigger (0.02s)
PASS
goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 23 in state sleep, with time.Sleep on top of the stack:
time.Sleep(0x3b9aca00)
	/Users/crobert/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.9.darwin-amd64/src/runtime/time.go:195 +0x125
github.com/open-telemetry/opentelemetry-collector-contrib/processor/groupbytraceprocessor.TestDoWithTimeout_TimeoutTrigger.func1()
	/Users/crobert/dev/contrib/goleak/processor/groupbytraceprocessor/event_test.go:508 +0x18
github.com/open-telemetry/opentelemetry-collector-contrib/processor/groupbytraceprocessor.doWithTimeout.func1()
	/Users/crobert/dev/contrib/goleak/processor/groupbytraceprocessor/event.go:343 +0x22
created by github.com/open-telemetry/opentelemetry-collector-contrib/processor/groupbytraceprocessor.doWithTimeout in goroutine 22
	/Users/crobert/dev/contrib/goleak/processor/groupbytraceprocessor/event.go:342 +0x88
]
@crobert-1 crobert-1 added needs triage New item requiring triage bug Something isn't working labels Apr 19, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Copy link
Member Author

Unassigning myself, I found this while working on the related PR, but the PR does not resolve this issue and I'd prefer to hear from code owners to know the best path forward given the use cases of the method in question. 👍

@crobert-1 crobert-1 removed their assignment Apr 22, 2024
codeboten added a commit that referenced this issue Apr 26, 2024
Enable `goleak` checks on the `groupbytrace` processor to help ensure no
goroutines are being leaked. I filed
#32572
to address an existing leak that needs more discussion.

**Link to tracking Issue:** #32572
#30438

---------

Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>
@jpkrohling jpkrohling self-assigned this Apr 30, 2024
Copy link
Contributor

github-actions bot commented Jul 1, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jul 1, 2024
@crobert-1 crobert-1 removed the Stale label Jul 1, 2024
@jpkrohling
Copy link
Member

I think the main problem is that the callbacks are not receiving context with a timeout so the blocking call won't be canceled. I'd need to refactor this component in order to implement this, but this is low on my priority list. Do you want to give this a shot?

@crobert-1
Copy link
Member Author

I'm not sure I'd be able to get to it any time soon, but I appreciate the guidance here. Agreed on the suggested path forward. 👍

Copy link
Contributor

github-actions bot commented Sep 9, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Sep 9, 2024
@jpkrohling jpkrohling removed the Stale label Sep 9, 2024
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Nov 11, 2024
Copy link
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working closed as inactive processor/groupbytrace Group By Trace processor Stale
Projects
None yet
Development

No branches or pull requests

3 participants