Added timeout to the exporter path #1201
Conversation
This PR is to start a discussion about the right way to approach this problem. Should the timeout be configurable? If so, how?
Codecov Report
@@ Coverage Diff @@
## master #1201 +/- ##
==========================================
- Coverage 87.56% 87.52% -0.05%
==========================================
Files 203 203
Lines 14555 14568 +13
==========================================
+ Hits 12745 12750 +5
- Misses 1370 1376 +6
- Partials 440 442 +2
Continue to review full report at Codecov.
Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
Force-pushed from cfcc05c to 5c5a2f5.
@@ -71,7 +72,7 @@ func (s *protoGRPCSender) pushTraceData(
	batches, err := jaegertranslator.InternalTracesToJaegerProto(td)
	if err != nil {
-		return td.SpanCount(), consumererror.Permanent(err)
+		return td.SpanCount(), consumererror.Permanent(errors.Wrap(err, "failed to push trace data via Jaeger exporter"))
IIRC fmt.Errorf is now more commonly used within the codebase.
Looks like people prefer errors.Wrap; we should converge on that :)
I had fmt.Errorf but noticed that the codebase seems to use errors.Wrap. I don't have a preference, to be honest :)
I think we should use fmt.Errorf with %w, as we've moved to Go 1.14 and the new errors package is available now. We can actually get rid of the pkg/errors package.
The nice thing about errors.Wrap is that it's type safe and you don't need to add ": %w" to every single fmt.Errorf call wrapping another error.
See example https://play.golang.org/p/t9GxkVpiIhA
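For context, a minimal, self-contained sketch (not the linked playground snippet) contrasting the two wrapping styles; the sentinel error and messages here are made up for illustration:

// Sketch: pkg/errors.Wrap vs. fmt.Errorf with %w.
package main

import (
	"errors"
	"fmt"

	pkgerrors "github.com/pkg/errors"
)

// errPush is a made-up sentinel error standing in for a transport failure.
var errPush = errors.New("connection refused")

func main() {
	// pkg/errors: wrapping is explicit in the function name.
	wrapped := pkgerrors.Wrap(errPush, "failed to push trace data via Jaeger exporter")

	// stdlib (Go 1.13+): wrapping happens only when the format string uses %w.
	stdWrapped := fmt.Errorf("failed to push trace data via Jaeger exporter: %w", errPush)

	// Both wrapped errors can be inspected with the stdlib errors.Is
	// (pkg/errors implements Unwrap since v0.9.0).
	fmt.Println(errors.Is(wrapped, errPush))    // true
	fmt.Println(errors.Is(stdWrapped, errPush)) // true
}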
LGTM except for the use of timeout in fanoutconnector. I think we should probably have this logic in queued_retry or in every exporter.
@@ -25,6 +26,8 @@ import (
	"go.opentelemetry.io/collector/internal/data"
)

+const timeout = 5 * time.Second
The fanoutconnector is used in multiple places (between different components), and we should not have the timeout logic here.
Adding it to every exporter might still cause misbehaving exporters to block the pipeline. Adding it to the queued retry could be a solution, but this processor isn't applied by default, right? I think the collector needs protection against deadlocks/misbehaving processors/exporters. Ideally, as someone managing a collector instance, I would probably prefer to set a deadline for the entire pipeline (or processors, or exporters), as opposed to having a timeout per exporter.
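To illustrate the kind of per-call deadline being debated here, a minimal sketch; the pushFunc type and withTimeout helper are hypothetical, not the collector's actual exporter API:

// Sketch: bounding a single exporter push with context.WithTimeout so a
// stuck backend cannot block the pipeline indefinitely.
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pushFunc stands in for an exporter's push function (hypothetical).
type pushFunc func(ctx context.Context) error

// withTimeout wraps a push so each call gets its own deadline.
func withTimeout(d time.Duration, push pushFunc) pushFunc {
	return func(ctx context.Context) error {
		ctx, cancel := context.WithTimeout(ctx, d)
		defer cancel()
		return push(ctx)
	}
}

func main() {
	slowPush := func(ctx context.Context) error {
		select {
		case <-time.After(10 * time.Second): // simulate a stuck backend
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}

	bounded := withTimeout(5*time.Second, slowPush)
	err := bounded(context.Background())
	fmt.Println(errors.Is(err, context.DeadlineExceeded)) // true after ~5s
}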
@jpkrohling let's try to make progress on this:
What do you think?
Done: #1259
@bogdandrutu, any ideas on how to move forward with the deadline? |
@jpkrohling if I add it to the queued retry, is that good enough? Also, currently this is not a deadlock, just a stuck operation :)
Are all users instructed to use the queued retry? If so, sure. The main idea is to cancel the execution of a pipeline component taking an unreasonable amount of time to complete.
Right, I meant cases in the future where we might get affected by new bugs ;-)
@jpkrohling in #1386 I added a timeout (5s) for all the exporters that use the new interface (the only one that does not is zipkin).
Closing in favor of #1386.
Signed-off-by: Juraci Paixão Kröhling juraci@kroehling.de
Description: Warn instead, in line with the queued processor
Link to tracking Issue: Resolves #1193
Testing: manual tests, as per #1193
Documentation: pending