Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(opentelemetry-sampler-aws-xray): add x-ray remote sampler #1443
feat(opentelemetry-sampler-aws-xray): add x-ray remote sampler #1443
Changes from 3 commits
ff23c5c
e838b2c
ceee8cc
aa01262
e5b313b
d63ac79
2f43339
0cc5d27
25ac535
fada495
3e3388f
3a61f4c
0dc2a36
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a note to README saying this component is still in development. You can remove this note when the component is fully implemented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This timer will live forever and prevent the application from stopping since it always has something to do.
You need to do either of the following:
clearInterval
use the clearInterval function to stop the interval when application goes down. I guess it will require adding
shutdown()
method to the sampler and for users to notify the object. Not sure about browser compatibilityunref
You can use the unref timer function to make sure the timer will never hold the event loop if nothing else needs it.
Not sure if you can
unref
an interval (or only timeout). You can switch to timeout implementation as well instead of interval if this is the case.Anyway, no matter what you choose, please run a simple test to make sure application goes down nicely when sampler is in use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The x-ray remote sampler requires that the
/GetSamplingRules
endpoint is called every 5 minutes (or user-defined interval), and it is an essential component to ensuring the sampling behaves correctly. This behavior is implemented in a separate worker thread in our Go and DotNet implementations, but since JS is a single-threaded language it is preferred to use async calls through intervals or timeouts.Our customer experience is centered around creating sampling rules in the X-Ray console, which will be fetched periodically using the
/GetSamplingRules
endpoint. If we want to stop the sampler, then the rules would be removed from the X-Ray console and the Sampler would be removed from theTracerProvider
.Please correct me if I misunderstood this part, but I believe that the snippet below from our documentation on initializing components to send traces to X-Ray would be used to make sure the application exits nicely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I understand the need and support it, but I am just pointing out a problem with the current implementation that is not acceptable for end users.
The node application can terminate in a few ways:
Signal
In this case, I think the application will go down neither the less, and
sdk.shutdown
is more about exporting unsent spans from batch processors before terminating the application. please note that can still be nice to release acquired resources like timers but I don't think it's essential.Please also not that
sdk.shutdown()
invokestracerProvider.shutdown()
which only callsshutdown
on span processors. So the shutdown is important, but it will not trigger any cleanups on samplers.No more events on event loop
As a simple toy example, consider this simple js application which is not doing too much:
If we run this with node it prints and terminates immediately:
But if we had a
setInterval
in this code, it would now run forever:When you register a
setInterval
in user code, you are keeping the event loop busy forever which will alter the behavior of the some applications.If you instead
unref
this timer, it would signal node that this timer should not hold the event loop alive if nothing else needs it:This practice is used in BatchSpanProcessorBase because of the above reasons, and in contrib repo here and here.
Basically, if you are a library that wishes to be transparent to the end-user and not have any side effects, you should
unref
every timer you set up.The code above just prints to the console as an example, but any "script" that simply executes a js file with sequential commands until it reaches the end will now never complete, and the terminal will hang forever fetching the config endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can create a trace for this internal otel HTTP invocation. I guess most users don't care to record and pay for these spans, as they are not applicative spans. Similar to how we avoid recording exporter http calls.
You can suppress instrumentation for this call with the suppressTracing function. This repo has many examples on how to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it valid that
responseJson
is nullish (e.g. there is no data in response). Should we log something to diag in this case?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe if response is null then
response.data
will throw an error, which will lead to this warning being logged in the catch phrase: