Sampling #20
* hardcoded sampling rate of 50%
* sampling decision should only be performed at root or honour parent span decision
* allow more than probabilistic strategy for sampling
* pretty sure i've not understood the model, code tidying to come
* change sample_rate to double
* use bernoulli distribution to decide whether to sample
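For readers following the commit notes, a Bernoulli-based sampler could look roughly like the sketch below. This is not the code from the PR: the `Sampler` interface and the `ShouldSample` method name are assumptions made for illustration, and the sketch expects a rate already within [0, 1].

```cpp
#include <random>

// Hypothetical interface; the PR's actual Sampler class is not shown in full.
class Sampler {
 public:
  virtual ~Sampler() = default;
  virtual bool ShouldSample() = 0;
};

// Samples each decision independently with probability sample_rate.
// Assumes 0 <= sample_rate <= 1, as std::bernoulli_distribution requires.
// Not thread-safe as written: the engine would need a lock if shared.
class ProbabilisticSampler : public Sampler {
 public:
  explicit ProbabilisticSampler(double sample_rate)
      : distribution_{sample_rate} {}

  bool ShouldSample() override { return distribution_(engine_); }

 private:
  std::mt19937 engine_{std::random_device{}()};
  std::bernoulli_distribution distribution_;
};
```

A rate of 1.0 samples every time and a rate of 0.0 never samples; anything in between is a per-decision coin flip.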
* inspired by the jaeger module we check whether any of the references are ChildOf to identify a parent span.
* propagation works but seems to be non-deterministic when incoming context is incomplete (i.e. just sampling header but no trace id)
* currently hardcoded to prob sample with fixed rate but will be replaced next
* parent context may be non-null but empty: it is not sufficient to just check whether there's a parent span, we must also check whether it's valid
* probabilistic sampler bounds p(sampled) between 0 and 1
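The decision rule described in these notes (honour a valid parent's decision, otherwise decide at the root, with the rate bounded to [0, 1]) can be sketched as follows. `ParentContext`, `ClampSampleRate` and `DecideSampling` are hypothetical names used only for illustration and do not come from the PR.

```cpp
#include <algorithm>
#include <functional>

// Hypothetical stand-in for an incoming span context.
struct ParentContext {
  bool valid = false;    // false when the context is non-null but empty
  bool sampled = false;  // sampling decision carried by the context
};

// Bounds a configured rate so that p(sampled) stays within [0, 1].
inline double ClampSampleRate(double rate) {
  return std::min(1.0, std::max(0.0, rate));
}

// Honours the parent's decision when the parent is present and valid;
// otherwise treats the span as a root and asks the sampler.
inline bool DecideSampling(const ParentContext* parent,
                           const std::function<bool()>& root_sampler) {
  if (parent != nullptr && parent->valid) {
    return parent->sampled;
  }
  return root_sampler();
}
```

Checking `valid` as well as the pointer covers the case above where a parent context exists but carries no usable trace information.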
@@ -566,6 +569,7 @@ class Span : public ZipkinBase {
   std::string name_;
   uint64_t id_;
   Optional<TraceId> parent_id_;
+  bool sampled_;
Could you default initialize sampled_ to true
Definitely. Just to check, do you mean via the default constructor of Span, or in the definition of sampled_?
in-class initialization is preferred for something like that: https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c48-prefer-in-class-initializers-to-member-initializers-in-constructors-for-constant-initializers
A lot of the code was copied in from envoy, so not all of it is following some of the conventions I'd use.
Thanks for the link- super helpful for me. I'll make the initialization change now.
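As a rough illustration of the in-class initializer suggested above, here is a simplified, hypothetical `Span` (not the class from this repository; the accessor names are assumptions). Every constructor picks up the default, so no path leaves `sampled_` uninitialized.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Simplified, hypothetical Span used only to show in-class initialization.
class Span {
 public:
  Span() = default;
  explicit Span(std::string name) : name_{std::move(name)} {}

  bool isSampled() const { return sampled_; }
  void setSampled(bool sampled) { sampled_ = sampled; }

 private:
  std::string name_;
  std::uint64_t id_ = 0;
  bool sampled_ = true;  // default applies to every constructor
};
```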
@@ -271,17 +282,27 @@ class OtSpan : public ot::Span {
 class OtTracer : public ot::Tracer,
                  public std::enable_shared_from_this<OtTracer> {
  public:
-  explicit OtTracer(TracerPtr &&tracer) : tracer_{std::move(tracer)} {}
+  explicit OtTracer(TracerPtr &&tracer) : tracer_{std::move(tracer)}, sampler_{std::move(new ProbabilisticSampler(1.0))} {}
std::move(new ProbabilisticSampler(1.0)) can be changed to new ProbabilisticSampler(1.0)
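The reason the `std::move` is redundant: a `new` expression already yields a temporary pointer value, so there is nothing to move from. A minimal sketch, assuming `sampler_` is a `std::unique_ptr` (the classes below are simplified, hypothetical stand-ins for the ones in the diff):

```cpp
#include <memory>

// Simplified stand-in for the sampler in the diff.
struct ProbabilisticSampler {
  explicit ProbabilisticSampler(double sample_rate) : rate{sample_rate} {}
  double rate;
};

// Simplified stand-in for OtTracer; only the sampler member is shown.
class OtTracer {
 public:
  // The new expression is already an rvalue; no std::move is needed.
  OtTracer() : sampler_{new ProbabilisticSampler(1.0)} {}

  double sample_rate() const { return sampler_->rate; }

 private:
  std::unique_ptr<ProbabilisticSampler> sampler_;
};
```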
Could you run |
The tsan failure means you might be missing some locking somewhere. You can run the test locally and look at the log produced by cmake. Here's how it's run: Or maybe set up circleci to upload the log as an artifact like I was doing here with Lightstep: |
Cool, I'll do some more debugging. I can't replicate it locally (executing the tsan target doesn't yield anything), but that could be a quirk of my machine (MBP) and compiler, I guess. I'll try to dig into it tomorrow. Thanks for your patience @rnburn.
- normally this looks like it would be configured with the circleci config but this should work without repo write access
Finally got the output, 1 test fails: https://369-95814681-gh.circle-artifacts.com/0/Test.log
So it appears to be something related to propagation.
@rnburn not sure how to debug this- if I run the
But I don't see that error when I compile and run the tests on my main development MBP |
It might mean you're reading from uninitialized memory somewhere -- that could explain the unpredictable behavior. You could try adding some print statements to trace what's going on. Or run in the debugger and set breakpoints at key locations.
I think i've reduced it to a problem with how the |
I think I reduced the uninitialized memory to the flags in the span context (when constructed from a span). At this point I'm reasonably confident the work in this PR for the underlying lib has the necessary implementation for probabilistic sampling. I'm still trying to debug the issue when it's consumed by the nginx module- it seems that the |
-  return std::shared_ptr<ot::Tracer>{new OtTracer{std::move(tracer)}};
+  SamplerPtr sampler{new ProbabilisticSampler{options.sample_rate}};
+  return std::shared_ptr<ot::Tracer>{
+      new OtTracer{std::move(tracer), std::move(sampler)}};
Can you use std::make_shared here? https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rh-make_shared
Nice, thanks, will do.
Can/should SamplerPtr and TracerPtr be replaced with std::make_unique also? https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#c150-use-make_unique-to-construct-objects-owned-by-unique_ptrs
Thanks again for patiently sending me to the C++ idioms :)
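A sketch of what the two suggestions might look like together, using simplified, hypothetical stand-ins for `Sampler`, `OtTracer` and a `MakeTracer` factory (none of these match the repository's real signatures). Note that `std::make_unique` needs C++14, which is an assumption about the project's build settings.

```cpp
#include <memory>
#include <utility>

// Simplified stand-ins; the real classes carry much more state.
struct Sampler {
  virtual ~Sampler() = default;
  virtual double rate() const = 0;
};

struct ProbabilisticSampler : Sampler {
  explicit ProbabilisticSampler(double sample_rate) : rate_{sample_rate} {}
  double rate() const override { return rate_; }
  double rate_;
};

using SamplerPtr = std::unique_ptr<Sampler>;

class OtTracer {
 public:
  explicit OtTracer(SamplerPtr&& sampler) : sampler_{std::move(sampler)} {}
  double sample_rate() const { return sampler_->rate(); }

 private:
  SamplerPtr sampler_;
};

// Hypothetical factory: no naked new, and make_shared allocates the
// object and its control block in a single allocation.
inline std::shared_ptr<OtTracer> MakeTracer(double sample_rate) {
  SamplerPtr sampler = std::make_unique<ProbabilisticSampler>(sample_rate);
  return std::make_shared<OtTracer>(std::move(sampler));
}
```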
zipkin_opentracing/src/sampling.h
class ProbabilisticSampler : public Sampler {
 public:
  ProbabilisticSampler(double sample_rate)
Please use explicit here: https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rc-explicit
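To illustrate what `explicit` buys here, the sketch below uses a simplified, hypothetical `ProbabilisticSampler` (no base class, and the accessor is an assumption). Without `explicit`, a bare `double` could silently convert into a sampler wherever one is expected.

```cpp
#include <type_traits>

// Simplified stand-in; the real class derives from Sampler.
class ProbabilisticSampler {
 public:
  explicit ProbabilisticSampler(double sample_rate)
      : sample_rate_{sample_rate} {}

  double sample_rate() const { return sample_rate_; }

 private:
  double sample_rate_;
};

// Direct construction from a double still works...
static_assert(std::is_constructible<ProbabilisticSampler, double>::value,
              "direct construction from double is allowed");
// ...but implicit conversion from a double is rejected.
static_assert(!std::is_convertible<double, ProbabilisticSampler>::value,
              "implicit conversion from double is blocked");
```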
@pingles - is this ready to merge? it's looking pretty good.
:-) thanks!
I’d say yes, but the only thing stopping me is figuring out why the behaviour isn’t correct when using this lib within the nginx module. Maybe I’ll see if I can do some more on it over the weekend/next week. I’m 99% certain it isn’t anything in this PR/lib, but I don’t want to rule out having missed something.
@rnburn pretty sure this is something inside the way I'm using the lib within the nginx module. am in the middle of poking around with lldb to debug and noticed this:
|
I changed all the nginx config so it didn't daemonize or fork a worker process to make it easier to debug the initialisation. Looks like
|
Ok figured it out- the nginx dynamic load was picking up the wrong object, but the headers included the Pointing it at the right location causes it to initialise correctly:
At this point I'd say I'm happy for the above to be merged, as long as you're happy there are no other style/correctness tweaks you'd like @rnburn :) I'll create a separate PR for the nginx module that consumes these changes.
The work in this PR adds a basic probabilistic sampler. Sampling decisions are propagated between contexts and only sampled spans are reported.
ZipkinOtTracerOptions defaults sample_rate to 1.0 (always sample), but it can be set to any value between 0 and 1.