This repository has been archived by the owner on Aug 30, 2022. It is now read-only.

Fix overflow bug. #13

Merged
merged 3 commits into jaegertracing:master on Nov 22, 2017

Conversation

@rnburn (Contributor) commented Nov 17, 2017

This is a way of fixing #6 that keeps the same sampling logic. If long double is an 80-bit floating-point type (which it is on most platforms), the sampling boundary is computed accurately enough that overflow isn't a problem; the change also adds a conditional to protect against overflow in case long double is just a standard 64-bit floating-point type.
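
A minimal sketch (not part of the PR) of the precision gap being described, assuming kMaxRandomNumber is the maximum uint64_t value:

#include <cstdint>
#include <cstdio>
#include <limits>

int main()
{
    const uint64_t kMax = std::numeric_limits<uint64_t>::max();  // 2^64 - 1
    // A 64-bit double has a 53-bit mantissa, so 2^64 - 1 rounds up to 2^64,
    // and casting that value back to uint64_t overflows (undefined behavior).
    printf("as double:      %.0f\n", static_cast<double>(kMax));
    // An 80-bit long double has a 64-bit mantissa, so the value stays exact
    // and samplingRate * max cannot exceed the true maximum for rates <= 1.0.
    printf("as long double: %.0Lf\n", static_cast<long double>(kMax));
    return 0;
}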

Signed-off-by: Ryan Burn <ryan.burn@gmail.com>

static uint64_t computeSamplingBoundary(long double samplingRate)
{
    const long double maxRandNumber = kMaxRandomNumber;
    const long double boundary = samplingRate * maxRandNumber;
    // Overflow guard: if the product equals maxRandNumber, return the integer
    // maximum directly rather than casting (the == vs >= exchange below).
    return boundary == maxRandNumber ? kMaxRandomNumber : static_cast<uint64_t>(boundary);
}
Contributor

Right. As you have pointed out, long double is not a great solution. There are only two fixes I would really consider. The first is to use a library that handles this sort of thing, such as Google's double-conversion; I'd hate to add another dependency just for this operation. The second is to reconsider how we sample traces and use a fresh random number instead of the trace ID. The only issue there is that the current approach is repeatable across all clients: the same trace ID is sampled or not sampled no matter which language is in use. Using a fresh random number means we cannot guarantee that a trace will be sampled across the board. Not sure how important this is, but something to consider.
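
An illustrative sketch of the trace-ID-based scheme described above (the names are hypothetical, not from the Jaeger codebase): because the decision is a pure function of the trace ID, every client that sees the same trace makes the same call.

#include <cstdint>

// Deterministic: compares the trace ID against a precomputed boundary,
// so the same trace ID yields the same decision in every client.
bool isSampled(uint64_t traceID, uint64_t samplingBoundary)
{
    return traceID < samplingBoundary;
}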

Member

Clarification: the trace ID is used for sampling in the Go client merely as a performance optimization; there is no functional dependency on doing it that way. In the Node.js client we do use a fresh random number for the sampling decision, since Node cannot do uint64 arithmetic.

Contributor

Interesting. Makes sense. That is okay with me.

Contributor Author

@isaachier This does not require long double to be 80-bit to work.

All you need to handle the overflow is a conditional like the one in the snippet above.

long double will give you a better answer if extended precision is available, but the code still works if long double == double.
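
As a hedged aside (not from the PR): whether long double actually has extended precision can be checked with std::numeric_limits.

#include <cstdio>
#include <limits>

int main()
{
    // 64 mantissa bits => x87 80-bit extended precision;
    // 53 => long double is just a 64-bit IEEE double (e.g. MSVC).
    printf("long double mantissa bits: %d\n",
           std::numeric_limits<long double>::digits);
    return 0;
}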

Contributor

Shouldn't it be >=? Also, you wrote yourself that using a standard random number generator would make more sense here. Do you no longer feel that way?

Contributor Author

Since samplingRate <= 1.0, samplingRate * maxRandNumber is always less than or equal to maxRandNumber (floating-point rounding is monotonic, so the rounded product can never exceed maxRandNumber), so >= isn't necessary.

Using std::uniform_real_distribution would be straightforward, but I don't see anything wrong with putting in a small fix to immediately address the overflow issue.
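
A sketch of the std::uniform_real_distribution alternative mentioned here; the function name and RNG wiring are hypothetical, not the PR's code.

#include <random>

// Decide sampling with a fresh uniform random number in [0, 1) instead of
// the trace ID; no overflow-prone uint64 arithmetic is involved.
bool shouldSample(std::mt19937_64& rng, double samplingRate)
{
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    return dist(rng) < samplingRate;
}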

Contributor

OK, got it. Overall, I hate dealing with overflow, but switching to the random number generator would involve changing a number of unit tests as well. So I am happy to accept this; but if you can, please merge from master (again) and push so we can see coverage info.

Contributor Author

Merged master.

codecov bot commented Nov 22, 2017

Codecov Report

Merging #13 into master will decrease coverage by <.01%.
The diff coverage is 85.71%.


@@            Coverage Diff             @@
##           master      #13      +/-   ##
==========================================
- Coverage   85.44%   85.43%   -0.01%     
==========================================
  Files          93       93              
  Lines        2219     2225       +6     
  Branches      202      203       +1     
==========================================
+ Hits         1896     1901       +5     
  Misses        252      252              
- Partials       71       72       +1
Impacted Files                                      Coverage Δ
src/jaegertracing/samplers/ProbabilisticSampler.h   87.5% <85.71%> (-2.5%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 60de3d3...191780f.

@isaachier (Contributor)
In my experience it is very hard to write a test case for overflow, so I am just going to accept this despite the coverage decrease. Thanks for the update.

@isaachier isaachier merged commit a1bea40 into jaegertracing:master Nov 22, 2017
rnburn referenced this pull request in pingles/zipkin-cpp-opentracing on May 16, 2018
* hardcoded sampling rate of 50%
* sampling decision should only be performed at root
  or honour parent span decision
* allow more than probabilistic strategy for sampling
* pretty sure i've not understood the model, code tidying to come