fix: make histogramQuantile handle case of zero samples #5419
Co-authored-by: Gavin Cabbage <gavincabbage@users.noreply.github.com>
	return true
}

func (t *histogramQuantileTransformation) computeQuantile(cdf []bucket) (quantileResult, error) {
The issue here was that the Flux stdlib function histogramQuantile returns a wrong answer for some input data.
This function (computeQuantile) accepts a cumulative distribution function (a cumulative histogram produced from the input table data) and produces the requested quantile.
When the cdf contains all zeroes, this function would return the bound of the last histogram bucket, which is incorrect. The right thing to do in that case is to return a null value, since we can't compute a quantile if we didn't actually receive any observations.
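To make the zero-sample case concrete, here is a minimal standalone sketch in Go. It is not the code from this PR: the `bucket` fields, the `quantile` function, the starting lower bound of 0, and the linear interpolation are all assumptions made for illustration. It only shows the idea that a cumulative histogram whose last bucket count is zero has no quantile to report.

```go
package main

import "math"

// bucket is a hypothetical stand-in for one entry of a cumulative histogram:
// count is the number of observations less than or equal to upperBound.
type bucket struct {
	count      float64
	upperBound float64
}

// quantile estimates the q-quantile (0 <= q <= 1) from a cumulative histogram.
// ok is false when there are no observations, in which case the caller should
// emit a null value rather than a bucket bound.
func quantile(cdf []bucket, q float64) (value float64, ok bool) {
	if len(cdf) == 0 {
		return 0, false
	}
	total := cdf[len(cdf)-1].count
	if total == 0 {
		// All buckets are zero: there is nothing to compute a quantile from.
		return 0, false
	}

	rank := q * total
	prevCount, prevBound := 0.0, 0.0 // assumes the histogram starts at 0
	for _, b := range cdf {
		if b.count >= rank {
			width := b.count - prevCount
			if width == 0 || math.IsInf(b.upperBound, 1) {
				// Cannot interpolate in an empty or unbounded bucket.
				return prevBound, true
			}
			// Interpolate linearly inside the bucket that contains the rank.
			frac := (rank - prevCount) / width
			return prevBound + frac*(b.upperBound-prevBound), true
		}
		prevCount, prevBound = b.count, b.upperBound
	}
	return prevBound, true
}
```

With buckets `{2, 1}, {6, 2}, {10, 3}` and `q = 0.5`, the rank is 5, which falls in the second bucket, so the sketch interpolates between bounds 1 and 2. With every count set to zero it reports `ok == false` instead of returning the last bound.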
		// "force" is not possible because isMonotonic will fix the buckets
		return quantileResult{}, errors.Newf(codes.Internal, "unknown or unexpected value for onNonmonotonic: %q", t.spec.OnNonmonotonic)
	}
}
Sometimes the histogram buckets are not monotonic (which they should be if they are cumulative) due to late-arriving data on the edge. The OnNonmonotonic parameter describes what to do in this case.
Checking for monotonicity first (and fixing it if needed and requested by the user) avoids a bug in which the total observation count was read from the last bucket before that bucket was "fixed" when forcing monotonicity.
This is not really related to the issue the user found, but I saw it here and fixed it. The test case histogramQuantileOnNonmonotonicForceLastBucket below verifies this fix.
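The ordering problem can be illustrated with a small Go sketch. The names `forceMonotonic` and `totalCount` and the `bucket` layout are hypothetical, and the sketch assumes that "force" means raising a bucket's count to its predecessor's count whenever it is lower; the actual isMonotonic implementation in the PR may differ.

```go
package main

// bucket is a hypothetical cumulative-histogram entry.
type bucket struct {
	count      float64
	upperBound float64
}

// forceMonotonic returns a copy of cdf in which any count lower than its
// predecessor is raised to the predecessor's count (assumed "force" behavior).
func forceMonotonic(cdf []bucket) []bucket {
	out := make([]bucket, len(cdf))
	copy(out, cdf)
	for i := 1; i < len(out); i++ {
		if out[i].count < out[i-1].count {
			out[i].count = out[i-1].count
		}
	}
	return out
}

// totalCount reads the total number of observations from the last bucket.
func totalCount(cdf []bucket) float64 {
	if len(cdf) == 0 {
		return 0
	}
	return cdf[len(cdf)-1].count
}
```

For counts 5, 8, 6 the last bucket is too low. Reading the total before the fix gives 6, while reading it after `forceMonotonic` gives 8, so a rank computed from the stale total would not match the buckets that are then searched.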
if totalCount == 0 {
	// Produce a null value if there were no samples
	return quantileResult{action: appendNil}, nil
}
Here is where we bail and produce a null value for the case of zero observations.
Closes #5415
When there are no observations/samples in a histogram (all zeros for each bucket), produce a null value.
Checklist
Dear Author 👋, the following checks should be completed (or explicitly dismissed) before merging.
experimental/
docs/Spec.md has been updated
Dear Reviewer(s) 👋, you are responsible (among others) for ensuring the completeness and quality of the above before approval.