Adjust DateHistogram's bucket accounting to be iteratively #101012
…during reduce instead of accounting for all buckets at the end of the reduce. With many non-empty buckets, accounting for the number of buckets only at the end of the reduce may be too late: Elasticsearch may already have failed with an OOME. This change makes the accounting happen iteratively during the reduce for non-empty buckets. Note that for empty buckets, the accounting already happens iteratively.
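To illustrate the idea described above, here is a minimal sketch of batched accounting during a reduce loop. It is not the Elasticsearch implementation: `BucketBudget`, `IterativeBucketAccounting`, `reduceIteratively`, and the batch size `REPORT_EVERY` are hypothetical names chosen for this example, and the budget check throws a plain `IllegalStateException` as a stand-in for whatever the real limit check does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the bucket limit held by a reduce context.
final class BucketBudget {
    private final int maxBuckets;
    private int consumed;

    BucketBudget(int maxBuckets) {
        this.maxBuckets = maxBuckets;
    }

    // Adds to the running total and fails as soon as the limit is exceeded.
    void consumeBucketsAndMaybeBreak(int count) {
        consumed += count;
        if (consumed > maxBuckets) {
            throw new IllegalStateException("too many buckets: " + consumed + " > " + maxBuckets);
        }
    }

    int consumed() {
        return consumed;
    }
}

final class IterativeBucketAccounting {
    // Assumed batch size; reporting in batches keeps the per-bucket overhead low.
    static final int REPORT_EVERY = 1024;

    // Keeps buckets whose doc count reaches minDocCount and reports them to the
    // budget while the result list is still growing, not once at the end.
    static List<Long> reduceIteratively(long[] docCounts, long minDocCount, BucketBudget budget) {
        List<Long> reduced = new ArrayList<>();
        int pending = 0;
        for (long docCount : docCounts) {
            if (docCount >= minDocCount) {
                reduced.add(docCount);
                if (++pending >= REPORT_EVERY) {
                    budget.consumeBucketsAndMaybeBreak(pending);
                    pending = 0;
                }
            }
        }
        // Report whatever is left over from the last incomplete batch.
        budget.consumeBucketsAndMaybeBreak(pending);
        return reduced;
    }

    public static void main(String[] args) {
        long[] docCounts = new long[5000];
        java.util.Arrays.fill(docCounts, 1L);
        BucketBudget budget = new BucketBudget(10_000);
        System.out.println(reduceIteratively(docCounts, 1, budget).size());
    }
}
```

With a limit below the number of surviving buckets, the sketch fails after a bounded number of buckets have been materialized, which is the point of checking during the loop and not after it.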
Hi @martijnvg, I've created a changelog YAML for you.
Pinging @elastic/es-analytics-geo (Team:Analytics)
I am not sure whether we should call #consumeBucketsAndMaybeBreak only for the final reduce, or whether we can rely on it being a no-op for partial reduces. Otherwise it is a good change. I would like to hear your thoughts, and then I will approve it.
@@ -323,6 +324,10 @@ protected boolean lessThan(IteratorAndCurrent<Bucket> a, IteratorAndCurrent<Buck
    // the key changes, reduce what we already buffered and reset the buffer for current buckets
    final Bucket reduced = reduceBucket(currentBuckets, reduceContext);
    if (reduced.getDocCount() >= minDocCount || reduceContext.isFinalReduce() == false) {
        if (consumeBucketCount++ >= REPORT_EMPTY_EVERY) {
            reduceContext.consumeBucketsAndMaybeBreak(consumeBucketCount);
I think we should only do this if it is the final reduce?
@@ -344,10 +349,14 @@ protected boolean lessThan(IteratorAndCurrent<Bucket> a, IteratorAndCurrent<Buck
    final Bucket reduced = reduceBucket(currentBuckets, reduceContext);
    if (reduced.getDocCount() >= minDocCount || reduceContext.isFinalReduce() == false) {
        reducedBuckets.add(reduced);
        if (consumeBucketCount++ >= REPORT_EMPTY_EVERY) {
            reduceContext.consumeBucketsAndMaybeBreak(consumeBucketCount);
I think we should only do this if it is the final reduce?
I looked in other places, and they rely on the operation being a no-op for partial reduces.
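A minimal sketch of the pattern discussed in this thread, under the assumption stated by the reviewers that the accounting call does nothing during a partial reduce. The types `ReduceContextSketch`, `PartialReduceSketch`, `FinalReduceSketch`, and `ReduceCaller` are hypothetical and exist only for this illustration; they are not the Elasticsearch classes.

```java
// Hypothetical shape of a reduce context, for illustration only.
interface ReduceContextSketch {
    boolean isFinalReduce();

    void consumeBucketsAndMaybeBreak(int count);
}

// Assumption: during a partial reduce the accounting call is a no-op.
final class PartialReduceSketch implements ReduceContextSketch {
    @Override
    public boolean isFinalReduce() {
        return false;
    }

    @Override
    public void consumeBucketsAndMaybeBreak(int count) {
        // intentionally empty
    }
}

// During the final reduce the call counts buckets and enforces a limit.
final class FinalReduceSketch implements ReduceContextSketch {
    private final int maxBuckets;
    private int consumed;

    FinalReduceSketch(int maxBuckets) {
        this.maxBuckets = maxBuckets;
    }

    @Override
    public boolean isFinalReduce() {
        return true;
    }

    @Override
    public void consumeBucketsAndMaybeBreak(int count) {
        consumed += count;
        if (consumed > maxBuckets) {
            throw new IllegalStateException("too many buckets: " + consumed);
        }
    }

    int consumed() {
        return consumed;
    }
}

final class ReduceCaller {
    // The caller does not guard on isFinalReduce(); it relies on the
    // partial-reduce implementation ignoring the call.
    static void account(ReduceContextSketch context, int buckets) {
        context.consumeBucketsAndMaybeBreak(buckets);
    }
}
```

If that assumption holds, the calling code stays the same in both phases and an explicit final-reduce check at each call site is unnecessary.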
Right in case of
@elasticmachine run elasticsearch-ci/part-3
💚 Backport successful
…01012) Adjust DateHistogram's consumeBucketsAndMaybeBreak to be called iteratively during reduce instead of accounting for all buckets at the end of the reduce. With many non-empty buckets, accounting for the number of buckets only at the end of the reduce may be too late: Elasticsearch may already have failed with an OOME. This change makes the accounting happen iteratively during the reduce for non-empty buckets. Note that for empty buckets, the accounting already happens iteratively.
…101056) Adjust DateHistogram's consumeBucketsAndMaybeBreak to be called iteratively during reduce instead of accounting for all buckets at the end of the reduce. With many non-empty buckets, accounting for the number of buckets only at the end of the reduce may be too late: Elasticsearch may already have failed with an OOME. This change makes the accounting happen iteratively during the reduce for non-empty buckets. Note that for empty buckets, the accounting already happens iteratively.
Adjust Histogram's consumeBucketsAndMaybeBreak to be called iteratively during reduce instead of accounting for all buckets at the end of the reduce. With many non-empty buckets, accounting for the number of buckets only at the end of the reduce may be too late: Elasticsearch may already have failed with an OOME. This change makes the accounting happen iteratively during the reduce for non-empty buckets. Note that for empty buckets, the accounting already happens iteratively. Similar to elastic#101012
Adjust DateHistogram's consumeBucketsAndMaybeBreak to be called iteratively during reduce instead of accounting for all buckets at the end of the reduce.
With many non-empty buckets, accounting for the number of buckets only at the end of the reduce may be too late: Elasticsearch may already have failed with an OOME. This change makes the accounting happen iteratively during the reduce for non-empty buckets.
Note that for empty buckets, the accounting already happens iteratively.