Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update bucket metric pipeline agg paths to allow intermediate single bucket and bucket qualified multi-bucket aggs #85729

Merged

Conversation

benwtrent
Copy link
Member

Bucket metric pipeline aggregation paths currently always expect the sibling aggregation to be a multi-bucket agg.

This honestly doesn't have to be the case for bucket metric pipeline aggs to work.

Consider the following path:

filter_agg>filters_agg['bucket_foo']>histo>some_metric_agg

Since filter_agg>filters_agg['bucket_foo'] are well defined and are not crossing bucket threasholds, metrics should still be able to be calculated against the bucket values for histo

This commit allows any combination of single bucket aggs (e.g. filter) and bucket specific multi-bucket aggs before reaching the desired multi-bucket used for the metric calculation.

…bucket and bucket qualified multi-bucket aggs
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 6, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @benwtrent, I've created a changelog YAML for you.

Copy link
Member

@not-napoleon not-napoleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to think about this one a bit more, will take another pass at it tomorrow. Here's some initial feedback.

(aggBuilder.get().bucketCardinality() == AggregationBuilder.BucketCardinality.MANY
&& AggregationPath.pathElementContainsBucketKey(currentAgg))
// Is this a single bucket agg with sub-aggs?
|| (aggBuilder.get().bucketCardinality() != AggregationBuilder.BucketCardinality.MANY
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this branch check if it's exactly BucketCardinality.ONE? We can't path further into an agg with BucketCardinality.NONE, unless I'm missing something?

// Dig through the aggregation tree to find the first aggregation specified by the path.
// The path may have many single bucket aggs (with sub-aggs) or many multi-bucket aggs specified by bucket keys
while (aggBuilder.isPresent() && pathPos < path.size() && (
// is this a multi-bucket agg with a bucket key specified?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formatting here is pretty harsh. I've been told that having the comments inside the conditional like this may be confusing the formatter, can we try putting them before the while loop and see if this is any better? I'm open to other ideas too, but this is a mess to read right now.

Aggregation currentAgg = aggregation;
while (currElement < parsedPath.size() - 1) {
if (currentAgg == null) {
throw new AggregationExecutionException(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be IllegalArgumentExceptions, which render as 400s. Or we can change how AggregationExecutionException reports its status code, but that seems out of scope. I've been meaning to overhaul how we communicate errors...

@benwtrent benwtrent requested a review from not-napoleon May 3, 2022 19:46
Copy link
Member

@not-napoleon not-napoleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

@benwtrent
Copy link
Member Author

@elasticmachine update branch

@benwtrent benwtrent added v8.4.0 and removed v8.3.0 labels May 25, 2022
@benwtrent
Copy link
Member Author

@elasticmachine update branch

@benwtrent
Copy link
Member Author

@elasticmachine update branch

@benwtrent benwtrent merged commit b4668ca into elastic:master May 25, 2022
@benwtrent benwtrent deleted the feature/refactor-buckets-path-resolver branch May 25, 2022 19:25
salvatore-campagna pushed a commit to salvatore-campagna/elasticsearch that referenced this pull request May 26, 2022
…bucket and bucket qualified multi-bucket aggs (elastic#85729)

Bucket metric pipeline aggregation paths currently always expect the sibling aggregation to be a multi-bucket agg. 

This honestly doesn't have to be the case for bucket metric pipeline aggs to work. 

Consider the following path:

```
filter_agg>filters_agg['bucket_foo']>histo>some_metric_agg
```

Since `filter_agg>filters_agg['bucket_foo']` are well defined and are not crossing bucket threasholds, metrics should still be able to be calculated against the bucket values for `histo`

This commit allows any combination of single bucket aggs (e.g. filter) and bucket specific multi-bucket aggs before reaching the desired multi-bucket used for the metric calculation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.4.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants