Grouping in range vector #2884

Closed
cyriltovena opened this issue Nov 5, 2020 · 4 comments · Fixed by #3030

@cyriltovena
Contributor

LogQL v2 introduces new range vector aggregations that operate on label values instead of log lines, using unwrap. These values can be extracted from the logs themselves, so they can be high cardinality.

We added a by/without grouping clause to the non-associative operations (avg_over_time, quantile_over_time, stddev_ove...).

We didn't add it for rate, count_over_time, bytes_rate, or even max_over_time, because Prometheus doesn't have it.

Adding it would allow us to:

  • Reduce the labels required at the edges (ingester, querier storage) and so improve performance.
  • Avoid parsing all labels.
  • Write queries more quickly: rate() by (cluster) instead of sum by (cluster) (rate()).
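
To make the last point concrete, here is a minimal Python sketch (not Loki code) of why grouping inside the range aggregation gives the same result as wrapping it in sum by (...) for rate. The series data, the label names, the 1m range, and both helper functions are hypothetical; a series is modelled as a label set plus the number of log entries seen in the range.

```python
from collections import defaultdict

RANGE_SECONDS = 60  # assumed range, i.e. [1m]

# Hypothetical input: (labels, number of log entries within the range).
SERIES = [
    ({"cluster": "a", "pod": "p1"}, 120),
    ({"cluster": "a", "pod": "p2"}, 60),
    ({"cluster": "b", "pod": "p3"}, 30),
]


def group_key(labels, by):
    """Keep only the labels named in `by`, as a hashable tuple."""
    return tuple((name, labels[name]) for name in by if name in labels)


def sum_by_of_rate(series, by):
    """Models `sum by (...) (rate(...))`: one rate per series, then sum."""
    out = defaultdict(float)
    for labels, count in series:
        out[group_key(labels, by)] += count / RANGE_SECONDS
    return dict(out)


def rate_by(series, by):
    """Models `rate(...) by (...)`: drop extra labels first, then take the rate."""
    counts = defaultdict(int)
    for labels, count in series:
        counts[group_key(labels, by)] += count
    return {key: count / RANGE_SECONDS for key, count in counts.items()}


two_step = sum_by_of_rate(SERIES, by=["cluster"])
one_step = rate_by(SERIES, by=["cluster"])
print(two_step)
print(one_step)
```

In the second form only the cluster label has to be carried per entry, which is the performance argument made above.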
@owen-d
Member

owen-d commented Nov 18, 2020

Two things:

  1. I think the "auto parsers" like json and logfmt should accept optional parameters, e.g. | json foo bar, which would parse out only the desired fields (foo and bar in this case). This would help reduce cardinality when desired, and it happens at the edge (queriers).
  2. I'd like to be able to group by () in <agg>_over_time because we have no other way to run these operations and reduce all results to a single series.
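
The first suggestion can be sketched as follows. This is a simplified Python illustration, not the Loki parser: parse_fields, its wanted parameter, and the sample log line are all hypothetical, and only top-level JSON keys are handled.

```python
import json


def parse_fields(line, wanted=None):
    """Extract labels from a JSON log line.

    With wanted=None every top-level key becomes a label; otherwise only
    the requested keys are kept, mirroring the proposed `| json foo bar`.
    """
    record = json.loads(line)
    if wanted is None:
        return {key: str(value) for key, value in record.items()}
    return {key: str(record[key]) for key in wanted if key in record}


# Hypothetical log line with two high-cardinality fields we do not need.
line = '{"foo": "1", "bar": "x", "trace_id": "abc123", "user": "u42"}'
all_labels = parse_fields(line)
some_labels = parse_fields(line, wanted=["foo", "bar"])
print(sorted(all_labels))
print(sorted(some_labels))
```

Restricting the extracted keys keeps fields such as trace_id out of the label set, so they never contribute to series cardinality.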

@cyriltovena cyriltovena self-assigned this Nov 20, 2020
cyriltovena added a commit to cyriltovena/loki that referenced this issue Dec 3, 2020
…rations.

This essentially allows aggregating over all dimensions when using by (), while without () is a no-op.
This also makes it possible for the max and min range vector aggregations to use grouping, which is simpler than doing max by (foo) max_over_time(...)

Examples:
- `max_over_time(...) by ()` gives you the max over time across all series.
- `min_over_time(...) by (namespace)` gives you the min over time per namespace.
- `max_over_time(...) without (namespace)` gives you the max over time removing the namespace dimension.

PS: I've also slightly refactored how we optimize grouping to make it clearer.

Fixes grafana#2884

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
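
The grouping semantics described in the commit message above can be modelled with a short Python sketch. This is an illustration under stated assumptions, not the Loki implementation: the function names, the sample series, and the choice of plain dicts are all hypothetical, and each series is a label set plus the unwrapped samples that fall in the range.

```python
from collections import defaultdict


def group_key(labels, by=None, without=None):
    """Return the labels kept after grouping, as a sorted tuple."""
    if by is not None:
        kept = {k: v for k, v in labels.items() if k in by}
    elif without is not None:
        kept = {k: v for k, v in labels.items() if k not in without}
    else:
        kept = dict(labels)
    return tuple(sorted(kept.items()))


def max_over_time(series, by=None, without=None):
    """Max of all samples in the range, per group."""
    groups = defaultdict(list)
    for labels, samples in series:
        groups[group_key(labels, by, without)].extend(samples)
    return {key: max(values) for key, values in groups.items()}


# Hypothetical series: (labels, unwrapped sample values within the range).
SERIES = [
    ({"namespace": "dev", "pod": "p1"}, [3.0, 7.0]),
    ({"namespace": "dev", "pod": "p2"}, [5.0]),
    ({"namespace": "prod", "pod": "p3"}, [9.0, 1.0]),
]

overall = max_over_time(SERIES, by=[])                    # by (): one series
per_namespace = max_over_time(SERIES, by=["namespace"])   # by (namespace)
no_namespace = max_over_time(SERIES, without=["namespace"])
unchanged = max_over_time(SERIES, without=[])             # without (): no-op
print(overall)
print(per_namespace)
```

An empty by list keeps no labels, so every series falls into the same group; an empty without list removes nothing, so the result matches the ungrouped call.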
cyriltovena added a commit that referenced this issue Dec 7, 2020
)

* Allows by/without to be empty and available for max/min_over_time operations.

* nit doc.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
@chancez
Contributor

chancez commented Aug 18, 2022

You still can't group by in a sum(count_over_time(...)) by (...), so is there an alternative approach, or should this be re-opened?

@CrazyByDefault

Currently stuck trying to get a query to do this exact thing:

sum by (path) (count_over_time({some logql} | json | status!=200 | status!=404 | unwrap path [$__interval]))

@patsevanton
Contributor

For example:

sum(count_over_time({cluster=~".+"}[1m])) by (cluster)
