Add spilling support for aggregations with sorting/ordering. #7455
Sorted aggregations need to accumulate all input rows, sort them within each group, and then compute the aggregations over the sorted rows. For spilling purposes, we can add an Accumulator::extractFunction that returns an array of structs representing all the input rows for a group.
We can serialize rows as strings (VARBINARY) to avoid converting to and from columnar format.
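To make the idea concrete, here is a minimal Python sketch of the flow described above: buffer rows per group, spill them as opaque byte strings, then restore, sort, and aggregate. It is not Velox code. `SortedAggAccumulator`, `extract_for_spill`, `add_spilled`, and `aggregate_with_spill` are hypothetical names, `pickle` stands in for the row serialization, and an in-memory dict stands in for the spill file.

```python
import pickle
from collections import defaultdict


class SortedAggAccumulator:
    """Hypothetical per-group accumulator; an illustration, not Velox's API."""

    def __init__(self):
        self.rows = []  # buffered (value, sort_key) pairs

    def add(self, value, sort_key):
        self.rows.append((value, sort_key))

    def extract_for_spill(self):
        # One opaque byte string per row, analogous to an array of VARBINARY.
        return [pickle.dumps(row) for row in self.rows]

    def add_spilled(self, serialized_rows):
        # Restore previously spilled rows into this accumulator.
        self.rows.extend(pickle.loads(blob) for blob in serialized_rows)

    def finish(self):
        # Sort within the group, then run an order-sensitive aggregate
        # (here: collect the values in sorted order, like array_agg).
        ordered = sorted(self.rows, key=lambda row: row[1])
        return [value for value, _ in ordered]


def aggregate_with_spill(rows, spill_after):
    """rows: iterable of (group_key, value, sort_key).

    Spills every in-memory group after each `spill_after` input rows.
    """
    groups = defaultdict(SortedAggAccumulator)
    spilled = defaultdict(list)  # stand-in for a spill file
    for i, (group_key, value, sort_key) in enumerate(rows, start=1):
        groups[group_key].add(value, sort_key)
        if i % spill_after == 0:
            for key, acc in groups.items():
                spilled[key].extend(acc.extract_for_spill())
            groups.clear()

    result = {}
    for key in set(groups) | set(spilled):
        acc = groups[key]  # creates an empty accumulator if fully spilled
        acc.add_spilled(spilled.get(key, []))
        result[key] = acc.finish()
    return result
```

Because sorting happens only in `finish`, the spilled rows do not need to be kept in any particular order; the result is the same whether or not a spill occurred.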
This was referenced Nov 10, 2023
mbasmanova added a commit to mbasmanova/velox-1 that referenced this issue on Nov 13, 2023:
Summary: Add spillType field to Accumulator struct and use it to generate RowType for spilling. This allows to provide spilling support for Accumulators that do not correspond to aggregate functions, i.e. SortedAggregations.
Part of facebookincubator#7455
Reviewed By: xiaoxmeng
Differential Revision: D51230793
Pulled By: mbasmanova
facebook-github-bot pushed a commit that referenced this issue on Nov 13, 2023:
…7519)
Summary: Add APIs to RowContainer to extract rows in serialized format. Will be used in spilling, initially in spilling of aggregation over sorted inputs.
Part of #7455
Pull Request resolved: #7519
Reviewed By: xiaoxmeng
Differential Revision: D51213589
Pulled By: mbasmanova
fbshipit-source-id: 6b0d5fc03b7bb301ae229af509143d2a1c14ab55
facebook-github-bot pushed a commit that referenced this issue on Nov 13, 2023:
Summary: Add spillType field to Accumulator struct and use it to generate RowType for spilling. This allows to provide spilling support for Accumulators that do not correspond to aggregate functions, i.e. SortedAggregations.
Part of #7455
Pull Request resolved: #7525
Reviewed By: xiaoxmeng
Differential Revision: D51230793
Pulled By: mbasmanova
fbshipit-source-id: 38b1a2a389e96bc90e2d3291dc15b9b2fef191d5
Description
Currently we don't support spilling if an aggregation node has aggregations with sorting/ordering, like this:
SELECT count(c0 ORDER BY c2) FROM tmp GROUP BY c1;
We need to add this support if we see such queries breaching memory limits.
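For context on why such queries can hit memory limits, here is a rough Python sketch (an illustration only, not how Velox stores state) contrasting the per-group state of a plain count(c0) with that of count(c0 ORDER BY c2). The function names are hypothetical, and rows are assumed to be (c0, c1, c2) tuples with non-null integer c2.

```python
from collections import defaultdict


def plain_count_state(rows):
    # A plain count needs one integer per group, regardless of input size.
    counts = defaultdict(int)
    for c0, c1, c2 in rows:
        if c0 is not None:
            counts[c1] += 1
    return dict(counts)


def sorted_count_state(rows):
    # With ORDER BY, every input row is buffered until the group is complete,
    # so the state grows with the number of input rows.
    buffered = defaultdict(list)
    for c0, c1, c2 in rows:
        buffered[c1].append((c2, c0))
    return dict(buffered)


def finish_sorted_count(buffered):
    # Sort each group's rows by c2, then count the non-null c0 values.
    result = {}
    for group, group_rows in buffered.items():
        ordered = sorted(group_rows, key=lambda row: row[0])
        result[group] = sum(1 for _, c0 in ordered if c0 is not None)
    return result
```

Both paths produce the same counts here, but the sorted variant holds every row in memory until the end, which is the state a spill would need to write out.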