[FEA] Have a repartition fallback for hash aggregates instead of sort #10370

Closed
revans2 opened this issue Feb 2, 2024 · 3 comments
Labels
duplicate This issue or pull request already exists performance A performance related task/issue

Comments

@revans2
Collaborator

revans2 commented Feb 2, 2024

Is your feature request related to a problem? Please describe.
Hash partitioning is faster than sorting. In GpuHashAggregate, if the intermediate results grow too large, we fall back to sorting them and then do a final pass to produce the output.

In many cases it might be good to have a heuristic that checks whether the aggregation actually made the output smaller, so we can decide whether to do #7404 instead. But when the output is getting a lot smaller than the input, or when it is not a partial aggregate, we should probably hash partition the intermediate results instead of sorting them.
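To make the idea concrete, here is a minimal CPU-side sketch in Python of a repartition fallback. It is not the GpuHashAggregate code: `merge_batch`, `hash_repartition`, `aggregate`, `max_rows`, `num_partitions`, and `MAX_DEPTH` are all hypothetical names made up for illustration, and a row-count limit stands in for the real memory budget. When the intermediate results are too large to merge in one pass, they are split by a hash of the key so that each partition holds a disjoint set of keys and can be merged independently, with no sort.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

Row = Tuple[Hashable, int]

# Hypothetical cap on repartition depth so heavily skewed keys cannot recurse forever.
MAX_DEPTH = 8


def merge_batch(rows: Iterable[Row]) -> Dict[Hashable, int]:
    """Sum values per key; stands in for one hash-aggregation pass."""
    out: Dict[Hashable, int] = defaultdict(int)
    for key, value in rows:
        out[key] += value
    return dict(out)


def hash_repartition(rows: Iterable[Row], num_partitions: int, seed: int) -> List[List[Row]]:
    """Split rows by hash of the key; every row for a given key lands in one partition."""
    parts: List[List[Row]] = [[] for _ in range(num_partitions)]
    for key, value in rows:
        # A different seed per level keeps keys from clumping the same way again.
        parts[hash((seed, key)) % num_partitions].append((key, value))
    return parts


def _merge(rows: List[Row], max_rows: int, num_partitions: int, depth: int) -> Dict[Hashable, int]:
    # Small enough (or too deep): merge directly in a single pass.
    if len(rows) <= max_rows or depth >= MAX_DEPTH:
        return merge_batch(rows)
    # Fallback: repartition by key hash instead of sorting, then merge each partition.
    result: Dict[Hashable, int] = {}
    for part in hash_repartition(rows, num_partitions, seed=depth):
        # Partitions hold disjoint key sets, so update() never overwrites a key.
        result.update(_merge(part, max_rows, num_partitions, depth + 1))
    return result


def aggregate(batches: Iterable[Iterable[Row]], max_rows: int, num_partitions: int = 4) -> Dict[Hashable, int]:
    """Aggregate each batch, then merge the intermediate results with the fallback."""
    intermediate: List[Row] = []
    for batch in batches:
        intermediate.extend(merge_batch(batch).items())
    return _merge(intermediate, max_rows, num_partitions, depth=0)
```

Calling `aggregate(batches, max_rows=2)` on a few small batches forces the fallback path and should give the same totals as a single-pass merge. The depth cap is only there so that a skewed partition eventually gets merged directly rather than repartitioned again.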

@revans2 revans2 added feature request New feature or request ? - Needs Triage Need team to review and classify labels Feb 2, 2024
@mattahrens mattahrens added performance A performance related task/issue and removed feature request New feature or request ? - Needs Triage Need team to review and classify labels Feb 6, 2024
@binmahone binmahone self-assigned this May 23, 2024
@binmahone
Collaborator

Is this a duplicate of #8391?

@revans2
Collaborator Author

revans2 commented Jun 12, 2024

Yup, I filed the same thing twice. Feel free to dupe this to the other or vice versa.

@binmahone
Collaborator

Closing this as it duplicates #8391

@sameerz sameerz added the duplicate This issue or pull request already exists label Jul 15, 2024