[Abandoned] Optimize DistinctLimit for small limit values #17374

Closed
wants to merge 1 commit into from

Conversation

kaikalur
Contributor

@kaikalur kaikalur commented Feb 28, 2022

This PR introduces a notion of "fast limits" for small limit values (default threshold: 10k). As a first optimization, we disable the hash generation optimization for simple distinct limit operations so that the exchange won't block. The user starts seeing results as soon as the first value is seen, even when the number of distinct values is less than the limit. This helps interactive querying use cases.
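
To illustrate the intended behavior (not the actual Presto operator code), here is a minimal Python sketch of a streaming distinct limit. The names `FAST_LIMIT_THRESHOLD`, `is_fast_limit`, and `streaming_distinct_limit` are hypothetical, and treating the 10k threshold as inclusive is an assumption.

```python
from typing import Hashable, Iterable, Iterator

# Assumed default, taken from the "10k" mentioned in the description.
FAST_LIMIT_THRESHOLD = 10_000


def is_fast_limit(limit: int, threshold: int = FAST_LIMIT_THRESHOLD) -> bool:
    """Hypothetical check: treat limits at or below the threshold as "fast"."""
    return 0 <= limit <= threshold


def streaming_distinct_limit(rows: Iterable[Hashable], limit: int) -> Iterator[Hashable]:
    """Yield each distinct row the first time it is seen; stop after `limit` rows.

    Rows are emitted immediately instead of being buffered, so a consumer
    sees the first result as soon as the first input row arrives.
    """
    if limit <= 0:
        return
    seen = set()
    for row in rows:
        if row in seen:
            continue  # duplicate: nothing new to emit
        seen.add(row)
        yield row
        if len(seen) >= limit:
            return  # limit reached: stop pulling from upstream
```

For example, `list(streaming_distinct_limit([("a", 1), ("a", 1), ("b", 2), ("c", 3)], 2))` yields the first two distinct rows without reading the last one.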

#17328

Test plan - N/A

== RELEASE NOTES ==

General Changes
* Simple queries like `SELECT DISTINCT c1, c2, ... FROM T WHERE ... LIMIT 1000` will now start streaming results as soon as the first value is available.

@kaikalur kaikalur requested review from highker and rongrong February 28, 2022 20:02
@kaikalur
Contributor Author

We will revert/clean up the previous hash-based distinct limit PR once this is merged, as it is not helping as much as I expected.

@kaikalur kaikalur requested a review from mbasmanova February 28, 2022 20:12
@kaikalur kaikalur changed the title Optimize some operations for small limit values Optimize DistinctLimit for small limit values Feb 28, 2022
@kaikalur
Contributor Author

Abandoning this. Trying to find other ways of improving distinct limit :)

@kaikalur kaikalur changed the title Optimize DistinctLimit for small limit values [Abandoned] Optimize DistinctLimit for small limit values Mar 13, 2022
@stale

stale bot commented Sep 21, 2022

This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the task, make sure you've addressed reviewer comments, and rebase on the latest master. Thank you for your contributions!

@stale stale bot added the stale label Sep 21, 2022
@kaikalur kaikalur closed this Feb 3, 2023