Remove the newer SqlQueryScheduler #21952
If it is a useful, functional piece, we should keep it. Who knows, maybe we will find a use for it!
Force-pushed from c54d2e5 to 395b46d.
The problem is that it's buggy, and it's not clear how much effort it would take to be confident enough to use it in production. If the feature were important for us now, I would 100% say let's keep it and take it that last mile. But it's not really a priority, so in the meantime it's tech debt that makes it harder to add features to the scheduler (you need to add everything in two places in case one day someone decides we should actually productionize and use the new scheduler).
Force-pushed from 395b46d to f24196e.
If it comes in handy later, we can simply recover it from Git history. I agree with @rschlussel that there are costs to maintaining this class. While the cost for this one class might be small, dead code across the whole project is a significant technical burden, and I think we should err on the side of removing unused functionality.
LGTM. Please take a look at the failing tests as well.
"\n" +
"a. Print VLOG(2) and lower messages from mapreduce.{h,cc}\n" +
"b. Print VLOG(1) and lower messages from file.{h,cc}\n" +
"c. Print VLOG(3) and lower messages from files prefixed with \"gfs\"\n" +
nit. stray spaces?
Remove the "new" SqlQueryScheduler, which was created to enable stage retries, but was never rolled out and is no longer important because of Presto-on-Spark. Rename LegacySqlQueryScheduler to SqlQueryScheduler.
Force-pushed from f24196e to 8bc6081.
LGTM
Remove the "new" SqlQueryScheduler, which was created to support stage retries, but was never rolled out and is no longer important because of Presto-on-Spark. Rename LegacySqlQueryScheduler to SqlQueryScheduler.
Description
Remove the new SqlQueryScheduler and rename LegacySqlQueryScheduler back to SqlQueryScheduler. Remove the corresponding session properties, which are no longer relevant.
Motivation and Context
The new SqlQueryScheduler, which was built to enable stage retries when materializing exchanges, has never been rolled out and has an unresolved bug that causes high CPU usage (it was disabled in #14264, and an attempt to fix it in #14879 did not resolve the issue).
The new scheduler was deprioritized with the investment in Presto-on-Spark. This PR removes the "new" scheduler, which has been tech debt for a while, since we are not investing in fixing it and making it stable for production.
The LegacySqlQueryScheduler has been the default, so there is no user-facing change other than removing the ability to try out the new scheduler.
Fixes #21886
Impact
Removes the ability to use the experimental SqlQueryScheduler, which allowed for stage retries.
Test Plan
CI
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.