Add auto-tuning of max executor and hash partition #18716

Merged
merged 1 commit into from
Dec 19, 2022

Conversation

vermapratyush
Member

@vermapratyush vermapratyush commented Nov 22, 2022

Test Plan

Added unit tests for the modified classes.
Ran a Presto server locally and submitted queries against the "customers" table to confirm that the correct hash_partition_count is used.

Summary

Added two configuration options (spark_executor_allocation_strategy_enabled and spark_hash_partition_count_allocation_strategy_enabled) that auto-tune spark.dynamicAllocation.maxExecutors and hash_partition_count, respectively, based on the input table stats.

The auto-tune configuration options take precedence over explicitly provided values for spark.dynamicAllocation.maxExecutors and hash_partition_count.

If spark_resource_allocation_strategy_enabled is enabled, auto-tuning is treated as enabled for both the executor count and the hash partition count. This configuration option is redundant and could be removed in a subsequent revision.
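
As a rough illustration of how the three flags interact as described in this summary, here is a minimal Java sketch. It is not the code from this PR: the AllocationGating class, its method names, and the use of a plain Map in place of the Presto session are hypothetical, and treating an unset property as disabled is an assumption.

```java
import java.util.Map;

/**
 * Illustrative stand-in for the gating described in the summary.
 * Hypothetical class: a Map<String, Boolean> plays the role of the session.
 */
final class AllocationGating
{
    static final String RESOURCE = "spark_resource_allocation_strategy_enabled";
    static final String EXECUTOR = "spark_executor_allocation_strategy_enabled";
    static final String HASH_PARTITION = "spark_hash_partition_count_allocation_strategy_enabled";

    private AllocationGating() {}

    // Assumption: a property that is not set counts as disabled.
    static boolean isEnabled(Map<String, Boolean> session, String property)
    {
        return session.getOrDefault(property, false);
    }

    // maxExecutors is auto-tuned if the umbrella flag or the executor flag is on.
    static boolean shouldTuneMaxExecutors(Map<String, Boolean> session)
    {
        return isEnabled(session, RESOURCE) || isEnabled(session, EXECUTOR);
    }

    // hash_partition_count is auto-tuned if the umbrella flag or the hash partition flag is on.
    static boolean shouldTuneHashPartitionCount(Map<String, Boolean> session)
    {
        return isEnabled(session, RESOURCE) || isEnabled(session, HASH_PARTITION);
    }

    public static void main(String[] args)
    {
        // Only the hash partition flag is set in this example session.
        Map<String, Boolean> session = Map.of(HASH_PARTITION, true);
        System.out.println("tune maxExecutors: " + shouldTuneMaxExecutors(session));
        System.out.println("tune hash_partition_count: " + shouldTuneHashPartitionCount(session));
    }
}
```

With only the umbrella flag set, both methods return true, which is why the summary calls that flag redundant once the two specific flags exist.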

== RELEASE NOTES ==

General Changes
* Add property ``spark_executor_allocation_strategy_enabled`` to auto-tune the Spark max executor count (``spark.dynamicAllocation.maxExecutors``) based on input data. Only required if ``spark_resource_allocation_strategy_enabled`` is not already enabled.
* Add property ``spark_hash_partition_count_allocation_strategy_enabled`` to auto-tune hash partition count (``hash_partition_count``) based on input data. Only required if ``spark_resource_allocation_strategy_enabled`` is not already enabled.

@linux-foundation-easycla

linux-foundation-easycla bot commented Nov 22, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: vermapratyush / name: Pratyush Verma (1ca1e3296362bb6a3b977061034fd96af0843237)

@vermapratyush vermapratyush marked this pull request as ready for review November 22, 2022 23:03
@vermapratyush vermapratyush requested a review from a team as a code owner November 22, 2022 23:03
@v-jizhang
Contributor

@bot kick off tests

@vermapratyush
Member Author

@bot kick off tests

@vermapratyush vermapratyush force-pushed the exec-hash-alloc-prpty-1 branch from 8c076fd to d88f57f Compare December 5, 2022 16:33
@vermapratyush vermapratyush requested review from pgupta2 and removed request for presto-oss December 6, 2022 15:17
@ajaygeorge
Contributor

@vermapratyush Can you please sign the CLA

@ajaygeorge
Contributor

@vermapratyush Please squash the commits as well.

@vermapratyush vermapratyush force-pushed the exec-hash-alloc-prpty-1 branch from 15587ef to 3197e4a Compare December 6, 2022 21:00
@vermapratyush
Member Author

> @vermapratyush Can you please sign the CLA

@ajaygeorge Already signed the CLA, but somehow it still keeps failing.

@ajaygeorge
Contributor

> > @vermapratyush Can you please sign the CLA
>
> @ajaygeorge Already signed the CLA, but somehow it still keeps failing.

Can you try signing the individual contributor CLA as well, if you have not already?

@vermapratyush vermapratyush force-pushed the exec-hash-alloc-prpty-1 branch 2 times, most recently from 1ca1e32 to db890d3 Compare December 6, 2022 22:27
Contributor

@ajaygeorge ajaygeorge left a comment

Added some comments

@vermapratyush vermapratyush force-pushed the exec-hash-alloc-prpty-1 branch 2 times, most recently from e410779 to eb5f715 Compare December 8, 2022 01:23
@vermapratyush vermapratyush requested review from ajaygeorge and pgupta2 and removed request for pgupta2 and ajaygeorge December 8, 2022 01:55
@vermapratyush
Member Author

The failure is unrelated; it is in the presto-cache module.
Re-ran the test locally and it passed.

@vermapratyush vermapratyush force-pushed the exec-hash-alloc-prpty-1 branch from eb5f715 to 45c214d Compare December 9, 2022 11:50

double inputDataInBytes = new PrestoSparkSourceStatsCollector(metaData, session).collectSourceStats(plan);
if (!anyAllocationStrategyEnabled(session)) {
Contributor

We can check the main property here with isSparkResourceAllocationStrategyEnabled(session) and return early if it is disabled. That makes it a single switch that disables the allocation strategy completely.

My thinking is that spark_resource_allocation_strategy_enabled controls this whole logic. If it is disabled, we skip the logic completely. Only if spark_resource_allocation_strategy_enabled is true do we check the individual strategy properties and trigger whichever one is enabled.

I know this is a bit weird. Ideally, we should just remove the spark_resource_allocation_strategy_enabled session property to make this logic cleaner.

@highker: What is the right approach to deprecating/removing a property?

Contributor

> What is the right approach of deprecating/removing a property?

  • We use the @DefunctConfig annotation in a config class. Usually, we also rename the property to deprecated.XXX. Check FeaturesConfig for examples. We don't usually delete the property directly.
  • We add a release note telling users which config the deprecated one has migrated to.

return defaultResourceSettings;
}
// update hashPartitionCount only if resource allocation or hash partition allocation is enabled
if (isSparkResourceAllocationStrategyEnabled(session) || isSparkHashPartitionCountAllocationStrategyEnabled(session)) {
Contributor

The check for isSparkResourceAllocationStrategyEnabled(session) can be removed here.

Member Author

Will deprecate this config in a subsequent PR.

Contributor

@pgupta2 pgupta2 Dec 16, 2022

Should we make it an AND condition? That would let us control the individual optimizations separately. Otherwise, if isSparkResourceAllocationStrategyEnabled(session) is true, both optimizations will run regardless of the individual properties.
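
To make the difference between the two conditions concrete, here is a small Java sketch that evaluates both for every flag combination. The GatingComparison class and its method names are hypothetical and exist only for illustration; the two boolean parameters stand in for the results of isSparkResourceAllocationStrategyEnabled(session) and one of the individual strategy checks.

```java
/**
 * Hypothetical comparison of the OR condition in the diff with the
 * suggested AND condition. Not part of the Presto code base.
 */
final class GatingComparison
{
    private GatingComparison() {}

    // Condition as written in the diff: either flag alone enables the optimization.
    static boolean orGate(boolean resourceStrategy, boolean individualStrategy)
    {
        return resourceStrategy || individualStrategy;
    }

    // Suggested alternative: the main flag must be on, and the individual flag selects the optimization.
    static boolean andGate(boolean resourceStrategy, boolean individualStrategy)
    {
        return resourceStrategy && individualStrategy;
    }

    public static void main(String[] args)
    {
        boolean[] values = {false, true};
        // Print one row per combination of the two flags.
        for (boolean resource : values) {
            for (boolean individual : values) {
                System.out.printf(
                        "resource=%b individual=%b -> OR=%b AND=%b%n",
                        resource,
                        individual,
                        orGate(resource, individual),
                        andGate(resource, individual));
            }
        }
    }
}
```

The two conditions differ whenever exactly one of the flags is set: OR runs the optimization in both of those cases, while AND runs it in neither.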

log.warn(String.format("Failed to retrieve correct size, data read=%.2f, skipping automatic resource tuning.", inputDataInBytes));
return DISABLED_PHYSICAL_RESOURCE_SETTING;
// update maxExecutorCount only if resource allocation or executor allocation is enabled
if (isSparkResourceAllocationStrategyEnabled(session) || isSparkExecutorAllocationStrategyEnabled(session)) {
Contributor

Same as the comment above.

Member Author

Will deprecate this config in a subsequent PR.

Contributor

@ajaygeorge ajaygeorge left a comment

LGTM % a few nits.

Contributor

@pgupta2 pgupta2 left a comment

Just 1 final comment.

@vermapratyush
Member Author

@pgupta2 I don't want to couple isSparkResourceAllocationStrategyEnabled and isSparkHashPartitionCountAllocationStrategyEnabled, since isSparkResourceAllocationStrategyEnabled will be going away in a subsequent PR anyway.

@vermapratyush vermapratyush force-pushed the exec-hash-alloc-prpty-1 branch 2 times, most recently from 4df8b74 to 6a7b385 Compare December 16, 2022 18:59
@vermapratyush vermapratyush force-pushed the exec-hash-alloc-prpty-1 branch from 6a7b385 to 3ce7856 Compare December 16, 2022 19:48
@rschlussel rschlussel merged commit ddca5d1 into prestodb:master Dec 19, 2022
@wanglinsong wanglinsong mentioned this pull request Jan 12, 2023
30 tasks
@vermapratyush vermapratyush deleted the exec-hash-alloc-prpty-1 branch April 7, 2023 20:03