-
Notifications
You must be signed in to change notification settings - Fork 153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#1554] feat(spark): Fetch dynamic client conf as early as possible #1557
Conversation
Test Results 2 340 files ± 0 2 340 suites ±0 4h 33m 42s ⏱️ + 2m 43s Results for commit 33ac575. ± Comparison against base commit 9737d57. This pull request removes 1 and adds 5 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
@@ -78,6 +78,7 @@ public void testCreateFallback() throws Exception { | |||
conf.set(RssSparkConfig.RSS_DYNAMIC_CLIENT_CONF_ENABLED.key(), "false"); | |||
conf.set(RssSparkConfig.RSS_ACCESS_ID.key(), "mockId"); | |||
conf.set(RssSparkConfig.RSS_ENABLED.key(), "true"); | |||
conf.set(RssSparkConfig.RSS_STORAGE_TYPE.key(), "MEMORY_LOCALFILE"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to add this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test further down asserts an exception is thrown. Moving RssSparkShuffleUtils.validateRssClientConf(sparkConf)
up in the RssShuffleManager
constructor throws a different exception if the storage type is not set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coordinator can apply the storage type to the client. Will this pull request affect this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test calls into new RssShuffleManager(sparkConf, false)
, which mimics creating the shuffle manager on an executor, and executors do not fetch dynamic config, so the given sparkConf
must have the storage type set for the test.
The logic of applying config is unchanged, this only happens earlier. I presume there are tests for this use case, which should check the desired behaviour is preserved.
client-spark/spark3/src/main/java/org/apache/spark/shuffle/RssShuffleManager.java
Outdated
Show resolved
Hide resolved
client-spark/spark3/src/main/java/org/apache/spark/shuffle/RssShuffleManager.java
Outdated
Show resolved
Hide resolved
client-spark/common/src/main/java/org/apache/spark/shuffle/RssSparkShuffleUtils.java
Show resolved
Hide resolved
And I think the PR title is not proper for this change, which don't reflect the feature of overwriting the existing client conf. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1557 +/- ##
============================================
+ Coverage 53.96% 54.94% +0.98%
- Complexity 2858 2862 +4
============================================
Files 438 418 -20
Lines 24793 22455 -2338
Branches 2109 2111 +2
============================================
- Hits 13379 12339 -1040
+ Misses 10576 9348 -1228
+ Partials 838 768 -70 ☔ View full report in Codecov by Sentry. |
this is not new, this existed before |
b687162
to
5d5cd9d
Compare
client-spark/spark3/src/main/java/org/apache/spark/shuffle/RssShuffleManager.java
Show resolved
Hide resolved
187ed27
to
d97f826
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Could you help resolve the conflict? @EnricoMi |
d97f826
to
cc564bb
Compare
cc564bb
to
33ac575
Compare
Fixed. |
What changes were proposed in this pull request?
Fetches dynamic client config as early as possible, to be able to use dynamic client config to create the shuffle client with updated config.
Why are the changes needed?
Providing config or the shuffle client via coordinator is operationally useful as cluster-wide settings can be deployed through the cluster and changed over time. Clients and apps do not need to change configs.
Fixes Spark part of #1554
Does this PR introduce any user-facing change?
More configs can be provided via coordinators.
How was this patch tested?
Existing and follow-up unit tests.