Allow to specify custom tags to decide the assignment of servers #30
You should add some integration test cases.
Yes, I'm working on it. Sorry for not adding them in the first place.
Hmm... If we use HDFS to store data, the shuffle server will not store data,
server/src/main/java/org/apache/uniffle/server/ShuffleServer.java
client-mr/src/main/java/org/apache/hadoop/mapreduce/v2/app/RssMRAppMaster.java
client-spark/common/src/main/java/org/apache/spark/shuffle/RssSparkShuffleUtils.java
client-spark/common/src/main/java/org/apache/spark/shuffle/RssSparkShuffleUtils.java
@@ -56,6 +58,9 @@ public class RssClientConfig {
  // When the size of read buffer reaches the half of JVM region (i.e., 32m),
  // it will incur humongous allocation, so we set it to 14m.
  public static String RSS_CLIENT_READ_BUFFER_SIZE_DEFAULT_VALUE = "14m";
  // The tags specified by rss client to determine shuffle data placement.
  public static String RSS_CLIENT_DATA_PLACEMENT_TAGS = "rss.client.data.placement.tags";
Could we give it a better name?
The reason is as below:
#30 (comment)
rss.client.shuffle-server.assignment.tags?
rss.client.assignment.tags would be better.
integration-test/common/src/test/java/org/apache/uniffle/test/AssignmentWithTagsTest.java
@@ -56,6 +58,9 @@ public class RssClientConfig {
  // When the size of read buffer reaches the half of JVM region (i.e., 32m),
  // it will incur humongous allocation, so we set it to 14m.
  public static String RSS_CLIENT_READ_BUFFER_SIZE_DEFAULT_VALUE = "14m";
  // The tags specified by rss client to determine shuffle data placement.
  public static String RSS_CLIENT_DATA_PLACEMENT_TAGS = "rss.client.data.placement.tags";
  public static String RSS_CLIENT_DATA_PLACEMENT_TAGS_DEFAULT_VALUES = Constants.SHUFFLE_SERVER_VERSION;
A shuffle server can have multiple tags. The version tag shouldn't be overridden by the configurable tags.
OK, I will remove the default value of this conf.
Actually, our client tags should always contain the version tag.
I see your point.
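To make the point of this thread concrete, here is a minimal sketch of how user-configured tags could extend the version tag instead of replacing it. It is not the code from this PR: the class `ClientTagsSketch`, the method `buildAssignmentTags`, the comma-separated tag format, and the placeholder value of `VERSION_TAG` (a stand-in for `Constants.SHUFFLE_SERVER_VERSION`, whose real value is not shown here) are all assumptions for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Uniffle code base.
class ClientTagsSketch {
  // Placeholder for Constants.SHUFFLE_SERVER_VERSION (real value not shown in this thread).
  static final String VERSION_TAG = "ss_v_example";

  // Parses a comma-separated tag list (an assumed format for the client tags conf)
  // and always adds the version tag, so custom tags never override it.
  static Set<String> buildAssignmentTags(String rawTags) {
    Set<String> tags = new LinkedHashSet<>();
    if (rawTags != null) {
      for (String tag : rawTags.split(",")) {
        String trimmed = tag.trim();
        if (!trimmed.isEmpty()) {
          tags.add(trimmed);
        }
      }
    }
    // The version tag is mandatory whether or not the user configured any tags.
    tags.add(VERSION_TAG);
    return tags;
  }

  public static void main(String[] args) {
    System.out.println(buildAssignmentTags("dc-a, ssd"));
    System.out.println(buildAssignmentTags(null));
  }
}
```

With this shape, leaving the conf unset still yields a tag set containing only the version tag, which matches the outcome discussed above of dropping the default value from the conf itself.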
Codecov Report
@@ Coverage Diff @@
## master #30 +/- ##
============================================
+ Coverage 56.83% 56.87% +0.03%
- Complexity 1204 1207 +3
============================================
Files 152 152
Lines 8401 8429 +28
Branches 813 816 +3
============================================
+ Hits 4775 4794 +19
- Misses 3368 3376 +8
- Partials 258 259 +1
LGTM, thanks for your contribution.
Would you like to add some docs about this feature in another PR? Because this feature brings some user-facing changes, we need some docs.
Glad to. I will open another PR for that.
What changes were proposed in this pull request?
Allow specifying custom tags to decide data placement.
Changelog
rss.server.tags for shuffle server
rss.client.assignment.tags for client to choose the data placement
Why are the changes needed?
Sometimes we want the shuffle data of a specific Spark/MR job to be stored on a specific group of shuffle servers, for example because of DC location.
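The idea can be sketched as a filter over the registered shuffle servers. The example below is a rough illustration, not the coordinator logic from this PR: the class `TagAssignmentSketch`, the method `eligibleServers`, the server names, the tag values, and the rule that a server qualifies only when it carries every tag the client requests are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of tag-based server selection; not the actual Uniffle implementation.
class TagAssignmentSketch {

  // Returns the ids of servers whose tag set contains every tag required by the client.
  // The "must contain all" rule is an assumption made for this example.
  static List<String> eligibleServers(Map<String, Set<String>> serverTags, Set<String> requiredTags) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, Set<String>> entry : serverTags.entrySet()) {
      if (entry.getValue().containsAll(requiredTags)) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Each server reports its tags (what rss.server.tags would carry, plus a version tag).
    Map<String, Set<String>> serverTags = new LinkedHashMap<>();
    serverTags.put("server-1", new HashSet<>(Arrays.asList("ss_v_example", "dc-a")));
    serverTags.put("server-2", new HashSet<>(Arrays.asList("ss_v_example", "dc-b")));
    serverTags.put("server-3", new HashSet<>(Arrays.asList("ss_v_example", "dc-a", "ssd")));

    // A job that wants its shuffle data kept in data center "dc-a".
    Set<String> clientTags = new HashSet<>(Arrays.asList("ss_v_example", "dc-a"));
    System.out.println(eligibleServers(serverTags, clientTags));
  }
}
```

A job that requests only the version tag would match all three servers in this example, so untagged jobs keep their previous behavior under this assumed rule.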
Does this PR introduce any user-facing change?
Yes.
How was this patch tested?
UTs.