Implement Spark executor node affinity using custom resource #366
Conversation
Force-pushed from d53f214 to b8c6e74.
Thanks for contributing the PR. This will be very useful. Can you please also update the document? Previously you introduced node affinity for the Spark driver along with a corresponding configuration.
Force-pushed from 90b43a8 to e6535aa.
@carsonwang Thanks for reviewing. Good point, I changed the config accordingly.
Why delete the test test_spark_on_fractional_custom_resource?
@kira-lin When doing my local test, I found that test interferes with the new tests -- it could be that the Spark context is only partially created. I can bring it back.
@pang-wu Look at raydp.yml:108; that test is run explicitly. The CI failure may be because no tests are selected.
Force-pushed from 13aa306 to 4e1ac6b.
Force-pushed from 4e1ac6b to e194fcd.
@kira-lin fixed.
@pang-wu thanks
* Implement Spark executor node affinity using custom resource
* Reduce test flakiness
* Update document
What Problem Does the PR Solve?
This PR enables raydp to pin Spark executors on a specific set of machines in a Ray cluster -- a feature we found useful in our production. It helps when the Ray cluster contains heterogeneous workloads (i.e. workloads using both Spark and native Ray) and the total resource is less than the maximum resource Spark could potentially request -- which is possible in the following cases. In both scenarios, we want to limit the scheduling of the Spark executor actors to a subset of machines, so that the Spark job does not take all resources in the Ray cluster and starve other Ray workloads.
Another scenario where this feature is useful is when the Spark cluster needs to be scheduled on special nodes, e.g. spot vs. on-demand instances.
This feature could also benefit multi-tenant Ray clusters where different users want to run their jobs on different node groups.
With the new feature, we can define a set of machines in the Ray cluster that are used only for scheduling Spark executors, for example:
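As a minimal sketch (the resource name `spark_executor` and the amounts are placeholders, not names taken from this PR), the chosen nodes are tagged with a custom Ray resource:

```python
# On a multi-node cluster the custom resource is typically declared when
# starting each worker node that should host Spark executors, e.g.:
#   ray start --address=<head_ip>:6379 --resources='{"spark_executor": 10}'
# For a quick local test, the same resource can be declared via ray.init:
import ray

ray.init(resources={"spark_executor": 10})

# The custom resource should now show up in the cluster's resource view,
# e.g. {'CPU': ..., 'spark_executor': 10.0, ...}
print(ray.cluster_resources())
```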
Then, when initializing the Spark session:
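A hedged sketch of the Spark-session side: each executor actor requests one unit of the custom `spark_executor` resource, so Ray only places executors on the tagged nodes. The config key shown here is an assumption for illustration; the exact key introduced by this PR (mirroring the earlier driver-side node-affinity config) should be taken from the updated RayDP documentation.

```python
import ray
import raydp

ray.init(address="auto")

spark = raydp.init_spark(
    app_name="node_affinity_example",
    num_executors=2,
    executor_cores=2,
    executor_memory="4GB",
    configs={
        # Hypothetical key: request one unit of the custom resource per
        # executor actor so executors land only on the tagged nodes.
        "spark.ray.actor.resource.spark_executor": "1",
    },
)

# Any Spark work now runs on executors pinned to the tagged machines.
print(spark.range(0, 100).count())

raydp.stop_spark()
ray.shutdown()
```

With this setup, other Ray actors and tasks that do not request `spark_executor` are unaffected, while Spark executors cannot spill onto nodes that were not tagged.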