Implement Spark executor node affinity using custom resource #366

Merged: 3 commits merged into oap-project:master on Jul 25, 2023

Conversation

@pang-wu (Contributor) commented Jul 17, 2023

What Problem Does This PR Solve?

This PR enables RayDP to pin Spark executors to a specific set of machines in a Ray cluster -- a feature we found useful in production. It helps when the Ray cluster runs heterogeneous workloads (i.e. both Spark and native Ray jobs) and the total resources are less than the maximum Spark could potentially request, which can happen in the following cases:

  • The user specifies a large number of Spark executors, more than the total number of CPUs in the cluster. Other Ray workloads can then starve because Spark occupies all resources.
  • Dynamic allocation requests more actors than there are CPUs, which is possible when dynamic allocation starts up.

In both scenarios, we want to limit scheduling of the Spark executor actors to a subset of machines so that the Spark job does not take all resources in the Ray cluster and starve other Ray workloads.
Another scenario where this feature is useful is when the Spark cluster needs to be scheduled on special nodes, e.g. spot vs. on-demand instances.

This feature can also benefit multi-tenant Ray clusters where different users want to run their jobs on different node groups.
With the new feature, we can define a set of machines in the Ray cluster that is reserved for scheduling Spark executors, for example:

# Ray cluster config
  # ....
  spark_on_spot:  # Spark-only nodes
    resources:
      spark_executor: 100  # custom resource indicating this node group is for Spark only
    min_workers: 2
    max_workers: 10  # changing this also requires changing the global max_workers
    node_config:
      # ....
  general_spot:  # nodes for general Ray workloads
    min_workers: 2
    max_workers: 10  # changing this also requires changing the global max_workers
    node_config:
      # ...

Then, when initializing the Spark session:

spark = raydp.init_spark(app_name='RayDP Example',
                         num_executors=executor_count,
                         executor_cores=3,
                         executor_memory=1 * 1024 * 1024 * 1024,
                         configs={
                             ...
                             # schedule executors on nodes with the custom resource spark_executor
                             'spark.ray.raydp_spark_executor.actor.resource.spark_executor': 1,
                         })
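
As a quick sanity check (a minimal sketch, not part of this PR), you can confirm that the spark_executor custom resource from the cluster config above is actually visible to Ray; the expected total of 200 assumes the two min_workers of the spark_on_spot group, each advertising 100 units:

import ray

# Connect to the running Ray cluster.
ray.init(address='auto')

# cluster_resources() reports totals for all resources, including custom
# ones such as spark_executor defined in the node group config.
resources = ray.cluster_resources()
assert resources.get('spark_executor', 0) >= 200, \
    'spark_executor custom resource not found; check the spark_on_spot node group'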

@pang-wu force-pushed the pang/exec-affinity2 branch from d53f214 to b8c6e74 on July 17, 2023 08:12
@carsonwang (Collaborator) commented:

Thanks for contributing the PR. This will be very useful. Can you please also update the documentation? Previously you introduced node affinity for the Spark driver with a configuration like spark.ray.raydp_spark_master.actor.resource.spark_master. For this new configuration, which is for executors, should we add raydp_executor to the name to be consistent, like spark.ray.raydp_executor.actor.resource.spark_executor?

@pang-wu force-pushed the pang/exec-affinity2 branch 2 times, most recently from 90b43a8 to e6535aa on July 18, 2023 08:09
@pang-wu (Contributor, Author) commented Jul 18, 2023

@carsonwang Thanks for reviewing. Good point -- I changed the config to spark.ray.raydp_spark_executor.actor.resource.* and also updated the doc.
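
For reference, a minimal sketch of how the driver and executor affinity settings line up under the final naming; the spark_master custom resource comes from the earlier driver-affinity feature mentioned above and is assumed to exist in the cluster config, as is the executor count of 2:

import raydp

spark = raydp.init_spark(app_name='RayDP Example',
                         num_executors=2,
                         executor_cores=3,
                         executor_memory=1 * 1024 * 1024 * 1024,
                         configs={
                             # pin the Spark master actor to nodes advertising spark_master
                             'spark.ray.raydp_spark_master.actor.resource.spark_master': 1,
                             # pin executor actors to nodes advertising spark_executor (this PR)
                             'spark.ray.raydp_spark_executor.actor.resource.spark_executor': 1,
                         })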

@kira-lin (Collaborator) commented:

Why delete the test test_spark_on_fractional_custom_resource?

@pang-wu (Contributor, Author) commented Jul 19, 2023

@kira-lin When running my local tests, I found that test interferes with the new tests -- it could be that the Spark context is only partially created. I can bring it back.
Another issue is that the CI is failing -- I probably need some help there.

@kira-lin (Collaborator) commented:

@pang-wu Look at raydp.yml:108 -- that test is run explicitly there. The CI failure may be because no tests are selected.

@pang-wu force-pushed the pang/exec-affinity2 branch from 13aa306 to 4e1ac6b on July 20, 2023 05:27
@pang-wu force-pushed the pang/exec-affinity2 branch from 4e1ac6b to e194fcd on July 20, 2023 17:27
@pang-wu (Contributor, Author) commented Jul 20, 2023

@kira-lin Fixed.

@kira-lin merged commit 9a77b96 into oap-project:master on Jul 25, 2023
@kira-lin (Collaborator) commented:

@pang-wu thanks

kira-lin pushed a commit that referenced this pull request Jul 28, 2023
* Implement Spark executor node affinity using custom resource

* reduce test flakiness

* Update document