-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core] Introduce spill_on_unavailable option for soft NodeAffinitySchedulingStrategy #34224
Conversation
…vailable (ray-project#33856)" This reverts commit adb6775.
""" | ||
|
||
def __init__(self, node_id: str, soft: bool): | ||
def __init__(self, node_id: str, soft: bool, _spill_on_unavailable: bool = False): | ||
# This will be removed once we standardize on node id being hex string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a TODO to promote this to public API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #34283 to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, let's also verify this fixes the release test
@@ -166,6 +166,7 @@ def __call__(self, args): | |||
args["scheduling_strategy"] = NodeAffinitySchedulingStrategy( | |||
self.locs[self.i], | |||
soft=True, | |||
_spill_on_unavailable=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we don't allow to spill to another node when options.locality_with_output==True
? locality_with_output
is not the default behavior, and when users opt in to enable locality_with_output
, they probably don't want any surprise that task running on an arbitrary node. cc @ericl for opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Locality is always best effort--- so this is fine. And less locality is preferred in order to be work preserving in all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved for picking into 2.4. cc @clarng
Can you test the 100tb shuffle loadtest? |
dataset_shuffle_push_based_random_shuffle_100tb succeeded: https://buildkite.com/ray-project/release-tests-pr/builds/34544#018770ad-310d-4c5f-ba36-1ec2433c0497 |
Failed tests are unrelated. |
…edulingStrategy (#34224) Introduce a private _spill_on_unavailable semantic for soft NodeAffinitySchedulingStrategy. Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
…edulingStrategy (ray-project#34224) Introduce a private _spill_on_unavailable semantic for soft NodeAffinitySchedulingStrategy. Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com> Signed-off-by: elliottower <elliot@elliottower.com>
…edulingStrategy (ray-project#34224) Introduce a private _spill_on_unavailable semantic for soft NodeAffinitySchedulingStrategy. Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com> Signed-off-by: Jack He <jackhe2345@gmail.com>
Why are these changes needed?
Introduce a private
_spill_on_unavailable
semantic for soft NodeAffinitySchedulingStrategy. In 2.4 this will only be used by Dataset and we will figure out how to properly expose this as a public API in 2.5.Related issue number
Closes #34170
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.