
[AIR] Ensure that driver DatasetContext is propagated to the Trainer actor. #29192

Conversation

clarkzinzow (Contributor)

DatasetContext allows setting Datasets config values like the target max block size. Normally these are propagated from the driver to all tasks in the Dataset.

However, when using the AIR trainer, context changes made by the driver don't get propagated to the training dataset preprocessor. I think this is likely because the BatchMapper doesn't save the context, so when we pass the BatchMapper to the worker that actually creates the final Dataset, the driver's context values are lost.
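The shape of the fix is a standard context-propagation pattern: capture the driver's context when the preprocessor is constructed, then restore it on the worker before running the transform. Below is a minimal, self-contained sketch of that pattern; the `DatasetContext` and `BatchMapper` classes here are simplified stand-ins for illustration, not Ray's actual implementations.

```python
import threading


class DatasetContext:
    """Stand-in for Ray's DatasetContext: a per-process singleton of config values."""

    _lock = threading.Lock()
    _current = None

    def __init__(self, target_max_block_size=2048):
        self.target_max_block_size = target_max_block_size

    @classmethod
    def get_current(cls):
        with cls._lock:
            if cls._current is None:
                cls._current = cls()
            return cls._current

    @classmethod
    def _set_current(cls, ctx):
        with cls._lock:
            cls._current = ctx


class BatchMapper:
    def __init__(self, fn):
        self.fn = fn
        # Fix: capture the driver's context at construction time, so it is
        # serialized along with the preprocessor when it ships to the actor.
        self._context = DatasetContext.get_current()

    def transform(self, batch):
        # Restore the captured context before running, so code executing on
        # the worker sees the driver's config values, not process defaults.
        DatasetContext._set_current(self._context)
        return self.fn(batch)


# Driver sets a non-default value and builds the preprocessor.
DatasetContext.get_current().target_max_block_size = 1024
mapper = BatchMapper(lambda batch: [x * 2 for x in batch])

# Simulate a fresh worker process whose context would fall back to defaults.
DatasetContext._set_current(None)
out = mapper.transform([1, 2, 3])

# The worker now sees the driver's value rather than the default 2048.
assert DatasetContext.get_current().target_max_block_size == 1024
```

Without the captured `self._context`, the `transform` call on the fresh "worker" would see the default `target_max_block_size` of 2048, which is the symptom this PR addresses.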

Related issue number

Closes #29160

Checks

  • I've signed off every commit (using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@clarkzinzow clarkzinzow changed the title [Datasets] Ensure that driver DatasetContext is propagated to the Trainer actor. [AIR] Ensure that driver DatasetContext is propagated to the Trainer actor. Oct 7, 2022
@clarkzinzow clarkzinzow force-pushed the air/fix/propagate-driver-context-to-trainer-actor branch from 56d354a to 8c0c89c on October 11, 2022 20:52
@clarkzinzow clarkzinzow requested a review from amogkam October 11, 2022 22:08
@clarkzinzow clarkzinzow merged commit d62373f into ray-project:master Oct 13, 2022
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
…er` actor. (ray-project#29192)

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

Successfully merging this pull request may close these issues.

[AIR] DatasetContext values don't get propagated for BatchMapper preprocessors