
[AIR] Ensure that driver DatasetContext is propagated to the Trainer actor. #29192

Conversation

clarkzinzow (Contributor)

DatasetContext allows setting Datasets config values like the target max block size. Normally these are propagated from the driver to all tasks in the Dataset.

However, when using the AIR trainer, context changes made by the driver don't get propagated to the training dataset preprocessor. I think this is likely because the BatchMapper doesn't save the context, so when we pass the BatchMapper to the worker that actually creates the final Dataset, the driver's context values are lost.
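The shape of the fix is a standard context-propagation pattern: capture the driver's context when the preprocessor is constructed, then restore it on the worker before running the transform. Below is a minimal, self-contained sketch of that pattern; the `DatasetContext` and `BatchMapper` classes here are simplified stand-ins for illustration, not Ray's actual implementations.

```python
import threading


class DatasetContext:
    """Stand-in for Ray's DatasetContext: a per-process singleton of config values."""

    _lock = threading.Lock()
    _current = None

    def __init__(self, target_max_block_size=2048):
        self.target_max_block_size = target_max_block_size

    @classmethod
    def get_current(cls):
        with cls._lock:
            if cls._current is None:
                cls._current = cls()
            return cls._current

    @classmethod
    def _set_current(cls, ctx):
        with cls._lock:
            cls._current = ctx


class BatchMapper:
    def __init__(self, fn):
        self.fn = fn
        # Fix: capture the driver's context at construction time, so it is
        # serialized along with the preprocessor when it ships to the actor.
        self._context = DatasetContext.get_current()

    def transform(self, batch):
        # Restore the captured context before running, so code executing on
        # the worker sees the driver's config values, not process defaults.
        DatasetContext._set_current(self._context)
        return self.fn(batch)


# Driver sets a non-default value and builds the preprocessor.
DatasetContext.get_current().target_max_block_size = 1024
mapper = BatchMapper(lambda batch: [x * 2 for x in batch])

# Simulate a fresh worker process whose context would fall back to defaults.
DatasetContext._set_current(None)
out = mapper.transform([1, 2, 3])

# The worker now sees the driver's value rather than the default 2048.
assert DatasetContext.get_current().target_max_block_size == 1024
```

Without the captured `self._context`, the `transform` call on the fresh "worker" would see the default `target_max_block_size` of 2048, which is the symptom this PR addresses.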

Related issue number

Closes #29160

Checks

  • I've signed off every commit (using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@clarkzinzow clarkzinzow changed the title [Datasets] Ensure that driver DatasetContext is propagated to the Trainer actor. [AIR] Ensure that driver DatasetContext is propagated to the Trainer actor. Oct 7, 2022
@clarkzinzow clarkzinzow force-pushed the air/fix/propagate-driver-context-to-trainer-actor branch from 56d354a to 8c0c89c on October 11, 2022 20:52
@clarkzinzow clarkzinzow requested a review from amogkam October 11, 2022 22:08
@clarkzinzow clarkzinzow merged commit d62373f into ray-project:master Oct 13, 2022
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
…er` actor. (ray-project#29192)

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

Successfully merging this pull request may close these issues.

[AIR] DatasetContext values don't get propagated for BatchMapper preprocessors