Replace `state.train_dataloader` and `state.eval_dataloader` with just `state.dataloader` #363

Labels: enhancement (new (engineering) enhancements, such as features or API changes)
ravi-mosaicml added a commit that referenced this issue on Mar 25, 2022:
…ps_per_epoch`.

1. Made `state.dataloader` optional, since it will not be provided on `__init__` as part of #40.
2. Bound the active dataloader to the state on `Event.FIT_START`, and switched the dataloader to each evaluation dataloader before `Event.EVAL_START`. Restored the previous (training) dataloader after `Event.EVAL_END`.
3. Moved `Event.EVAL_START` and `Event.EVAL_END` to run for each evaluator, instead of once for all evaluators. With #40, eval() will take in a dataloader, which would then require `Event.EVAL_START` and `Event.EVAL_END`. This change also permits algorithms that wish to modify (each) evaluation dataloader.
4. Moved scaling of the LR schedulers to `Trainer.fit()` before `Event.FIT_START` fires. Schedulers will be passed in on `Trainer.fit()` as part of #40.
5. Removed `steps_per_epoch` from the state. Instead, algorithms and callbacks can read `len(state.dataloader)` directly. While this change makes schedulers no longer accurate when using `train_subset_num_batches`, that flag should only be used for performance measurements, so it is not necessary that SSR behaves correctly for performance runs. Added a warning for the `train_subset_num_batches` field.

Implements the first part of #40. Closes #363.
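As a minimal sketch of the swap-and-restore pattern this commit describes (bind the active dataloader on `Event.FIT_START`, point it at each evaluation dataloader around `Event.EVAL_START`/`Event.EVAL_END`, then restore the training dataloader), the snippet below is illustrative only: `SimpleState`, `evaluation_dataloader`, and the commented driver loop are hypothetical stand-ins, not Composer's actual classes or API.

```python
# Hypothetical sketch of the swap-and-restore pattern described above.
# SimpleState, evaluation_dataloader, and the commented-out driver code
# are illustrative stand-ins, not Composer's actual implementation.
from contextlib import contextmanager


class SimpleState:
    """Simplified stand-in for the trainer state."""

    def __init__(self):
        self.dataloader = None  # optional: bound later, e.g. on Event.FIT_START


@contextmanager
def evaluation_dataloader(state, eval_dataloader):
    """Temporarily point state.dataloader at an evaluation dataloader."""
    previous = state.dataloader          # the active (training) dataloader
    state.dataloader = eval_dataloader   # swapped in before Event.EVAL_START
    try:
        yield state
    finally:
        state.dataloader = previous      # restored after Event.EVAL_END


# Sketch of how a trainer loop could use it, with one EVAL_START/EVAL_END
# pair per evaluator:
#
# state.dataloader = train_dataloader              # bound on Event.FIT_START
# for evaluator in evaluators:
#     with evaluation_dataloader(state, evaluator.dataloader):
#         run_event("EVAL_START")
#         evaluate(state)
#         run_event("EVAL_END")
```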
ravi-mosaicml added a commit that referenced this issue on Apr 22, 2022:
…ved evaluators from the state. (#832)

1. Made `state.dataloader` and `state.max_duration` optional, since they may not be provided on `__init__` as part of #40.
2. Bound `dataloader_len` to the state, so algorithms can know how many batches to expect per epoch.
3. Added the `dataloader_label` to the state, so algorithms know which dataloader is currently running.
4. Bound the active dataloader to the state on `Event.FIT_START`, and switched the dataloader to each evaluation dataloader before `Event.EVAL_START`. Restored the previous (training) dataloader after `Event.EVAL_END`.
5. Moved `Event.EVAL_START` and `Event.EVAL_END` to run for each evaluator, instead of once for all evaluators. With #40, eval() will take in a dataloader, which would then call `Event.EVAL_START` and `Event.EVAL_END` for each dataloader. This change also permits algorithms that wish to modify (each) evaluation dataloader.
6. Moved scaling of the LR schedulers to `Trainer.fit()` before `Event.FIT_START` fires. Schedulers will be passed in on `Trainer.fit()` as part of #40.
7. Removed the evaluators from the state. Addresses part of #329. The user need not set `state.evaluators`; instead, the trainer keeps track of the evaluators.
8. Removed the `precision_context` from the state. Instead, added a helper static function to precision.py.

Implements the first part of #40. Closes #363.
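To illustrate how a callback might consume the fields this commit adds (`dataloader_label`, and `dataloader_len` in place of the removed `steps_per_epoch`), here is a rough sketch; the `ProgressCallback` class, its `batch_end(state)` hook, and the assumed `"train"` label are simplified assumptions, not Composer's exact callback interface.

```python
# Hypothetical sketch: a callback keyed off the single active dataloader.
# The class, the batch_end(state) hook, and the "train" label are simplified
# assumptions for illustration, not Composer's exact Callback interface.


class ProgressCallback:
    def __init__(self):
        self.batches_seen = 0

    def batch_end(self, state):
        self.batches_seen += 1
        if state.dataloader_label == "train":
            # dataloader_len stands in for the removed steps_per_epoch field.
            print(f"train: batch {self.batches_seen} of {state.dataloader_len}")
        else:
            # Any other label identifies the evaluator currently running.
            print(f"eval ({state.dataloader_label}): batch {self.batches_seen}")
```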
#329 will result in `state.train_dataloader` and `state.eval_dataloader` on the State. However, algorithms and callbacks should only be interested in the currently active dataloader. For example, callbacks should not be touching the training dataloader, as evaluation could be running mid-epoch. For callbacks that simply want to run on every batch (like TQDM), it makes the logic more complicated, since they have to check which dataloader to use.

Instead, `state.train_dataloader` and `state.eval_dataloader` can be merged into one field, called `state.dataloader`. The trainer can be responsible for setting this to the dataloader that is being actively iterated. During training, `state.dataloader` should be set to the training dataloader, so dataloader augmentations can be applied. Before `Event.EVAL_START`, the trainer should set `state.dataloader` to the dataloader of the evaluator being evaluated. After `Event.EVAL_END`, the trainer should revert `state.dataloader` back to the training dataloader (i.e. the value before swapping it at `Event.EVAL_START`).

TODO:
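To make the motivation concrete, here is a before/after sketch of a per-batch callback. The class names, the `batch_end` hook, and the `is_evaluating` flag are hypothetical; the sketch only assumes that, under the proposal, the trainer keeps `state.dataloader` pointing at whichever dataloader is actually being iterated.

```python
# Hypothetical before/after sketch of a per-batch callback (names and hooks
# are illustrative, not Composer's actual API).


class TqdmLikeCallbackBefore:
    # Before: with separate fields, the callback must work out which
    # dataloader is active (here via an invented is_evaluating flag).
    def batch_end(self, state, is_evaluating):
        dataloader = state.eval_dataloader if is_evaluating else state.train_dataloader
        print(f"batch out of {len(dataloader)}")


class TqdmLikeCallbackAfter:
    # After: with the merged field, state.dataloader is always the actively
    # iterated loader, so no branching is needed.
    def batch_end(self, state):
        print(f"batch out of {len(state.dataloader)}")
```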