Replace state.train_dataloader and state.eval_dataloader with just state.dataloader #363

Closed · 1 task
ravi-mosaicml opened this issue Feb 3, 2022 · 0 comments · Fixed by #832
Labels
enhancement: New (engineering) enhancements, such as features or API changes.

Comments

ravi-mosaicml (Contributor) commented Feb 3, 2022

#329 will result in both state.train_dataloader and state.eval_dataloader being on the State. However, algorithms and callbacks should only be interested in the currently active dataloader.

For example, callbacks should not touch the training dataloader directly, since evaluation could be running mid-epoch. For callbacks that simply want to run on every batch (like TQDM), having two fields makes the logic more complicated, since they have to check which dataloader is in use.

Instead, state.train_dataloader and state.eval_dataloader can be merged into one field, state.dataloader. The trainer would be responsible for setting this field to whichever dataloader is actively being iterated.

  • Upon init, state.dataloader should be set to the training dataloader, so dataloader augmentations can be applied.
  • Before Event.EVAL_START, the trainer should set state.dataloader to the dataloader of the evaluator being evaluated. After Event.EVAL_END, the trainer should revert state.dataloader back to the training dataloader (i.e. the value before swapping it at Event.EVAL_START). A sketch of this swap follows the list.
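A minimal sketch of the proposed swap, under the assumption of a helper around the trainer. Apart from `state.dataloader` and the `Event` names, the attribute and method names here (`trainer.engine.run_event`, `evaluator.dataloader`, `run_evaluator`) are illustrative, not a final API:

```python
from composer.core import Event

def run_evaluator(trainer, evaluator):
    """Illustrative only: how the trainer could swap ``state.dataloader`` around evaluation."""
    state = trainer.state
    train_dataloader = state.dataloader          # remember the active (training) dataloader
    state.dataloader = evaluator.dataloader      # swap in the evaluator's dataloader
    trainer.engine.run_event(Event.EVAL_START)   # callbacks/algorithms now see the eval dataloader
    try:
        for batch in state.dataloader:
            ...                                  # forward pass, metric updates, etc.
    finally:
        trainer.engine.run_event(Event.EVAL_END)
        state.dataloader = train_dataloader      # restore the training dataloader
```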

TODO:

ravi-mosaicml added the Needs Design label (Further design is required. Do not start implementation until design questions are resolved.) on Feb 3, 2022
ravi-mosaicml added this to the Backlog milestone on Feb 15, 2022
ravi-mosaicml added the enhancement label on Feb 28, 2022
ravi-mosaicml modified the milestones: Backlog, v0.5 on Feb 28, 2022
ravi-mosaicml added a commit that referenced this issue Mar 25, 2022
…ps_per_epoch`.

1. Made the `state.dataloader` optional, since it will not be provided on `__init__` as part of #40.
2. Binding the active dataloader to the state on `Event.FIT_START`, and switching the dataloader to each evaluation dataloader before `Event.EVAL_START`. Restoring the previous (training) dataloader after `Event.EVAL_END`.
3. Moved `Event.EVAL_START` and `Event.EVAL_END` to run for each evaluator, instead of once for all evaluators. With #40, eval() will take in a dataloader, which would then require `Event.EVAL_START` and `Event.EVAL_END` to run per dataloader. This change also permits algorithms that wish to modify (each) evaluation dataloader.
4. Moved scaling of the LR schedulers to `Trainer.fit()` before `Event.FIT_START` fires. Schedulers will be passed in on `Trainer.fit()` as part of #40.
5. Removed `steps_per_epoch` from the state. Instead, algorithms and callbacks can read `len(state.dataloader)` directly (see the sketch below). While this change means schedulers will no longer be accurate when using `train_subset_num_batches`, that flag should only be used for performance measurements, so it is not necessary that SSR behaves correctly for performance runs. Added a warning for the `train_subset_num_batches` field.

Implements the first part of #40.
Closes #363.
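As an illustration of point 5, a callback that previously read `state.steps_per_epoch` could instead query the active dataloader. The callback name below is made up, and the exact import paths and logger type vary by Composer version:

```python
from composer.core import Callback, State
from composer.loggers import Logger

class EpochProgress(Callback):
    """Hypothetical callback: reports the size of the active dataloader each batch."""

    def batch_end(self, state: State, logger: Logger) -> None:
        # Replaces the removed state.steps_per_epoch. Note that during evaluation,
        # state.dataloader is the *evaluation* dataloader, not the training one.
        batches_per_epoch = len(state.dataloader)
        print(f"active dataloader has {batches_per_epoch} batches")
```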
ravi-mosaicml removed the Needs Design label on Mar 30, 2022
ravi-mosaicml added a commit that referenced this issue Apr 22, 2022
…ved evaluators from the state. (#832)

1. Made the `state.dataloader` and `state.max_duration` optional, since they may not be provided on `__init__` as part of #40.
2. Bound `dataloader_len` to the state, so algorithms can know how many batches to expect per epoch.
3. Added the `dataloader_label` to the state, so algorithms know which dataloader is currently running (see the sketch below).
4. Binding the active dataloader to the state on `Event.FIT_START`, and switching the dataloader to each evaluation dataloader before `Event.EVAL_START`. Restoring the previous (training) dataloader after `Event.EVAL_END`.
5. Moved `Event.EVAL_START` and `Event.EVAL_END` to run for each evaluator, instead of once for all evaluators. With #40, eval() will take in a dataloader, which would then call `Event.EVAL_START` and `Event.EVAL_END` for each dataloader. This change also permits algorithms that wish to modify (each) evaluation dataloader.
6. Moved scaling of the LR schedulers to `Trainer.fit()` before `Event.FIT_START` fires. Schedulers will be passed in on `Trainer.fit()` as part of #40.
7. Removed the evaluators from the state. Addresses part of #329. The user need not set state.evaluators; instead, the trainer keeps track of the evaluators.
8. Removed the `precision_context` from state. Instead, added a helper static function to precision.py.

Implements the first part of #40.
Closes #363.
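For illustration of points 2 and 3, an algorithm could use `state.dataloader_label` to act only while the training dataloader is active. The class below is purely hypothetical, and the `"train"` label value is an assumption:

```python
from composer.core import Algorithm, Event, State
from composer.loggers import Logger

class TrainOnlyCheck(Algorithm):
    """Hypothetical algorithm: runs only while the training dataloader is active."""

    def match(self, event: Event, state: State) -> bool:
        # dataloader_label distinguishes the training dataloader (assumed "train")
        # from the label of whichever evaluator is currently running.
        return event == Event.FIT_START and state.dataloader_label == "train"

    def apply(self, event: Event, state: State, logger: Logger) -> None:
        # dataloader_len reports how many batches the active dataloader yields per epoch.
        print(f"training dataloader yields {state.dataloader_len} batches per epoch")
```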