ExperimentHparams class; Set state.train_dataloader #966
Conversation
…ps_per_epoch`.
1. Made `state.dataloader` optional, since it will not be provided on `__init__` as part of mosaicml#40.
2. Bound the active dataloader to the state on `Event.FIT_START`, switching it to each evaluation dataloader before `Event.EVAL_START` and restoring the previous (training) dataloader after `Event.EVAL_END`.
3. Moved `Event.EVAL_START` and `Event.EVAL_END` to run for each evaluator, instead of once for all evaluators. With mosaicml#40, `eval()` will take in a dataloader, which would then require `Event.EVAL_START` and `Event.EVAL_END` to fire for each call. This change also permits algorithms that wish to modify each evaluation dataloader.
4. Moved scaling of the LR schedulers to `Trainer.fit()`, before `Event.FIT_START` fires. Schedulers will be passed in on `Trainer.fit()` as part of mosaicml#40.
5. Removed `steps_per_epoch` from the state. Instead, algorithms and callbacks can read `len(state.dataloader)` directly (see the sketch after this list). While this change makes schedulers inaccurate when using `train_subset_num_batches`, that flag should only be used for performance measurements, so it is not necessary that SSR behaves correctly for performance runs. Added a warning for the `train_subset_num_batches` field.

Implements the first part of mosaicml#40. Closes mosaicml#363.
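For illustration, a minimal sketch of a callback that reads `len(state.dataloader)` in place of the removed `steps_per_epoch` attribute. The class name and print call are hypothetical, and the import paths are assumed to match composer at the time of this PR.

```python
from composer.core import Callback, State


class DataloaderSizeLogger(Callback):
    """Hypothetical callback: reads the active dataloader length from the
    state instead of the removed ``state.steps_per_epoch`` attribute."""

    def fit_start(self, state: State, logger) -> None:
        # state.dataloader is bound by the trainer on Event.FIT_START,
        # so its length is available from this point onward.
        batches_per_epoch = len(state.dataloader)
        print(f"training with {batches_per_epoch} batches per epoch")
```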
It can be useful for algorithms and callbacks to know which dataloader is active, so the `dataloader_label` was added to the state. Removed `evaluators` from the state, as nothing uses it anymore.
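An equally hypothetical sketch of an algorithm that uses `state.dataloader_label` to act only on evaluation dataloaders, now that `Event.EVAL_START` fires once per evaluator. The `"train"` label and the class itself are assumptions for illustration.

```python
from composer.core import Algorithm, Event, State


class EvalDataloaderTweak(Algorithm):
    """Hypothetical algorithm: inspects ``state.dataloader_label`` to decide
    whether the currently bound dataloader is a training or evaluation one."""

    def match(self, event: Event, state: State) -> bool:
        # EVAL_START now fires per evaluator, with that evaluator's
        # dataloader (and its label) already bound to the state.
        return event == Event.EVAL_START and state.dataloader_label != "train"

    def apply(self, event: Event, state: State, logger) -> None:
        print(f"adjusting evaluation dataloader: {state.dataloader_label}")
```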
…()`. Preferable to keep variables on the state object rather than as trainer members, where appropriate. Before, `state.schedulers` was empty after `__init__()` but before `fit()`; now it contains the compiled composer schedulers or the original PyTorch schedulers. Restored optimizers on `Event.INIT`; it would be a bigger issue to rewrite algorithms to not depend on optimizers at init.
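As a rough sketch of what "compiling" means here (the helper name and signatures are simplifications, not the trainer's actual implementation): a composer-style scheduler is treated as a callable of the training state that returns a learning-rate multiplier, and compiling wraps it in a stock PyTorch scheduler so both kinds can live side by side in `state.schedulers`.

```python
from torch.optim.lr_scheduler import LambdaLR


def compile_scheduler(composer_scheduler, state, optimizer):
    # LambdaLR re-evaluates the multiplier on every scheduler.step();
    # the wrapped lambda ignores the step counter and reads the current
    # training state instead, which is how a composer-style scheduler works.
    return LambdaLR(optimizer, lr_lambda=lambda _step: composer_scheduler(state))
```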
* Removed `precision_context` from state.
* Switched `train_subset_num_batches` and `eval_subset_num_batches` to use `-1` as the default value instead of `None`.
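A small sketch of the `-1` sentinel convention (the helper name is hypothetical):

```python
def resolve_subset_num_batches(dataloader, subset_num_batches: int = -1) -> int:
    """-1 means "iterate over every batch in the dataloader", replacing the
    previous None default."""
    return len(dataloader) if subset_num_batches == -1 else subset_num_batches
```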
This is a non-breaking change; all existing trainer hparams will work as-is.
improves the codebase - approved!
Added the `ExperimentHparams` class. This class describes how to run a training job that may have multiple calls to `Trainer.fit` and/or `Trainer.eval`. Specifically, `ExperimentHparams.initialize_object()` returns a `(Trainer, List[FitKwargs], List[EvalKwargs])` tuple, which the user's entrypoint can then consume. This class does not automatically train the model, nor does it include an entrypoint.
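For illustration, a user entrypoint might consume the returned tuple roughly like this (pairing each fit call with an eval call is an assumption; an actual experiment can interleave them however it likes):

```python
# experiment_hparams is an ExperimentHparams instance, e.g. loaded from YAML.
trainer, fit_kwargs_list, eval_kwargs_list = experiment_hparams.initialize_object()

for fit_kwargs, eval_kwargs in zip(fit_kwargs_list, eval_kwargs_list):
    trainer.fit(**fit_kwargs)    # each FitKwargs maps onto Trainer.fit's signature
    trainer.eval(**eval_kwargs)  # and each EvalKwargs onto Trainer.eval's
```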
Added `FitKwargs` and `EvalKwargs`, along with test cases to ensure they stay in sync with the Trainer signature (see the sketch below).
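A sketch of what such a sync test could look like, assuming `FitKwargs` is a `TypedDict`-like class whose keys mirror the keyword parameters of `Trainer.fit()`; the import path for `FitKwargs` is a guess.

```python
import inspect
from typing import get_type_hints

from composer.trainer import Trainer
from composer.trainer.trainer_hparams import FitKwargs  # import path is an assumption


def test_fit_kwargs_matches_trainer_fit_signature():
    fit_params = set(inspect.signature(Trainer.fit).parameters) - {"self"}
    fit_kwargs_keys = set(get_type_hints(FitKwargs))
    # Every key FitKwargs declares must be accepted by Trainer.fit().
    assert fit_kwargs_keys <= fit_params
```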
`Trainer.fit()` was affected by #948, which removed the setting of `State.train_dataloader`. Added back the lines to correctly set the train dataloader.