Updated the DataSpec for the timing abstraction (#146), parts 3 and 4 (#178)
Conversation
For the timing abstraction (#146), the `DataloaderSpec` needed two additional functions: `get_num_samples_in_batch` and `get_num_tokens_in_batch`. It was getting messy to pass function pointers around in a named tuple, so `DataloaderSpec` was converted from a NamedTuple into a regular class called `DataSpec`. Custom datasets can inherit the base `DataSpec` class and override functionality as needed.

Moved the `DataSpec` class to `composer.core`, since the `DataSpec` is now bound directly to the state. #120 will also need this change.

Renamed `train_dataloader` and `eval_dataloader` in the trainer and state to `train_data` and `eval_data`, since they encompass more than the dataloader.

This PR implements parts 3 and 4 of the timing abstraction (#146). The implementation differs from the GitHub issue by adding `num_tokens`, `num_samples`, `get_batch_size`, and `get_num_tokens` to the new `DataSpec` rather than to the PyTorch dataset class.
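As a rough illustration of the change (a minimal sketch based on the description above, not the exact composer API — the attribute names, defaults, and fallback logic here are assumptions), the converted class might look like:

```python
from typing import Any, Iterable, Optional


class DataSpec:
    """Sketch of the DataSpec described above: a regular class (rather than
    a NamedTuple) wrapping a dataloader plus sample/token accounting."""

    def __init__(self, dataloader: Iterable, num_samples: Optional[int] = None,
                 num_tokens: Optional[int] = None) -> None:
        self.dataloader = dataloader
        self.num_samples = num_samples  # total samples in the dataset, if known
        self.num_tokens = num_tokens    # total tokens in the dataset, if known

    def get_num_samples_in_batch(self, batch: Any) -> int:
        # Default heuristic: treat the first dimension of the batch (or of
        # its first element, for tuple batches) as the batch size.
        if hasattr(batch, "shape"):
            return batch.shape[0]
        return len(batch[0])

    def get_num_tokens_in_batch(self, batch: Any) -> int:
        # Non-NLP datasets have no meaningful token count; NLP datasets
        # would override this to return the actual token count per batch.
        return 0
```

Because `DataSpec` is a regular class rather than a NamedTuple, a custom dataset can simply subclass it — for example, a language-modeling dataset could override `get_num_tokens_in_batch` to count non-padding tokens.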
Made a few suggestions -- I know this PR was just to update the `DataSpec`, but I think we should take this opportunity to remove `DataSpec` completely for better usability.
As discussed offline, we will leave `DataSpec` for now and keep it attached to the state; a later PR will remove it. In the meantime, a `DataSpec` can be initialized from a dictionary, which serves as the kwargs for the class constructor.
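A rough sketch of the dictionary form, reusing the hypothetical `DataSpec` above (the trainer-side handling shown here is an assumption, not the actual trainer code):

```python
# Assumes the DataSpec sketch above; `train_dataloader` is any iterable of batches.
train_data = {"dataloader": train_dataloader, "num_samples": 50_000}

# Hypothetical trainer-side handling: a dict is unpacked as constructor
# kwargs, while a DataSpec instance is used as-is.
if isinstance(train_data, dict):
    train_data = DataSpec(**train_data)
```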
lgtm