
Timm support #262

Merged: 38 commits merged into dev on Feb 1, 2022
Conversation

A-Jacobson (Contributor):

  • Timm model wrapper with MosaicClassifier
  • Timm Hparams interface
  • Timm resnet50 yaml
  • working W&B run

@hanlint , @Landanjs
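
(For context, a rough sketch of the wrapper pattern listed above; MosaicClassifier's import path and constructor, and the helper name create_timm_classifier, are assumptions for illustration rather than the code in this PR.)

import timm
from composer.models import MosaicClassifier  # assumed import path for the wrapper named in the PR description

def create_timm_classifier(model_name: str, pretrained: bool = False, num_classes: int = 1000):
    # timm.create_model is timm's model factory; wrapping the resulting torch
    # module in MosaicClassifier gives the trainer a standard classifier interface.
    module = timm.create_model(model_name, pretrained=pretrained, num_classes=num_classes)
    return MosaicClassifier(module)

model = create_timm_classifier('resnet50')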

hanlint (Contributor) left a comment:

Overall, this design looks good. We might want a common schema across HuggingFace and TIMM, though. For HuggingFace, our models would then be:

model:
  hf:
    model_name: '<name_here>'

Would that work for you @moinnadeem ?

A-Jacobson (Contributor, Author) commented Jan 21, 2022:

Timm is all image classification on ImageNet, so it was straightforward.

Looking at HF, they have a model factory such as AutoModelForMaskedLM for each task. We can use the same pattern, but we'd probably need a wrapper for each factory. @moinnadeem

So hf may look like:

masked language modeling

model:
  hf-maskedlm:
    model_name: '<name_here>'

question answering

model:
  hf-qa:
    model_name: '<name_here>'

and so on.
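
(A hypothetical sketch of how those per-task keys could dispatch to factories; the registry, builder names, and key strings here are illustrative, not composer's actual implementation.)

from typing import Any, Callable, Dict

import timm
from transformers import AutoModelForMaskedLM, AutoModelForQuestionAnswering

# Illustrative builders keyed by the top-level schema names discussed above.
def build_timm(model_name: str, **kwargs: Any):
    return timm.create_model(model_name, **kwargs)

def build_hf_maskedlm(model_name: str, **kwargs: Any):
    return AutoModelForMaskedLM.from_pretrained(model_name, **kwargs)

def build_hf_qa(model_name: str, **kwargs: Any):
    return AutoModelForQuestionAnswering.from_pretrained(model_name, **kwargs)

MODEL_REGISTRY: Dict[str, Callable[..., Any]] = {
    'timm': build_timm,
    'hf-maskedlm': build_hf_maskedlm,
    'hf-qa': build_hf_qa,
}

def build_model(model_cfg: Dict[str, Dict[str, Any]]):
    # model_cfg mirrors the yaml, e.g. {'hf-maskedlm': {'model_name': 'bert-base-uncased'}}
    (framework, kwargs), = model_cfg.items()
    return MODEL_REGISTRY[framework](**kwargs)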

hanlint (Contributor) commented Jan 21, 2022:

Does model_name always map to a unique model factory type in HF? If so, we could just require hf and model_name, and auto-infer the factory type.

A-Jacobson (Contributor, Author):

I don't think so. I'm seeing "Bert-base" show up in multiple examples. Though, I'm not sure of the differences between tasks. It may just be different weights or a different classifier head.

Landanjs (Contributor) left a comment:

LGTM! 2 comments on docstrings and a question on if we should include a **kwargs argument.

@mosaicml/research-engineering does anyone know if we could have a kwargs-like argument in yahp? Maybe defining a dict hparam kwargs: dict = hp.optional(default={}), then using it in a yaml like: kwargs.image_size: 512?

jbloxham (Contributor):

> LGTM! 2 comments on docstrings and a question on if we should include a **kwargs argument.
>
> @mosaicml/research-engineering does anyone know if we could have a kwargs-like argument in yahp? Maybe defining a dict hparam kwargs: dict = hp.optional(default={}), then using it in a yaml like: kwargs.image_size: 512?

YAHP supports JSON types, at least to some extent. I don't have any experience using it.
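
(For what it's worth, a rough sketch of the kwargs-style hparam under discussion; whether yahp accepts a dict/Optional field like this and round-trips a nested yaml mapping into it is exactly the open question, so treat this as illustrative, not as the class in this PR.)

from dataclasses import dataclass
from typing import Any, Dict, Optional

import yahp as hp


@dataclass
class TimmHparams(hp.Hparams):
    """Illustrative hparams class, not necessarily the one in this PR."""

    model_name: str = hp.required(doc='timm model name, e.g. resnet50')
    pretrained: bool = hp.optional(doc='load pretrained weights', default=False)
    # Catch-all forwarded to timm.create_model; in yaml this could be a nested
    # mapping under `kwargs:` rather than dotted keys like kwargs.image_size.
    kwargs: Optional[Dict[str, Any]] = hp.optional(doc='extra timm kwargs', default=None)

    def initialize_object(self):
        import timm
        return timm.create_model(self.model_name, pretrained=self.pretrained, **(self.kwargs or {}))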

moinnadeem (Contributor):

> I don't think so. I'm seeing "Bert-base" show up in multiple examples. Though, I'm not sure of the differences between tasks. It may just be different weights or a different classifier head.

Yeah, bert-base-uncased is a single set of weights, but doing AutoModelForSequenceClassification.from_pretrained("bert-base-uncased") will create some randomly initialized weights for the classifier head.

In other words, there is a bijection between model names and weights.
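
(For illustration, using the task-specific auto class; the label count is made up.)

from transformers import AutoModelForSequenceClassification

# Same checkpoint, different head: the encoder weights load from bert-base-uncased,
# while the classification head for num_labels classes is freshly (randomly) initialized.
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)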

moinnadeem (Contributor):

> Timm is all image classification on ImageNet, so it was straightforward.
>
> Looking at HF, they have a model factory such as AutoModelForMaskedLM for each task. We can use the same pattern, but we'd probably need a wrapper for each factory. @moinnadeem
>
> So hf may look like:
>
> masked language modeling
>
> model:
>   hf-maskedlm:
>     model_name: '<name_here>'
>
> question answering
>
> model:
>   hf-qa:
>     model_name: '<name_here>'
>
> and so on.

Hm, we have something very similar to this at the moment. I would like to make our current version more into a factory soon (see how we have very thin wrappers around the current HF configs). However, we do need to insert some additional variables, such as config.num_labels in the wrapper.
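
(A hedged sketch of that kind of thin wrapper, injecting config.num_labels before construction; the helper name and its arguments are hypothetical.)

from transformers import AutoConfig, AutoModelForSequenceClassification

def create_hf_classifier(model_name: str, num_labels: int):
    # Build the config first so extra variables like num_labels can be inserted,
    # then hand it to the task-specific factory.
    config = AutoConfig.from_pretrained(model_name, num_labels=num_labels)
    return AutoModelForSequenceClassification.from_pretrained(model_name, config=config)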

A-Jacobson and others added 3 commits January 27, 2022 22:49
Co-authored-by: Landan Seguin <landanjs@gmail.com>
A-Jacobson requested a review from hanlint January 29, 2022 05:39
hanlint (Contributor) left a comment:

Looks good, a few suggested changes. Do you have a w&b run with the TIMM resnet50 to verify convergence?

hanlint added this to the v0.4 milestone Jan 30, 2022
A-Jacobson (Contributor, Author) commented Jan 31, 2022:

> Looks good, a few suggested changes. Do you have a w&b run with the TIMM resnet50 to verify convergence?

https://wandb.ai/mosaic-ml/timm-imagenet/reports/Shared-panel-22-01-30-20-01-86--VmlldzoxNTAzMTky

The resnet50 convergence run reached 76.89 accuracy. The note in the yaml said _quality = '76.51', so it seems we're within the margin of error.

hanlint (Contributor) left a comment:

LGTM

hanlint merged commit fa1b992 into dev Feb 1, 2022
hanlint deleted the timm-support branch February 1, 2022 18:17
A-Jacobson added a commit that referenced this pull request Feb 10, 2022
coryMosaicML pushed a commit to coryMosaicML/composer that referenced this pull request Feb 23, 2022