Official pytorch implementation for CVPR2022 paper "Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training"

zhfeing/Bootstrapping-ViTs-pytorch

Bootstrapping ViTs

Towards liberating vision Transformers from pre-training.

Official code for the paper "Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training".

Authors: Haofei Zhang, Jiarui Duan, Mengqi Xue, Jie Song, Li Sun, Mingli Song

Results (Top-1 Accuracy)

1. CIFAR

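| Model | Method | CIFAR-10 | CIFAR-100 |
|---|---|---|---|
| CNNs | EfficientNet-B2 | 94.14 | 75.55 |
| CNNs | ResNet50 | 94.92 | 77.57 |
| CNNs | Agent-S | 94.18 | 74.62 |
| CNNs | Agent-B | 94.83 | 74.78 |
| ViTs | ViT-S | 87.32 | 61.25 |
| ViTs | ViT-S-SAM | 87.77 | 62.60 |
| ViTs | ViT-S-Sparse | 87.43 | 62.29 |
| ViTs | ViT-B | 79.24 | 53.07 |
| ViTs | ViT-B-SAM | 86.57 | 58.18 |
| ViTs | ViT-B-Sparse | 83.87 | 57.22 |
| Pre-trained ViTs | ViT-S | 95.70 | 80.91 |
| Pre-trained ViTs | ViT-B | 97.17 | 84.95 |
| Ours Joint | Agent-S | 94.90 | 74.06 |
| Ours Joint | ViT-S | 95.14 | 76.19 |
| Ours Joint | Agent-B | 95.06 | 76.57 |
| Ours Joint | ViT-B | 95.00 | 77.83 |
| Ours Shared | Agent-S | 93.22 | 74.06 |
| Ours Shared | ViT-S | 93.72 | 75.50 |
| Ours Shared | Agent-B | 92.66 | 74.11 |
| Ours Shared | ViT-B | 93.34 | 75.71 |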

2. ImageNet

| Method | 5% images | 10% images | 50% images |
|---|---|---|---|
| ResNet50 | 35.43 | 50.86 | 70.05 |
| Agent-B | 35.28 | 47.46 | 68.13 |
| ViT-B | 16.60 | 28.11 | 63.40 |
| ViT-B-SAM | 16.67 | 28.66 | 64.37 |
| ViT-B-Sparse | 10.39 | 28.92 | 66.01 |
| Ours-Joint | 36.01 | 49.73 | 71.36 |
| Ours-Shared | 33.06 | 45.75 | 66.48 |

Quick Start

1. Prepare dataset

  • CIFAR: download the CIFAR dataset to the folder ~/datasets/cifar (you may specify a different path in the configuration files).
  • ImageNet: download the ImageNet dataset to the folder ~/datasets/ILSVRC2012 and pre-process it with this script.
  • We also support other datasets such as CUB200, Sketches, Stanford Cars, and TinyImageNet.
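
As a rough illustration of the CIFAR step, the sketch below resolves the default folders named above and fetches CIFAR-100 through torchvision. The helper names `dataset_root` and `download_cifar100` are hypothetical and not part of this repo, and it is an assumption that the training code reads the on-disk layout that torchvision produces; check the configuration files before relying on it.

```python
from pathlib import Path

# Default roots named in this README; the configuration files may override them.
DEFAULT_ROOTS = {
    "cifar": "~/datasets/cifar",
    "imagenet": "~/datasets/ILSVRC2012",
}


def dataset_root(name: str) -> Path:
    """Return the expanded default folder for a dataset key (hypothetical helper)."""
    return Path(DEFAULT_ROOTS[name]).expanduser()


def download_cifar100(root: Path) -> None:
    """Fetch both CIFAR-100 splits with torchvision.

    Assumption: the repo can read torchvision's extracted layout under `root`.
    """
    # Imported lazily so dataset_root() works even without torchvision installed.
    from torchvision.datasets import CIFAR100

    root.mkdir(parents=True, exist_ok=True)
    for train in (True, False):
        CIFAR100(root=str(root), train=train, download=True)
```

Calling `download_cifar100(dataset_root("cifar"))` would place the files under ~/datasets/cifar. ImageNet cannot be fetched this way and still needs the manual download and pre-processing described above.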

2. Prepare cv-lib-PyTorch

Our code requires cv-lib-PyTorch. Download that repo and check out the tag bootstrapping_vits.

cv-lib-PyTorch is an open source repo currently maintained by me.

3. Requirements

  • torch>=1.10.2
  • torchvision>=0.11.3
  • tqdm
  • timm
  • tensorboard
  • scipy
  • PyYAML
  • pandas
  • numpy
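
Before launching a run, it can help to confirm that the environment satisfies the list above. The following is a minimal sketch using only the standard library; `parse_version` and `check_requirements` are illustrative names rather than part of this repo, and the version parsing is deliberately simple (it compares the first three numeric components and ignores local suffixes such as `+cu113`).

```python
import re
from importlib import metadata

# Minimum versions from the list above; None means any installed version is accepted.
REQUIREMENTS = {
    "torch": "1.10.2",
    "torchvision": "0.11.3",
    "tqdm": None,
    "timm": None,
    "tensorboard": None,
    "scipy": None,
    "PyYAML": None,
    "pandas": None,
    "numpy": None,
}


def parse_version(text: str) -> tuple:
    """Turn '1.10.2+cu113' into (1, 10, 2), dropping any local build suffix."""
    return tuple(int(p) for p in re.findall(r"\d+", text.split("+")[0])[:3])


def check_requirements(reqs=REQUIREMENTS) -> list:
    """Return a list of problems; an empty list means every requirement is met."""
    problems = []
    for name, minimum in reqs.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if minimum and parse_version(installed) < parse_version(minimum):
            problems.append(f"{name}: {installed} < {minimum}")
    return problems
```

Printing the result of `check_requirements()` lists any missing or outdated packages.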

4. Train from scratch

The config directory provides several training configurations, including ones for CIFAR100 and ImageNet-10%. The following script trains agent-small from scratch on CIFAR100.

To train with the SAM optimizer, set the --worker option to sam_train_worker.

export PYTHONPATH=/path/to/cv-lib-PyTorch
export CUDA_VISIBLE_DEVICES=0,1

port=9872
python dist_engine.py \
    --num-nodes 1 \
    --rank 0 \
    --master-url tcp://localhost:${port} \
    --backend nccl \
    --multiprocessing \
    --file-name-cfg cls \
    --cfg-filepath config/cifar100/cnn/agent-small.yaml \
    --log-dir run/cifar100/cnn/agent-small \
    --worker worker
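
The same invocation can be assembled programmatically, which makes it easier to switch workers or ports. The sketch below uses only the flags that appear in the scripts in this README; `build_train_command` is a hypothetical helper, and whether the SAM worker also needs a different configuration file is not stated here, so the config path in the example is an assumption.

```python
import shlex


def build_train_command(cfg_filepath, log_dir, file_name_cfg="cls",
                        worker="worker", port=9872, use_amp=False):
    """Build the dist_engine.py argument list for a single-node run."""
    cmd = [
        "python", "dist_engine.py",
        "--num-nodes", "1",
        "--rank", "0",
        "--master-url", f"tcp://localhost:{port}",
        "--backend", "nccl",
        "--multiprocessing",
        "--file-name-cfg", file_name_cfg,
        "--cfg-filepath", cfg_filepath,
        "--log-dir", log_dir,
    ]
    if use_amp:
        cmd.append("--use-amp")
    cmd += ["--worker", worker]
    return cmd


# Same run as the script above, but with the SAM worker selected.
sam_cmd = build_train_command(
    "config/cifar100/cnn/agent-small.yaml",  # assumed to be usable with SAM
    "run/cifar100/cnn/agent-small",
    worker="sam_train_worker",
)
print(shlex.join(sam_cmd))
```

The returned list can be passed to `subprocess.run` once PYTHONPATH and CUDA_VISIBLE_DEVICES are set as in the script.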

5. Ours Joint

export PYTHONPATH=/path/to/project/cv-lib-PyTorch
export CUDA_VISIBLE_DEVICES=0,1

port=9873
python dist_engine.py \
    --num-nodes 1 \
    --rank 0 \
    --master-url tcp://localhost:${port} \
    --backend nccl \
    --multiprocessing \
    --file-name-cfg joint \
    --cfg-filepath config/cifar100/joint/agent-small-vit-small.yaml \
    --log-dir run/cifar100/joint/agent-small-vit-small \
    --use-amp \
    --worker mutual_worker

6. Ours Shared

export PYTHONPATH=/path/to/project/cv-lib-PyTorch
export CUDA_VISIBLE_DEVICES=0,1

port=9873
python dist_engine.py \
    --num-nodes 1 \
    --rank 0 \
    --master-url tcp://localhost:${port} \
    --backend nccl \
    --multiprocessing \
    --file-name-cfg shared \
    --cfg-filepath config/cifar100/shared/agent-base-res_like-vit-base.yaml \
    --log-dir run/cifar100/shared/agent-base-res_like-vit-base \
    --use-amp \
    --worker mutual_worker

After training, the accuracy of the final epoch is reported instead of the best one.
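
To make the distinction concrete, the sketch below contrasts the final-epoch accuracy with the best-epoch accuracy for a per-epoch history. The helper `final_and_best` is illustrative only, and the history values are made up; they are not results from this repo.

```python
def final_and_best(acc_per_epoch):
    """Return (final accuracy, best accuracy, 0-based index of the best epoch)."""
    final = acc_per_epoch[-1]
    best_epoch = max(range(len(acc_per_epoch)), key=acc_per_epoch.__getitem__)
    return final, acc_per_epoch[best_epoch], best_epoch


# Made-up validation accuracies, one value per epoch.
history = [70.1, 74.8, 75.3, 75.0]
final, best, best_epoch = final_and_best(history)
print(f"final={final}, best={best} (epoch {best_epoch})")
```

Under the convention stated above, the first value (the final epoch) is the one that gets reported.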

Citation

If you find this work useful for your research, please cite our paper:

@article{zhang2021bootstrapping,
  title={Bootstrapping ViTs: Towards Liberating Vision Transformers from Pre-training},
  author={Zhang, Haofei and Duan, Jiarui and Xue, Mengqi and Song, Jie and Sun, Li and Song, Mingli},
  journal={arXiv preprint arXiv:2112.03552},
  year={2021}
}
