[ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by Haoyu Lu, Yuqi Huo, Guoxing Yang, Zhiwu Lu, Wei Zhan, Masayoshi Tomizuka, Mingyu Ding.

UniAdapter

Getting Started

  • Python 3, PyTorch>=1.8.0, and torchvision>=0.7.0 are required for the current codebase.
  • To install the other dependencies, run:
    pip install -r requirements.txt
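
The version requirements above can be checked before installing the remaining dependencies. The following is a minimal sketch and not part of the repository: the helper names parse_version, meets_minimum, and check_requirements are hypothetical, and the parser reads only the leading numeric components of a version string, so a local suffix such as +cu117 is ignored.

```python
import re
from importlib import metadata

# Minimum versions taken from the requirements listed above.
MINIMUMS = {"torch": "1.8.0", "torchvision": "0.7.0"}


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUMS):
    """Map each package to 'ok', 'too old', or 'missing' in this environment."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.10.0" sorts before "1.8.0", which would wrongly reject a newer PyTorch.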

Image-text Retrieval

  • Download the COCO and Flickr30k datasets from their original websites, and set 'image_root' in configs/retrieval_{dataset}.yaml accordingly.

  • To run parameter-efficient fine-tuning on MSCOCO/Flickr30k (replace each {coco, flickr} placeholder with one of coco or flickr):

    python -m torch.distributed.run --nproc_per_node=8 train_retrieval.py --config ./configs/retrieval_{coco, flickr}.yaml --output_dir output/{coco, flickr}

  • To evaluate UniAdapter on MSCOCO/Flickr30k:

    python -m torch.distributed.run --nproc_per_node=8 train_retrieval.py --config ./configs/retrieval_{coco, flickr}.yaml --output_dir output/{coco, flickr} --evaluate
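
In the commands above, {coco, flickr} stands for a single dataset name and is not shell syntax to be typed literally. The sketch below shows one way to expand it into concrete commands. It is illustrative only: retrieval_command is a hypothetical helper, not something shipped with the repository, and the nproc_per_node parameter simply mirrors the --nproc_per_node=8 value shown above, on the assumption that it should match the number of GPUs available.

```python
import shlex

# The two dataset names that the {coco, flickr} placeholder can take.
DATASETS = ("coco", "flickr")


def retrieval_command(dataset, nproc_per_node=8, evaluate=False):
    """Build the launch command for one dataset as an argument list."""
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}, got {dataset!r}")
    args = [
        "python", "-m", "torch.distributed.run",
        f"--nproc_per_node={nproc_per_node}",
        "train_retrieval.py",
        "--config", f"./configs/retrieval_{dataset}.yaml",
        "--output_dir", f"output/{dataset}",
    ]
    if evaluate:
        # Same command as training, with the evaluation flag appended.
        args.append("--evaluate")
    return args


if __name__ == "__main__":
    for name in DATASETS:
        print(shlex.join(retrieval_command(name)))
        print(shlex.join(retrieval_command(name, evaluate=True)))
```

Returning an argument list rather than a string lets the same value be passed directly to subprocess.run without shell quoting concerns.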

Visual Question Answering

  • Download the VQA v2 and Visual Genome datasets from their original websites, and set 'vqa_root' and 'vg_root' in configs/vqa.yaml.

  • To run parameter-efficient fine-tuning on VQAv2:

    python -m torch.distributed.run --nproc_per_node=8 train_vqa.py --config ./configs/vqa.yaml --output_dir $static_dir

  • To evaluate UniAdapter on VQAv2 (the result file must then be uploaded to the official evaluation server):

    python -m torch.distributed.run --nproc_per_node=8 train_vqa.py --config ./configs/vqa.yaml --output_dir $static_dir --evaluate
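
Setting 'vqa_root' and 'vg_root' can be done by hand in an editor. For scripted setups, the sketch below rewrites a key in the config text using only the standard library. It rests on assumptions that should be checked against the actual file: that each key appears exactly once as a top-level "key: value" line, and that the value can be written as a single-quoted string. The function set_config_value, the SAMPLE text, and the /data/... paths are all hypothetical.

```python
import re


def set_config_value(yaml_text, key, value):
    """Replace the value of a top-level `key: ...` line; other lines are untouched.

    Any trailing comment on the matched line is dropped, and `value` must not
    contain a single quote.
    """
    pattern = re.compile(rf"^{re.escape(key)}:.*$", flags=re.MULTILINE)
    # A callable replacement avoids backslash interpretation in `value`.
    new_text, count = pattern.subn(lambda _m: f"{key}: '{value}'", yaml_text)
    if count != 1:
        raise KeyError(f"expected exactly one top-level {key!r} entry, found {count}")
    return new_text


# Hypothetical stand-in for the contents of configs/vqa.yaml.
SAMPLE = "vqa_root: '/path/to/vqa/'\nvg_root: '/path/to/vg/'\nbatch_size_train: 16\n"

if __name__ == "__main__":
    updated = set_config_value(SAMPLE, "vqa_root", "/data/vqa/")
    updated = set_config_value(updated, "vg_root", "/data/vg/")
    print(updated, end="")
```

Raising an error when the key is missing or duplicated is deliberate: a silent no-op would leave the training script pointing at the placeholder path.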

Video-text Retrieval and VideoQA

  • In progress.

Acknowledgement

Our codebase is built on BLIP and timm. We thank the authors for the nicely organized code!
