ImaginaryNet: Learning Object Detectors without Real Images and Annotations

kodenii/ImaginaryNet

License: MIT

This repository is for the ICLR 2023 paper: ImaginaryNet: Learning Object Detectors without Real Images and Annotations

If you use any source code or ideas from this repository in your work, please cite the following paper.

@article{ni2022imaginarynet,
  title={ImaginaryNet: Learning Object Detectors without Real Images and Annotations},
  author={Ni, Minheng and Huang, Zitong and Feng, Kailai and Zuo, Wangmeng},
  journal={arXiv preprint arXiv:2210.06886},
  year={2022}
}

If you have any questions, feel free to email me.

Abstract

Without the demand of training in reality, humans are capable of detecting a new category of object simply based on a language description of its visual characteristics. Empowering deep learning with this ability would enable a neural network to handle complex vision tasks, e.g., object detection, without collecting and annotating real images. To this end, this paper introduces a novel and challenging learning paradigm, Imaginary-Supervised Object Detection (ISOD), where neither real images nor manual annotations are allowed for training object detectors. To resolve this challenge, we propose ImaginaryNet, a framework that synthesizes images by combining a pretrained language model and a text-to-image synthesis model. Given a class label, the language model is used to generate a full description of a scene with a target object, and the text-to-image model is deployed to generate a photo-realistic image. With the synthesized images and class labels, weakly supervised object detection can then be leveraged to accomplish ISOD. By gradually introducing real images and manual annotations, ImaginaryNet can collaborate with other supervision settings to further boost detection performance. Experiments show that ImaginaryNet can (i) obtain about 75% of the performance in ISOD compared with the weakly supervised counterpart of the same backbone trained on real data, and (ii) significantly improve the baseline while achieving state-of-the-art or comparable performance when incorporated into other supervision settings.

Illustration of Framework

Preparation

Run the following commands to set up the environment.

conda env create -f environment.yaml

conda activate imaginarynet

pip install --upgrade jax==0.3.25 jaxlib==0.3.25+cuda11.cudnn82 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

conda install -c conda-forge cudatoolkit-dev

Pipeline Usage

This pipeline provides the core function of ImaginaryNet: generating images from class labels.
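The flow described in the abstract (class label, then a language-model description, then text-to-image synthesis, then optional filtering) can be sketched roughly as follows. This is a minimal illustration, not the code in imaginarynet.py: the names `imagine`, `Sample`, `generate_image`, `extend_prompt`, `score`, and `max_attempts`, as well as the fallback prompt template, are all hypothetical, and the models are passed in as plain callables so the sketch runs without any backend installed.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sample:
    label: str     # class label, kept as the image-level annotation
    prompt: str    # text that was sent to the text-to-image model
    image: object  # whatever the image backend returns


def imagine(
    classes: List[str],
    num: int,
    generate_image: Callable[[str], object],
    extend_prompt: Optional[Callable[[str], str]] = None,
    score: Optional[Callable[[object, str], float]] = None,
    threshold: float = 0.0,
    seed: int = 0,
    max_attempts: Optional[int] = None,
) -> List[Sample]:
    """Generate up to `num` (label, prompt, image) samples.

    All callables are stand-ins (assumptions) for the real models:
    `extend_prompt` for the language model, `generate_image` for the
    text-to-image backend, and `score` for an image-text filter.
    """
    rng = random.Random(seed)
    # Hypothetical safety cap so a strict filter cannot loop forever.
    limit = max_attempts if max_attempts is not None else 10 * num
    samples: List[Sample] = []
    attempts = 0
    while len(samples) < num and attempts < limit:
        attempts += 1
        label = rng.choice(classes)
        # Assumed fallback template when no language model is used.
        prompt = extend_prompt(label) if extend_prompt else f"a photo of a {label}"
        image = generate_image(prompt)
        # Drop images whose score falls below the threshold.
        if score is not None and score(image, label) < threshold:
            continue
        samples.append(Sample(label, prompt, image))
    return samples
```

Because only the class label is stored with each image, the output matches what a weakly supervised detector needs: images with image-level labels and no bounding boxes.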

Quick Start

python imaginarynet.py --num 10000 --classfile voc.txt --gpt --clip --backend dalle-mini

Parameters Explanation

  • --seed Random seed.
  • --num Number of images to generate.
  • --classfile File listing the initial classes.
  • --outputdir Output directory.
  • --gpt Use GPT to extend the prompt.
  • --clip Use CLIP to filter the generated images.
  • --backend Image generation backend: dalle-mini or stablediffusion.
  • --cpu Run the CLIP filter on the CPU.
  • --threshold Minimum CLIP score for an image to be accepted.
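For readers who want to wrap or script the command line, the flags above could be mirrored with `argparse` along these lines. This is a sketch under assumptions rather than the parser in imaginarynet.py: the flag names and the two backend values come from the list above, but the argument types and every default value here are placeholders, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented flags; types and defaults are assumptions."""
    parser = argparse.ArgumentParser(
        description="Generate images from class labels (illustrative parser)."
    )
    parser.add_argument("--seed", type=int, default=0, help="Random seed.")
    parser.add_argument("--num", type=int, default=10, help="Number of images to generate.")
    parser.add_argument("--classfile", type=str, help="File listing the initial classes.")
    parser.add_argument("--outputdir", type=str, default="output", help="Output directory.")
    # The Quick Start passes --gpt and --clip without values, so they are
    # modelled as boolean switches here.
    parser.add_argument("--gpt", action="store_true", help="Use GPT to extend the prompt.")
    parser.add_argument("--clip", action="store_true", help="Use CLIP to filter images.")
    parser.add_argument(
        "--backend",
        choices=["dalle-mini", "stablediffusion"],
        default="dalle-mini",
        help="Image generation backend.",
    )
    parser.add_argument("--cpu", action="store_true", help="Run the CLIP filter on the CPU.")
    parser.add_argument("--threshold", type=float, default=0.0, help="Minimum accepted CLIP score.")
    return parser


if __name__ == "__main__":
    # Same arguments as the Quick Start command.
    args = build_parser().parse_args(
        ["--num", "10000", "--classfile", "voc.txt", "--gpt", "--clip", "--backend", "dalle-mini"]
    )
    print(args)
```

Restricting `--backend` with `choices` makes a typo in the backend name fail at parse time instead of partway through a long generation run.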

Reproducibility

To support reproducibility, we provide the generated datasets, trained checkpoints, and training logs. Note that regenerating images may not reproduce them exactly, because the backends are updated over time and environments differ. We did not modify the code of the detection backbones; to train them, please refer to their original repositories. To access the original data or experiments, please download our archives.

Generated Images

| Name | Download Link |
| --- | --- |
| 10,000 Imaginary Data | Download |

Saved Checkpoints and Logs

Imaginary-Supervised Object Detection (ISOD)

| Backbone | Imaginary Data | mAP | Checkpoint | Log |
| --- | --- | --- | --- | --- |
| OICR | 5K Imaginary | 35.43 | Download | Download |

Weakly-Supervised Object Detection (WSOD)

| Backbone | Imaginary Data | mAP | Checkpoint | Log |
| --- | --- | --- | --- | --- |
| WSDDN | 5K Imaginary | 39.90 | Download | Download |
| OICR | 5K Imaginary | 51.39 | Download | Download |
| W2N | 5K Imaginary | 65.05 | Download | Download |

Semi-Supervised Object Detection (SSOD)

| Backbone | Real Data | Imaginary Data | mAP | Checkpoint | Log |
| --- | --- | --- | --- | --- | --- |
| Unbiased-Teacher | 5K VOC2007 | 5K Imaginary | 80.36 | Download | Download |
| Unbiased-Teacher | 5K VOC2007 | 10K Imaginary | 80.60 | Download | Download |
| Unbiased-Teacher | 5K VOC2007 + 10K VOC2012 (unlabeled) | 10K Imaginary | 81.60 | Download | Download |

Acknowledgement

We thank Yeli Shen for his contribution to the public code of ImaginaryNet.
