[CVPR 2024] Code for HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation

zhangce01/HiKER-SGG

[CVPR 2024] HiKER-SGG

arXiv License: MIT

👀Introduction

This repository contains the code for our CVPR 2024 paper HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation. [Paper] [Website]

💡Environment

We tested our codebase with PyTorch 1.12.0 and CUDA 11.6. Please install the PyTorch and CUDA versions that match your hardware.

Then install the remaining required packages by running pip install -r requirements.txt. This includes jupyter, which you need to run the notebooks.
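
To compare a local installation against the tested versions, a small check along the following lines may help. It is only a sketch: the function names are made up for illustration, and an exact version match is not necessarily required for the code to run.

```python
import sys

# Versions the README reports testing with.
TESTED_TORCH = "1.12.0"
TESTED_CUDA = "11.6"


def environment_report():
    """Collect Python, PyTorch, and CUDA versions without failing if torch is absent."""
    report = {"python": sys.version.split()[0], "torch": None, "cuda": None}
    try:
        import torch
    except ImportError:
        return report
    # torch.__version__ may carry a build suffix such as "1.12.0+cu116".
    report["torch"] = str(torch.__version__).split("+")[0]
    report["cuda"] = torch.version.cuda  # None for CPU-only builds
    return report


def matches_tested(report):
    """True only when both versions equal the tested ones exactly."""
    return report["torch"] == TESTED_TORCH and report["cuda"] == TESTED_CUDA


if __name__ == "__main__":
    info = environment_report()
    print(info, "matches tested setup:", matches_tested(info))
```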

⏳Setup

We use the Visual Genome dataset in this work, which consists of 108,077 images, each annotated with objects and relations. Following previous work, we filter the dataset to use the most frequent 150 object classes and 50 predicate classes for experiments.
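
The frequency-based filtering can be illustrated with a toy sketch. This is not the preprocessing actually used (the filtered metadata comes from prior work); it assumes annotations are plain (subject, predicate, object) string triples and counts object classes only from those triples, which is a simplification. All names are hypothetical.

```python
from collections import Counter


def keep_most_frequent(labels, k):
    """Return the set of the k most frequent labels."""
    return {label for label, _ in Counter(labels).most_common(k)}


def filter_annotations(triples, num_objects=150, num_predicates=50):
    """Keep triples whose subject, predicate, and object all survive the frequency cut."""
    object_labels = [name for s, _, o in triples for name in (s, o)]
    predicate_labels = [p for _, p, _ in triples]
    objects = keep_most_frequent(object_labels, num_objects)
    predicates = keep_most_frequent(predicate_labels, num_predicates)
    return [(s, p, o) for s, p, o in triples
            if s in objects and o in objects and p in predicates]


if __name__ == "__main__":
    toy = [("man", "on", "horse"), ("man", "riding", "horse"), ("man", "on", "street")]
    # With 2 object classes and 1 predicate class, only one triple remains.
    print(filter_annotations(toy, num_objects=2, num_predicates=1))
```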

You can download the images from the two links below. Extract both zip files and put all the images in a single folder:

Part I: https://cs.stanford.edu/people/rak248/VG_100K_2/images.zip

Part II: https://cs.stanford.edu/people/rak248/VG_100K_2/images2.zip
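
If you prefer to script the extraction, the following sketch copies every image from both archives into one flat folder. It assumes the archives contain .jpg files, possibly nested inside a top-level directory; the function name and the output folder are placeholders.

```python
import shutil
import zipfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def extract_images_flat(zip_paths, out_dir):
    """Extract every image from the given archives directly into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.infolist():
                name = Path(member.filename).name  # drop any folder prefix
                if member.is_dir() or Path(name).suffix.lower() not in IMAGE_SUFFIXES:
                    continue
                with archive.open(member) as src, open(out_dir / name, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                count += 1
    return count


if __name__ == "__main__":
    # Placeholder paths: point these at the two downloaded archives.
    n = extract_images_flat(["images.zip", "images2.zip"], "data/vg_images")
    print(f"extracted {n} images")
```

Files with the same name in both archives would overwrite each other in this sketch, so check the returned count against the expected number of images.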

Then download the VG metadata preprocessed by IMP (annotations, class info, and image metadata) and copy those three files into a single folder.

Finally, update config.py with the path to the aforementioned data, as well as the absolute path to this directory.
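
Before launching a notebook, it can be worth checking that the paths you put in config.py actually resolve. The sketch below is independent of config.py itself. The three metadata file names are an assumption based on how IMP-preprocessed Visual Genome data is commonly distributed, so adjust them to match your download.

```python
from pathlib import Path

# Assumed file names for the IMP-preprocessed metadata; adjust to match your download.
ASSUMED_METADATA_FILES = ("VG-SGG.h5", "VG-SGG-dicts.json", "image_data.json")


def missing_data_paths(image_dir, metadata_dir, metadata_files=ASSUMED_METADATA_FILES):
    """Return a list of problems found; an empty list means the layout looks complete."""
    problems = []
    image_dir, metadata_dir = Path(image_dir), Path(metadata_dir)
    if not image_dir.is_dir():
        problems.append(f"image folder not found: {image_dir}")
    elif not any(image_dir.glob("*.jpg")):
        problems.append(f"no .jpg images in: {image_dir}")
    for name in metadata_files:
        if not (metadata_dir / name).is_file():
            problems.append(f"metadata file not found: {metadata_dir / name}")
    return problems


if __name__ == "__main__":
    # Placeholder paths: use the same locations you wrote into config.py.
    for problem in missing_data_paths("data/vg_images", "data/vg_metadata"):
        print(problem)
```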

We also rely on two pre-trained checkpoints:

  1. Download the pre-trained Faster-RCNN checkpoint trained by MotifNet from https://www.dropbox.com/s/cfyqhskypu7tp0q/vg-24.tar?dl=0 and place it in checkpoints/vgdet/vg-24.

  2. Download the pre-trained GB-Net checkpoint vgrel-11 from https://github.com/alirezazareian/gbnet and place it in checkpoints/vgdet/vgrel-11.

If you want to train from scratch, you can pre-train the model starting from the Faster-RCNN checkpoint. However, we recommend training from the GB-Net checkpoint.

📦Usage

You can simply follow the instructions in the notebooks to run HiKER-SGG experiments:

  1. For the PredCls task: train: ipynb/train_predcls/hikersgg_predcls_train.ipynb, evaluate: ipynb/eval_predcls/hikersgg_predcls_test.ipynb.
  2. For the SGCls task: train: ipynb/train_sgcls/hikersgg_sgcls_train.ipynb, evaluate: ipynb/eval_sgcls/hikersgg_sgcls_test.ipynb.

Note that for the PredCls task we start training from the GB-Net checkpoint, and for the SGCls task we start training from the best PredCls checkpoint.
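
The training order described in this note can be written down as a small helper. This is a sketch rather than part of the repository: the helper names are hypothetical, the path of the best PredCls checkpoint depends on your own training run and must be supplied, and running a notebook headlessly with jupyter nbconvert is an assumption (the notebooks are meant to be followed interactively, and the checkpoint path still has to be set inside the notebook).

```python
# Checkpoint and notebook paths as given in this README.
GBNET_CHECKPOINT = "checkpoints/vgdet/vgrel-11"

TRAIN_NOTEBOOKS = {
    "predcls": "ipynb/train_predcls/hikersgg_predcls_train.ipynb",
    "sgcls": "ipynb/train_sgcls/hikersgg_sgcls_train.ipynb",
}


def initial_checkpoint(task, best_predcls_checkpoint=None):
    """Pick the checkpoint a task starts from: GB-Net for PredCls, best PredCls for SGCls."""
    if task == "predcls":
        return GBNET_CHECKPOINT
    if task == "sgcls":
        if best_predcls_checkpoint is None:
            raise ValueError("SGCls training needs the best PredCls checkpoint path")
        return best_predcls_checkpoint
    raise ValueError(f"unknown task: {task!r}")


def nbconvert_command(task):
    """Build a command that executes the training notebook in place (does not run it)."""
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
            TRAIN_NOTEBOOKS[task]]


if __name__ == "__main__":
    print(initial_checkpoint("predcls"))
    print(" ".join(nbconvert_command("predcls")))
```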

📈VG-C Benchmark

In our paper, we introduce a new synthetic VG-C benchmark for SGG, containing 20 challenging image corruptions, including simple transformations and severe weather conditions.

We include the code for generating these 20 corruptions in dataloaders/corruptions.py. To use it, you also need to modify the code in dataloaders/visual_genome.py and enable -test_n in the evaluation notebook.
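
To give a rough idea of what a severity-controlled image corruption looks like, here is a self-contained toy sketch. It is not the code in dataloaders/corruptions.py: Gaussian noise is used purely as a familiar example, the image is a plain list of grayscale rows, and the severity-to-noise mapping is made up for illustration.

```python
import random

# Hypothetical noise standard deviations per severity level (on a 0-255 scale).
NOISE_STD = {1: 8.0, 2: 16.0, 3: 32.0}


def gaussian_noise(image, severity=1, seed=None):
    """Add zero-mean Gaussian noise to a grayscale image given as rows of 0-255 ints."""
    rng = random.Random(seed)
    std = NOISE_STD[severity]
    # Each pixel gets independent noise, then is rounded and clipped to the valid range.
    return [[min(255, max(0, round(pixel + rng.gauss(0.0, std)))) for pixel in row]
            for row in image]


if __name__ == "__main__":
    clean = [[128] * 4 for _ in range(4)]
    for row in gaussian_noise(clean, severity=3, seed=0):
        print(row)
```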

🙏Acknowledgements

Our codebase is adapted from GB-Net and EB-Net. We thank the authors for releasing their code!

📧Contact

If you have any questions, please contact us at cezhang@cs.cmu.edu.

📌 BibTeX & Citation

If you find this code useful, please consider citing our work:

@inproceedings{zhang2024hiker,
  title={HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation},
  author={Zhang, Ce and Stepputtis, Simon and Campbell, Joseph and Sycara, Katia and Xie, Yaqi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={28233--28243},
  year={2024}
}