This code is based on https://github.com/jhcho99/CoFormer.
We provide instructions for environment setup.
```bash
# Clone this repository and navigate into it
git clone https://github.com/PYL2077/HiFormer.git
cd HiFormer

# Create a conda environment, activate it, and install PyTorch via conda
conda create --name HiFormer python=3.9
conda activate HiFormer
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=11.1 -c pytorch -c conda-forge

# Install the remaining requirements via pip
pip install -r requirements.txt
```
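After installation, it can be worth confirming that the pinned versions are in place and that CUDA is visible. A quick sanity check (not part of the repository):

```python
import torch
import torchvision

# Confirm the pinned versions and CUDA availability.
print(torch.__version__)          # expected: 1.8.0
print(torchvision.__version__)    # expected: 0.9.0
print(torch.cuda.is_available())  # should be True on a CUDA 11.1 machine
```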
Annotations are given in JSON format, and the annotation files are under the "SWiG/SWiG_jsons/" directory. Images can be downloaded here; please download them and store them in the "SWiG/images_512/" directory. A minimal sketch for loading the annotation files follows the list below.
- All images should be under the "SWiG/images_512/" directory.
- train.json holds the annotations for the train set.
- dev.json holds the annotations for the development set.
- test.json holds the annotations for the test set.
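The annotation files can be inspected with the standard library. The sketch below assumes the SWiG convention of a top-level dictionary keyed by image filename, with fields such as "verb" and "frames"; the exact field names are assumptions, so verify them against the files themselves:

```python
import json

# Load the training annotations (path follows the layout above).
with open("SWiG/SWiG_jsons/train.json") as f:
    annotations = json.load(f)

# Peek at one entry. SWiG annotations are typically keyed by image filename;
# "verb" and "frames" are assumed field names -- check the actual files.
image_name, ann = next(iter(annotations.items()))
print(image_name)
print(ann.get("verb"), len(ann.get("frames", [])))
```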
To train HiFormer on a single node with 4 GPUs, run:
```bash
python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py \
    --backbone resnet50 --dataset_file swig \
    --leaf_epochs 20 --root_epochs 25 \
    --preprocess True \
    --num_workers 4 --num_enc_layers 6 --num_dec_layers 5 \
    --dropout 0.15 --hidden_dim 512 --output_dir HiFormer
```
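Since the launcher starts one process per GPU, a quick check that four devices are actually visible can save a failed launch. A small sanity check, not part of the repository:

```python
import torch

# --nproc_per_node=4 above assumes four visible CUDA devices.
print(f"Visible CUDA devices: {torch.cuda.device_count()}")
```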
- We use the AdamW optimizer with learning rate 10⁻⁴ (10⁻⁵ for the backbone), weight decay 10⁻⁴, and β = (0.9, 0.999).
- Both learning rates are divided by 10 at epoch 30.
- Random color jittering, random gray scaling, random scaling, and random horizontal flipping are used for augmentation (a sketch of this setup follows the list below).
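For reference, the optimizer, schedule, and augmentations above roughly correspond to the following PyTorch setup. This is a hedged sketch, not the repository's code: the stand-in backbone/head split, the use of `StepLR`, and all jitter strengths and probabilities are assumptions.

```python
import torch
import torchvision
import torchvision.transforms as T

# Stand-in modules: torchvision's ResNet-50 plays the backbone's role, and the
# linear head is purely illustrative; in the real code both live inside HiFormer.
backbone = torchvision.models.resnet50()
head = torch.nn.Linear(1000, 504)

optimizer = torch.optim.AdamW(
    [
        {"params": head.parameters(), "lr": 1e-4},      # main learning rate
        {"params": backbone.parameters(), "lr": 1e-5},  # backbone learning rate
    ],
    weight_decay=1e-4,
    betas=(0.9, 0.999),
)
# Divide both learning rates by 10 at epoch 30.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

# The listed augmentations in torchvision form; strengths and probabilities are
# illustrative assumptions, and RandomResizedCrop stands in for random scaling.
augment = T.Compose([
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    T.RandomGrayscale(p=0.1),
    T.RandomResizedCrop(512, scale=(0.5, 1.0)),
    T.RandomHorizontalFlip(p=0.5),
])
```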
To evaluate HiFormer on the development set, run:
```bash
python main.py --output_dir HiFormer --dev
```
To evaluate HiFormer on the test set, run:
```bash
python main.py --output_dir HiFormer --test
```
The model checkpoint can be downloaded here.
To run inference on a custom image, run:
```bash
python inference.py --image_path inference/filename.jpg \
    --output_dir inference
```
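Training images come from the resized "SWiG/images_512/" set, so a very large custom image may be worth resizing similarly before inference. A minimal sketch with Pillow, assuming the images_512 convention caps the longer side at 512 pixels (an assumption):

```python
from PIL import Image

# Shrink the image so its longer side is at most 512 pixels,
# preserving the aspect ratio (thumbnail resizes in place).
img = Image.open("inference/filename.jpg")
img.thumbnail((512, 512))
img.save("inference/filename_512.jpg")
```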