Authors: Bowen Yin, Xuying Zhang, Qibin Hou, Bo-Yuan Sun, Deng-Ping Fan, & Luc Van Gool.
This official repository contains the source code, prediction results, and evaluation toolbox for the paper 'CamoFormer: Masked Separable Attention for Camouflaged Object Detection'. The technical report can be found on arXiv. The full benchmark results can be found on One Drive, Baidu Netdisk, or Google Drive.
Figure 1: Overall architecture of our CamoFormer model. First, a pretrained Transformer-based backbone is utilized to extract multi-scale features of the input image. Then, the features from the last three stages are aggregated to generate the coarse prediction. Next, the progressive refinement decoder equipped with masked separable attention (MSA) is applied to gradually polish the prediction results. All the predictions generated by our CamoFormer are supervised by the ground truth (GT).
- [2022/12/09] Releasing the codebase of CamoFormer and the whole COD benchmarking results (21 models).
- [2022/12/08] Creating repository.
We invite everyone to contribute to making it more accessible and useful. If you have any questions about our work, feel free to contact me via e-mail (bowenyin@mail.nankai.edu.cn). If you use our code and evaluation toolbox in your research, please cite this paper (BibTeX).
0. Install
conda create --name CamoFormer python=3.8.5
conda activate CamoFormer
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch
pip install opencv-python
conda install tensorboard
conda install tensorboardX
pip install timm
pip install matplotlib
pip install scipy
pip install einops
Please also install [apex](https://github.com/NVIDIA/apex).
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
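Once the install steps have finished, a quick check can confirm that the packages are importable from the new environment. The snippet below is a minimal sketch and not part of this repository: the helper name `check_packages` is hypothetical, and the list uses the usual import names for the packages above (for example, `cv2` for opencv-python).

```python
import importlib.util

# Import names for the packages installed above (an illustrative list).
# opencv-python is imported as cv2; the others match their package names.
REQUIRED = ["torch", "torchvision", "cv2", "tensorboard", "tensorboardX",
            "timm", "matplotlib", "scipy", "einops", "apex"]


def check_packages(names):
    """Return a dict mapping each module name to True if it can be located."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_packages(REQUIRED).items():
        print(f"{name:15s} {'ok' if ok else 'MISSING'}")
```

This only locates the modules without importing them, so it does not verify that the CUDA build of PyTorch or the apex extensions actually work.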
1. Download Datasets and Checkpoints.
- Datasets:
By default, you can put datasets into the folder 'dataset'.
- Checkpoints:
By default, you can put checkpoints into the folder 'checkpoint'.
CamoFormer: Baidu Netdisk, One Drive
Backbone: Baidu Netdisk, One Drive
2. Test.
bash test.sh
3. Eval.
bash eval.sh
Figure 2: Diagrammatic details of the proposed F-TA in our MSA. Our B-TA shares a similar structure except for the mask.
The predictions of our CamoFormer can be found on One Drive, Baidu Netdisk, or Google Drive. Here is a quantitative performance comparison.
Figure 3: Comparison of our CamoFormer with the recent SOTA methods. ‘-R’: ResNet, ‘-C’: ConvNext, ‘-S’: Swin Transformer, ‘-P’: PVTv2. As can be seen, our CamoFormer-P performs much better than previous methods with either CNN- or Transformer-based models. ‘↑’: the higher the better, ‘↓’: the lower the better.
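To illustrate what a '↓' (lower is better) metric looks like, the sketch below computes mean absolute error (MAE) between a prediction map and a ground-truth mask. MAE is commonly reported in COD benchmarks, but the exact metric set of the comparison is not listed in this text, so treat the choice as an assumption. This pure-Python version is for illustration only and is not the implementation used by the evaluation toolbox; the function name `mean_absolute_error` is hypothetical.

```python
def mean_absolute_error(pred, gt):
    """MAE between a prediction map and a ground-truth mask.

    Both inputs are 2D lists of floats in [0, 1] with the same shape.
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("pred and gt must have the same shape")
    count = sum(len(row) for row in gt)
    if count == 0:
        raise ValueError("inputs must not be empty")
    total = sum(abs(p - g)
                for pred_row, gt_row in zip(pred, gt)
                for p, g in zip(pred_row, gt_row))
    return total / count


if __name__ == "__main__":
    pred = [[0.9, 0.2], [0.1, 0.8]]  # toy prediction map
    gt = [[1.0, 0.0], [0.0, 1.0]]    # toy binary ground truth
    print(mean_absolute_error(pred, gt))
```

A perfect prediction gives an MAE of 0, which is why the metric is marked with '↓'.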
Thanks to mczhuge for providing a friendly codebase for binary segmentation tasks. Our code is built on it.
You may want to cite:
@article{yin2024camoformer,
title={Camoformer: Masked separable attention for camouflaged object detection},
author={Yin, Bowen and Zhang, Xuying and Fan, Deng-Ping and Jiao, Shaohui and Cheng, Ming-Ming and Van Gool, Luc and Hou, Qibin},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2024},
publisher={IEEE}
}
Code in this repo is for non-commercial use only.