We integrate our method into the benchmark "Unsupervised Pathology Detection: A Deep Dive Into the State of the Art" to enable future development and comparisons.
Code for our method is located in UPD_study/models/ours/.
Download this repository by running
git clone https://github.com/img-cond-diffusion-model-ad
in your terminal.
Create and activate the Anaconda environment:
conda env create -f environment.yml
conda activate anomaly_restoration
Additionally, you need to install the repository as a package:
python3 -m pip install --editable .
To use Weights & Biases for logging, follow the instructions at https://docs.wandb.ai/quickstart.
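For non-interactive environments (e.g. cluster jobs), W&B also accepts the API key via an environment variable; a minimal sketch (the key value is a placeholder):

```shell
# Set the W&B API key for non-interactive logins (placeholder value).
export WANDB_API_KEY="your-api-key"
# Alternatively, log in interactively once:
# wandb login
echo "W&B key configured: ${WANDB_API_KEY:+yes}"
```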
To download and prepare the DDR dataset, run:
bash UPD_study/data/data_preprocessing/prepare_DDR.sh
To download and preprocess ATLAS and BraTS, first download ROBEX from https://www.nitrc.org/projects/robex and extract it in data/data_preprocessing/ROBEX. Then run:
bash UPD_study/data/data_preprocessing/prepare_ATLAS.sh
bash UPD_study/data/data_preprocessing/prepare_BraTS.sh
For ATLAS, you need to apply for the data at https://fcon_1000.projects.nitrc.org/indi/retro/atlas.html and receive the encryption password. prepare_ATLAS.sh will prompt you for this password during its run.
For BraTS, the data is downloaded via Kaggle's API. To set up API access, follow the instructions at https://www.kaggle.com/docs/api.
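In short, the Kaggle CLI expects your API token at ~/.kaggle/kaggle.json; a minimal setup sketch (the path of the downloaded token is a placeholder):

```shell
# Create the directory the Kaggle CLI reads its token from.
mkdir -p ~/.kaggle
# Copy the token downloaded from your Kaggle account page (path is a placeholder),
# then restrict its permissions as the Kaggle docs require:
# cp /path/to/kaggle.json ~/.kaggle/kaggle.json
# chmod 600 ~/.kaggle/kaggle.json
```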
To download the CamCAN data, you need to apply for access at https://camcan-archive.mrc-cbu.cam.ac.uk/dataaccess/index.php. After downloading, place the files in data/datasets/MRI/CamCAN and run:
python UPD_study/data/data_preprocessing/prepare_data.py --dataset CamCAN
We recommend using accelerate to train/evaluate models over multiple GPUs.
For any experiment, select the image modality with:
--modality [MRI|RF]
where MRI is for brain MRI and RF is for DDR. To select the MRI sequence, use:
--sequence [t1|t2]
To evaluate the T1 model on BraTS-T1, use --brats_t1=f; to evaluate it on ATLAS, use --brats_t1=t.
In the following script examples, we denote the dataset choice as <DATASET_OPTIONS>.
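For concreteness, here are some illustrative <DATASET_OPTIONS> values assembled from the flags above (storing them in a shell variable is just one convenient way to reuse them):

```shell
# Illustrative <DATASET_OPTIONS> combinations (flags as described above):
DATASET_OPTIONS="--modality MRI --sequence t1 --brats_t1=t"   # brain MRI, T1, evaluated on ATLAS
# DATASET_OPTIONS="--modality MRI --sequence t1 --brats_t1=f" # brain MRI, T1, evaluated on BraTS-T1
# DATASET_OPTIONS="--modality MRI --sequence t2"              # brain MRI, T2
# DATASET_OPTIONS="--modality RF"                             # retinal fundus (DDR)
echo $DATASET_OPTIONS
```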
To train a model on fold f ∈ [0, 9], run:
accelerate launch \
--num_processes=$num_processes --mixed_precision=fp16 \
./UPD_study/models/ours/ours_trainer.py \
--fold=$f <DATASET_OPTIONS>
To evaluate the trained model on fold f, run:
accelerate launch \
--num_processes=$num_processes --mixed_precision=fp16 \
./UPD_study/models/ours/ours_trainer.py \
--fold=$f -ev=t <DATASET_OPTIONS>
To evaluate an ensemble of the trained folds, run:
accelerate launch \
--num_processes=$num_processes --mixed_precision=fp16 \
./UPD_study/models/ours/ours_ensemble.py \
--fold=$f -ev=t <DATASET_OPTIONS>
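To cover all ten folds, the per-fold commands above can be wrapped in a loop; a sketch with an illustrative GPU count and dataset choice (`echo` prints the commands instead of launching them; remove it to actually train):

```shell
# Print the training command for every fold 0..9 (remove `echo` to launch for real).
num_processes=2   # illustrative GPU count
for f in $(seq 0 9); do
  echo accelerate launch \
    --num_processes=$num_processes --mixed_precision=fp16 \
    ./UPD_study/models/ours/ours_trainer.py \
    --fold=$f --modality MRI --sequence t2
done
```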
To generate the "Main Results" from Tables 1 and 3 over all three seeds run:
bash UPD_study/experiments/main.sh
Alternatively, for a single seed run:
bash UPD_study/experiments/main_seed10.sh
To generate the "Self-Supervised Pre-training" results from Tables 2 and 4 over all three seeds run:
bash UPD_study/experiments/pretrained.sh
Alternatively, for a single seed run:
bash UPD_study/experiments/pretrained_seed10.sh
To generate the "Complexity Analysis" results from Table 5 run:
bash UPD_study/experiments/benchmarks.sh
To generate "The Effects of Limited Training Data" results from Fig. 3 run:
bash UPD_study/experiments/percentage.sh
The repository contains PyTorch implementations for VAE, r-VAE, f-AnoGAN, H-TAE-S, FAE, PaDiM, CFLOW-AD, RD, ExpVAE, AMCons, PII, DAE and CutPaste.