🚀 Good news! We have created a demo showcasing the capabilities of our illumination correction model within the full document refinement pipeline. Check it out here!
This repository contains the code for our paper, which has been accepted at the International Conference on Computer Vision Workshop on New Ideas in Vision Transformers (NIVT-ICCV2023).
The project page can be found here.
We highly recommend using the provided Devcontainer to make the usage as easy as possible:
- Install Docker and VS Code
- Install the VS Code Devcontainer extension
ms-vscode-remote.remote-containers
- Clone the repository
git clone https://github.com/FelixHertlein/inv3d-model.git
- Press `F1` (or `CTRL + SHIFT + P`) and select `Dev Containers: Rebuild and Reopen Container`
- Go to `Run and Debug` (`CTRL + SHIFT + D`) and press the run button; alternatively, press `F5`
python3 inference.py --model illtr_template@inv3d@full --dataset inv3d_real_unwarp --gpu 0
The models will be downloaded automatically before the inference starts.
Available models are:
- illtr@inv3d
- illtr_template@inv3d@full
- illtr_template@inv3d@pad=0
- illtr_template@inv3d@pad=64
- illtr_template@inv3d@pad=128
Inv3DRealUnwarped will be downloaded automatically when passing `inv3d_real_unwarp` as the dataset argument.
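For example, to run the template-free variant on the same dataset, the call should look like this:

python3 inference.py --model illtr@inv3d --dataset inv3d_real_unwarp --gpu 0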
To unwarp your own data, you can mount it inside the container using the `.devcontainer/devcontainer.json` config. Mount your data folder to `/workspaces/inv3d-model/input/YOUR_DATA_DIRECTORY`.
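A minimal sketch of such a bind mount in `.devcontainer/devcontainer.json`, using the standard `mounts` property (the source path is a placeholder for your local data folder):

"mounts": [
    "source=/absolute/path/to/YOUR_DATA_DIRECTORY,target=/workspaces/inv3d-model/input/YOUR_DATA_DIRECTORY,type=bind"
]

After changing the config, rebuild the container so the mount takes effect.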
Make sure all samples contain an image `norm_image.png` and the corresponding template `template.png` (only for template-based models) within the sample subdirectory. All unwarped images are placed in the `output` folder.
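For reference, the expected input layout might look as follows; the sample directory names are hypothetical placeholders:

input/YOUR_DATA_DIRECTORY
|-- sample_0001
|   |-- norm_image.png
|   `-- template.png
`-- sample_0002
    |-- norm_image.png
    `-- template.png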
Download Inv3D here, combine all downloads, and mount it using the `devcontainer.json`, such that the file tree looks as follows:
input/inv3d
|-- data
| |-- test
| |-- train
| |-- val
| `-- wc_min_max.json
|-- log.txt
`-- settings.json
Start the training with:

python3 train.py --model illtr_template --dataset inv3d --version v1 --gpu 0 --num_workers 32

To resume a previous run, add `--resume`:

python3 train.py --model illtr_template --dataset inv3d --version v1 --gpu 0 --num_workers 32 --resume
After training, the model folder has the following structure:

models/TRAINED_MODEL/
|-- checkpoints
| |-- checkpoint-epoch=00-val_mse_loss=0.0015.ckpt
| `-- last.ckpt
|-- logs
| |-- events.out.tfevents.1698250741.d6258ba74799.433.0
| |-- ...
| `-- hparams.yaml
`-- model.py
train.py [-h]
--model MODEL
--dataset DATASET
--gpu GPU
--num_workers NUM_WORKERS
[--version VERSION]
[--fast_dev_run]
[--model_kwargs MODEL_KWARGS]
[--resume]
Training script
options:
-h, --help show this help message and exit
--model {illtr,illtr_template}
Select the model for training.
--dataset {inv3d} Select the dataset to train on.
--gpu GPU The index of the GPU to use for training.
--num_workers NUM_WORKERS
The number of workers as an integer.
--version VERSION Specify a version id for the given training. Optional.
--fast_dev_run Enable fast development run (default is False).
--model_kwargs MODEL_KWARGS
Optional model keyword arguments as a JSON string.
--resume Resume from a previous run (default is False).
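As an illustration of `--model_kwargs`: the pretrained model names (`pad=0`, `pad=64`, `pad=128`) suggest a `pad` keyword, but this is an assumption; check `model.py` for the actually supported keys.

python3 train.py --model illtr_template --dataset inv3d --version v2 --gpu 0 --num_workers 32 --model_kwargs '{"pad": 64}'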
The pretrained models will be downloaded automatically before the evaluation starts.
Available models are:
- illtr@inv3d
- illtr_template@inv3d@full
- illtr_template@inv3d@pad=0
- illtr_template@inv3d@pad=64
- illtr_template@inv3d@pad=128
Inv3DRealUnwarped will be downloaded automatically when passing `inv3d_real_unwarp` as the dataset argument.
Include the Inv3D dataset as described in section Training > Training dataset > Inv3d.
python3 eval.py --trained_model illtr_template@inv3d@v1 --dataset inv3d_real_unwarp --gpu 0 --num_workers 16
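The pretrained models can be evaluated the same way, e.g.:

python3 eval.py --trained_model illtr_template@inv3d@full --dataset inv3d_real_unwarp --gpu 0 --num_workers 16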
Evaluation output:
models/TRAINED_MODEL
|-- checkpoints
| `-- ...
|-- eval
| `-- inv3d_real_unwarp
| `-- results.csv
`-- inference
`-- inv3d_real_unwarp
|-- 00411
| |-- warped_document_crumpleseasy_bright
| | |-- ill_image.png
| | `-- orig_image.png
| |-- ...
|-- ...
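For a quick look at the aggregated metrics, the results file can be tabulated on the command line (a sketch; `TRAINED_MODEL` is the placeholder from above, and the `column` utility must be available):

column -s, -t < models/TRAINED_MODEL/eval/inv3d_real_unwarp/results.csv | head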
eval.py [-h]
--trained_model MODEL
--dataset DATASET
--gpu GPU
--num_workers NUM_WORKERS
Evaluation script
options:
-h, --help show this help message and exit
--trained_model {illtr@inv3d,
illtr_template@inv3d@full,
illtr_template@inv3d@pad=0,
illtr_template@inv3d@pad=128,
illtr_template@inv3d@pad=64,
... own trained models}
Select the model for evaluation.
--dataset {inv3d_real_unwarp}
Select the dataset to evaluate on.
--gpu GPU The index of the GPU to use for evaluation.
--num_workers NUM_WORKERS
The number of workers as an integer.
If you use the code of our paper for scientific research, please consider citing:
@inproceedings{Hertlein2023,
title={Template-Guided Illumination Correction for Document Images with Imperfect Geometric Reconstruction},
author={Hertlein, Felix and Naumann, Alexander},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={904--913},
year={2023}
}
The model IllTr is part of DocTr. IllTrTemplate is based on IllTr.
This project is licensed under MIT.