forked from OndrejTexler/Few-Shot-Patch-Based-Training
commit eb2591d
Showing 13 changed files with 1,666 additions and 0 deletions.
@@ -0,0 +1,80 @@
# Interactive Video Stylization Using Few-Shot Patch-Based Training

The official implementation of

> **Interactive Video Stylization Using Few-Shot Patch-Based Training** </br>
_[O. Texler](https://ondrejtexler.github.io/), [D. Futschik](https://dcgi.fel.cvut.cz/people/futscdav),
[M. Kučera](https://www.linkedin.com/in/kuceram/), [O. Jamriška](https://dcgi.fel.cvut.cz/people/jamriond),
[Š. Sochorová](https://dcgi.fel.cvut.cz/people/sochosar), [M. Chai](http://www.mlchai.com),
[S. Tulyakov](http://www.stulyakov.com), and [D. Sýkora](https://dcgi.fel.cvut.cz/home/sykorad/)_ </br>
[[`WebPage`](https://ondrejtexler.github.io/patch-based_training)],
[[`Paper`](https://ondrejtexler.github.io/res/Texler20-SIG_patch-based_training_main.pdf)],
[[`BiBTeX`](#CitingFewShotPatchBasedTraining)]


## Run

Download the [TESTING DATA](https://drive.google.com/file/d/1EscSNFg4ILpB7dxr-zYw_UdOILLmDlRj/view?usp=sharing) and unzip it.
The `_train` folder is expected to be next to the `_gen` folder.

To train the network, run `train.py`.
To generate the results, run `generate.py`.
See the example commands below:

```
train.py --config "_config/reference_P.yaml"
         --data_root "Maruska640_train"
         --log_interval 1000
         --log_folder logs_reference_P
```

Every 1000 epochs (`log_interval`), `train.py` saves the current generator to `logs_reference_P` (`log_folder`) and runs it on the `_gen` data as a validation step; the result is saved in `Maruska640_gen/res__P`.
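
The periodic behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in `train.py`: the helper `run_epochs`, the callback `on_log`, and the choice to count epochs from 1 are all hypothetical, and the real script may count differently or do more work at each logging step.

```python
from typing import Callable, List


def run_epochs(total_epochs: int, log_interval: int,
               on_log: Callable[[int], None]) -> None:
    """Call on_log(epoch) once every log_interval epochs (hypothetical helper)."""
    for epoch in range(1, total_epochs + 1):
        # One epoch of training would happen here.
        if epoch % log_interval == 0:
            # In the README's description, this is the point where the
            # generator is saved and run on the _gen data.
            on_log(epoch)


logged: List[int] = []
run_epochs(total_epochs=3500, log_interval=1000, on_log=logged.append)
print(logged)  # [1000, 2000, 3000]
```

A smaller `log_interval` gives more frequent checkpoints and previews at the cost of more time spent saving and validating.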

```
generate.py --checkpoint "Maruska640_train/logs_reference_P/model_00010.pth"
            --data_root "Maruska_gen"
            --dir_input "input_filtered"
            --outdir "Maruska_gen/res_00010"
            --device "cuda:0"
```


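The example commands are printed across several lines without shell continuation characters, so they cannot be pasted into a terminal as they stand. One shell-independent way to launch them, sketched here, is to build the argument list in Python. The helper `build_command` is hypothetical; the flag names and values are the ones shown in the `generate.py` example above.

```python
import shlex
import sys
from typing import Dict, List


def build_command(script: str, options: Dict[str, str]) -> List[str]:
    """Turn a script name and {flag: value} pairs into an argv list."""
    command = [sys.executable, script]
    for flag, value in options.items():
        command += [f"--{flag}", str(value)]
    return command


# Flags and values copied from the generate.py example in this README.
generate_cmd = build_command("generate.py", {
    "checkpoint": "Maruska640_train/logs_reference_P/model_00010.pth",
    "data_root": "Maruska_gen",
    "dir_input": "input_filtered",
    "outdir": "Maruska_gen/res_00010",
    "device": "cuda:0",
})

# Print a single-line, shell-quoted form for inspection.
print(shlex.join(generate_cmd))
```

Passing `generate_cmd` to `subprocess.run(generate_cmd, check=True)` from the repository root would then start the script without relying on `\` or `^` line continuations.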
## Installation
Tested on Windows 10 with `Python 3.7.8` and `CUDA 10.2`, using the following Python packages:
```
numpy                 1.19.1
opencv-python         4.4.0.40
Pillow                7.2.0
PyYAML                5.3.1
scikit-image          0.17.2
scipy                 1.5.2
torch                 1.6.0
torchvision           0.7.0
```


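To see how a local environment compares with the tested versions listed above, a small check along the following lines may help. It is a sketch, not part of the repository: `compare_with_tested` is a hypothetical helper, and it relies on `importlib.metadata`, which requires Python 3.8 or newer (later than the tested `Python 3.7.8`).

```python
from importlib import metadata
from typing import Dict

# Versions copied from the package list in this README.
TESTED_VERSIONS = {
    "numpy": "1.19.1",
    "opencv-python": "4.4.0.40",
    "Pillow": "7.2.0",
    "PyYAML": "5.3.1",
    "scikit-image": "0.17.2",
    "scipy": "1.5.2",
    "torch": "1.6.0",
    "torchvision": "0.7.0",
}


def compare_with_tested(tested: Dict[str, str]) -> Dict[str, str]:
    """Report, per package, whether the installed version matches the tested one."""
    report = {}
    for name, wanted in tested.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "match" if installed == wanted else f"differs ({installed})"
    return report


for package, status in compare_with_tested(TESTED_VERSIONS).items():
    print(f"{package:<16}{status}")
```

A "differs" entry is not necessarily a problem; the README only states which versions were tested.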
## Credits
* This project started when [Ondrej Texler](https://ondrejtexler.github.io/) was an intern at [Snap Inc.](https://www.snap.com/), and it was funded by [Snap Inc.](https://www.snap.com/) and [Czech Technical University in Prague](https://www.cvut.cz/en).


## License
The code is released for research purposes only.


## <a name="CitingFewShotPatchBasedTraining"></a>Citing
If you find Interactive Video Stylization Using Few-Shot Patch-Based Training useful for your research or work, please use the following BibTeX entry.

```
@Article{Texler20-SIG,
    author    = "Ond\v{r}ej Texler and David Futschik and Michal Ku\v{c}era and Ond\v{r}ej Jamri\v{s}ka and \v{S}\'{a}rka Sochorov\'{a} and Menglei Chai and Sergey Tulyakov and Daniel S\'{y}kora",
    title     = "Interactive Video Stylization Using Few-Shot Patch-Based Training",
    journal   = "ACM Transactions on Graphics",
    volume    = "39",
    number    = "4",
    pages     = "73",
    year      = "2020",
}
```

@@ -0,0 +1,80 @@
# Generator
generator: &generator_j
  type: GeneratorJ
  args:
    use_bias: True
    tanh: True
    append_smoothers: True
    resnet_blocks: 4
    filters: [32, 64, 128, 128, 128, 64]
    input_channels: 3


# Optimizer of Generator
opt_generator: &opt_generator
  type: Adam
  args:
    lr: 0.0002
    betas: [0.9, 0.999]
    weight_decay: 0.00001


# Discriminator
discriminator: &discriminatorn
  type: DiscriminatorN_IN
  args:
    num_filters: 12
    n_layers: 2


# Optimizer of Discriminator
opt_discriminator: &opt_discriminator
  type: Adam
  args:
    lr: 0.0002
    betas: [0.9, 0.999]
    weight_decay: 0.00001


# Parameters of Perception Loss (VGG-Loss)
perception_loss: &perception_loss
  weight: 6.0
  perception_model:
    type: PerceptualVGG19
    args:
      feature_layers: [0, 3, 5]
      use_normalization: False


# Training Parameters
trainer: &trainer_1
  batch_size: 1
  epochs: 555555555
  reconstruction_weight: 4.
  adversarial_weight: 0.5
  use_image_loss: True
  reconstruction_criterion: L1Loss
  adversarial_criterion: MSELoss


# Training Dataset Parameters
training_dataset: &training_dataset
  type: DatasetFullImages
  dir_pre: input
  dir_post: output
  dir_mask: mask


# "Main" of this YAML file
job:
  training_dataset: *training_dataset
  generator: *generator_j
  opt_generator: *opt_generator
  discriminator: *discriminatorn
  opt_discriminator: *opt_discriminator
  perception_loss: *perception_loss
  trainer: *trainer_1

  num_workers: 1
  device: "cuda:0"

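
Most entries in the config above share a `type` plus `args` shape. A common way to consume such a spec is to look the type name up in a registry and pass the arguments as keyword arguments. The sketch below shows that pattern only; how `train.py` actually instantiates its components is not visible in this commit view. `FakeAdam`, `REGISTRY`, and `build_from_spec` are hypothetical stand-ins, and the values are copied from the `opt_generator` entry.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class FakeAdam:
    """Stand-in for a real optimizer class; it only stores its hyperparameters."""
    lr: float
    betas: Tuple[float, float]
    weight_decay: float


# Hypothetical registry mapping the config's type names to constructors.
REGISTRY: Dict[str, Callable[..., Any]] = {"Adam": FakeAdam}


def build_from_spec(spec: Dict[str, Any], registry: Dict[str, Callable[..., Any]]) -> Any:
    """Instantiate registry[spec['type']] with spec['args'] as keyword arguments."""
    factory = registry[spec["type"]]
    return factory(**spec.get("args", {}))


# Values from the opt_generator entry, written as a Python dict.
opt_spec = {
    "type": "Adam",
    "args": {"lr": 0.0002, "betas": (0.9, 0.999), "weight_decay": 0.00001},
}
optimizer = build_from_spec(opt_spec, REGISTRY)
print(optimizer)
```

Keeping the class name in the config means a different optimizer or network can be selected by editing the YAML file alone, provided the name is registered.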
@@ -0,0 +1,81 @@
# Generator
generator: &generator_j
  type: GeneratorJ
  args:
    use_bias: True
    tanh: True
    append_smoothers: True
    resnet_blocks: 7
    filters: [32, 64, 128, 128, 128, 64]
    input_channels: 3


# Optimizer of Generator
opt_generator: &opt_generator
  type: Adam
  args:
    lr: 0.0004
    betas: [0.9, 0.999]
    weight_decay: 0.00001


# Discriminator
discriminator: &discriminatorn
  type: DiscriminatorN_IN
  args:
    num_filters: 12
    n_layers: 2


# Optimizer of Discriminator
opt_discriminator: &opt_discriminator
  type: Adam
  args:
    lr: 0.0004
    betas: [0.9, 0.999]
    weight_decay: 0.00001


# Parameters of Perception Loss (VGG-Loss)
perception_loss: &perception_loss
  weight: 6.0
  perception_model:
    type: PerceptualVGG19
    args:
      feature_layers: [0, 3, 5]
      use_normalization: False


# Training Parameters
trainer: &trainer_1
  batch_size: 40
  epochs: 50000000
  reconstruction_weight: 4.
  adversarial_weight: 0.5
  use_image_loss: True
  reconstruction_criterion: L1Loss
  adversarial_criterion: MSELoss


# Training Dataset Parameters
training_dataset: &training_dataset
  type: DatasetPatches_M
  dir_pre: input_filtered
  dir_post: output
  dir_mask: mask
  patch_size: 32


# "Main" of this YAML file
job:
  training_dataset: *training_dataset
  generator: *generator_j
  opt_generator: *opt_generator
  discriminator: *discriminatorn
  opt_discriminator: *opt_discriminator
  perception_loss: *perception_loss
  trainer: *trainer_1

  num_workers: 1
  device: "cuda:0"

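
The config above sets `patch_size: 32` and `batch_size: 40`, which suggests that each batch is made of small crops rather than whole frames. The sketch below shows one simple way to draw such crops at random. It is an assumption-laden illustration: how `DatasetPatches_M` really samples (for example, how it uses `dir_mask`) is not shown here, the helpers `sample_patch_origins` and `crop` are hypothetical, and images are plain nested lists so that the example needs no third-party packages.

```python
import random
from typing import List, Sequence, Tuple


def sample_patch_origins(height: int, width: int, patch_size: int,
                         count: int, rng: random.Random) -> List[Tuple[int, int]]:
    """Pick `count` top-left corners so every patch lies fully inside the image."""
    if patch_size > height or patch_size > width:
        raise ValueError("patch_size must not exceed the image size")
    max_top = height - patch_size
    max_left = width - patch_size
    return [(rng.randint(0, max_top), rng.randint(0, max_left)) for _ in range(count)]


def crop(image: Sequence[Sequence], top: int, left: int, patch_size: int) -> List[list]:
    """Cut a patch_size x patch_size window out of a row-major nested-list image."""
    return [list(row[left:left + patch_size]) for row in image[top:top + patch_size]]


# Toy 48 x 64 "image" whose pixels store their own coordinates.
image = [[(y, x) for x in range(64)] for y in range(48)]
origins = sample_patch_origins(height=48, width=64, patch_size=32,
                               count=40, rng=random.Random(0))
batch = [crop(image, top, left, 32) for top, left in origins]
print(len(batch), len(batch[0]), len(batch[0][0]))  # 40 32 32
```

Sampling many small patches from a few stylized keyframes gives the network far more training samples per keyframe than full-image training would.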
@@ -0,0 +1,82 @@
# Generator
generator: &generator_j
  type: GeneratorJ
  args:
    use_bias: True
    tanh: True
    append_smoothers: True
    resnet_blocks: 7
    filters: [32, 64, 128, 128, 128, 64]
    input_channels: 6


# Optimizer of Generator
opt_generator: &opt_generator
  type: Adam
  args:
    lr: 0.0004
    betas: [0.9, 0.999]
    weight_decay: 0.00001


# Discriminator
discriminator: &discriminatorn
  type: DiscriminatorN_IN
  args:
    num_filters: 12
    n_layers: 2


# Optimizer of Discriminator
opt_discriminator: &opt_discriminator
  type: Adam
  args:
    lr: 0.0004
    betas: [0.9, 0.999]
    weight_decay: 0.00001


# Parameters of Perception Loss (VGG-Loss)
perception_loss: &perception_loss
  weight: 6.0
  perception_model:
    type: PerceptualVGG19
    args:
      feature_layers: [0, 3, 5]
      use_normalization: False


# Training Parameters
trainer: &trainer_1
  batch_size: 40
  epochs: 50000000
  reconstruction_weight: 4.
  adversarial_weight: 0.5
  use_image_loss: True
  reconstruction_criterion: L1Loss
  adversarial_criterion: MSELoss


# Training Dataset Parameters
training_dataset: &training_dataset
  type: DatasetPatches_M
  dir_pre: input_filtered
  dir_post: output
  dir_mask: mask
  patch_size: 32
  dir_x1: input_gdisko_gauss_r10_s10


# "Main" of this YAML file
job:
  training_dataset: *training_dataset
  generator: *generator_j
  opt_generator: *opt_generator
  discriminator: *discriminatorn
  opt_discriminator: *opt_discriminator
  perception_loss: *perception_loss
  trainer: *trainer_1

  num_workers: 1
  device: "cuda:0"

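
This config differs from the previous one in two places: `input_channels` is 6 instead of 3, and the dataset has an extra `dir_x1` folder. A plausible reading, stated here as an assumption and not confirmed by anything in this commit view, is that each 3-channel input frame is stacked with a 3-channel guide image from `dir_x1` to form a 6-channel network input. The sketch below illustrates that stacking on nested lists; `concat_channels` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple


def concat_channels(image: Sequence[Sequence[Tuple[int, ...]]],
                    guide: Sequence[Sequence[Tuple[int, ...]]]) -> List[List[Tuple[int, ...]]]:
    """Stack the channels of two same-sized images pixel by pixel."""
    if len(image) != len(guide) or any(len(a) != len(b) for a, b in zip(image, guide)):
        raise ValueError("image and guide must have the same height and width")
    return [
        [tuple(pixel) + tuple(guide_pixel) for pixel, guide_pixel in zip(image_row, guide_row)]
        for image_row, guide_row in zip(image, guide)
    ]


# Two toy 2 x 2 RGB images.
frame = [[(1, 2, 3), (1, 2, 3)], [(1, 2, 3), (1, 2, 3)]]
guide = [[(4, 5, 6), (4, 5, 6)], [(4, 5, 6), (4, 5, 6)]]
stacked = concat_channels(frame, guide)
print(stacked[0][0])  # (1, 2, 3, 4, 5, 6)
```

With tensors in `(batch, channels, height, width)` layout, the same operation is usually a concatenation along the channel dimension, for example `torch.cat([frame, guide], dim=1)`.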