The MaskTrack method is the baseline for state-of-the-art methods in video object segmentation like Video Object Segmentation with Re-identification and[ Lucid Data Dreaming for Multiple Object Tracking] (https://arxiv.org/abs/1703.09554). The top three methods in DAVIS 2017 challenge were based on the MaskTrack method. However, no open source code is available for the MaskTrack method. Here I provide the MaskTrack method with following specifications:
- The code gives a score of 0.466 on the DAVIS 2017 test-dev dataset. J-mean is 0.440 and F-mean is 0.492.
- The code handles multiple objects present in DAVIS 2017.
- Data generation code in matlab for offline training on DAVIS 17 train+val and online training on DAVIS 17 test is also included. Thus, all of the code is packaged together.
Machine configuration used for testing:
- Two 'GeForce GTX 1080 Ti' cards with 11GB memory each.
- CPU RAM memory of 32 GB (though only about 11GB is required)
Offline training is done on DAVIS 2017 train data. The online training and testing is done on DAVIS 2017 test dataset. I recommend using conda for downloading and managing the environments.
Download the Deeplab Resnet 101 pretrained COCO model from here and place it in 'training/pretrained' folder.
If you want to skip offline training and directly perform online training and testing, download the offline trained model from here and place it in 'training/offline_save_dir/lr_0.001_wd_0.001' folder.
What things you need to install the software and how to install them
Software used:
- Pytorch 0.3.1
- Matlab 2017b
- Python 2.7
Dependencies: Create a conda environment using the training/deeplab_resnet_env.yml file. Use: conda env create -f training/deeplab_resnet_env.yml
If you are not using conda as a package manager, refer to the yml file and install the libraries manually.
Please refer to the following instructions:
A. Offline training data generation
- Download the DAVIS 17 train+val dataset from: https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-trainval-480p.zip
- Go to generating_masks/offline_training/script_multiple_objs.m and change the base_path variable to point to the DAVIS 2017 dataset (train+val)
- Run script_multiple_objs.m (This will generate the deformations needed for offline training)
B. Setting paths for python files
- Go to training/path.py and change the paths returned by the following methods:
- db_root_dir(): Point this to the DAVIS 17 test dataset (download from: https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-test-dev-480p.zip)
- db_offline_train_root_dir(): Point this to the DAVIS 17 train+val dataset
C. Offline training
- Run training/train_offline.py with appropriate parameters. Recommended: --NoLabels 2 --lr 0.001 --wtDecay 0.001 --epochResume 0 --epochs 15 --batchSize 6
D. Online Training data generation
- Go to generating_masks/online_training/script_DAVIS17_test_dev.m and change required paths.
- Run the script with 12 iterations (can be varied) as argument: run script_DAVIS17_test_dev(12)
E. Online Training and testing
- Run training/train_online.py with appropriate parameters. Recommended: --NoLabels 2 --lr 0.0025 --wtDecay 0.0001 --seqName aerobatics --parentEpochResume 8 --epochsOnline 10 --noIterations 5 --batchSize 2
- Results are stored in training/models17_DAVIS17_test/lr_xxx_wd_xxx/seq_name
This project is licensed under the MIT License - see the LICENSE file for details
This code was produced during my internship at Nanyang Technological University under Prof. Guosheng Lin. I would like to thank him for providing access to the GPUs.
- The code for generation of masks was based on: www.mpi-inf.mpg.de/masktrack
- The code for Deeplab Resnet was taken from: https://github.com/isht7/pytorch-deeplab-resnet
- Some of the dataloader code was based on: https://github.com/fperazzi/davis-2017
- Template of Readme.md file: https://gist.github.com/PurpleBooth/109311bb0361f32d87a2
I would like to thank K.K. Maninis for providing this code: https://github.com/kmaninis/OSVOS-PyTorch for reference.