By Samaneh Azadi, Jiashi Feng, Trevor Darrell at UC Berkeley.
LDDP predicts a set of diverse and informative proposals with enriched representations and can be used to augment object detection architectures. It considers both label-level contextual information and spatial layout relationships between object proposals without increasing the number of network parameters, and thus substantially improves the location and category specifications of the final detected bounding boxes during both training and inference. This implementation is built on the Faster R-CNN framework but can be modified for other detection architectures. For more information on LDDP, please refer to the arXiv preprint, which will be published at CVPR 2017.
LDDP is licensed for open non-commercial distribution under the UC Regents license; see LICENSE. Its dependencies, such as Caffe and Faster R-CNN, are subject to their own respective licenses.
If you find LDDP useful in your research, please cite:
@article{azadi2017learning,
title={Learning Detection with Diverse Proposals},
author={Azadi, Samaneh and Feng, Jiashi and Darrell, Trevor},
journal={arXiv preprint arXiv:1704.03533},
year={2017}
}
Requirements and installation instructions are similar to Faster R-CNN, but we mention them again for your convenience.
- Requirements for Caffe and pycaffe (see: Caffe installation instructions)
Note: Caffe must be built with support for Python layers!
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1
You can download my Makefile.config for reference.
- Python packages you might not have: cython, python-opencv, easydict
Hardware requirements are similar to those for running Faster R-CNN.
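The requirements above can be checked before building. The following is a minimal sketch, not part of the LDDP repository: `enabled_flags` and `missing_modules` are hypothetical helper names, and the import names `Cython` and `cv2` are assumed to correspond to the `cython` and `python-opencv` packages.

```python
import importlib.util
import re


def enabled_flags(makefile_config_text):
    """Return the variables that are set to 1 on uncommented lines."""
    flags = set()
    for line in makefile_config_text.splitlines():
        # A line starting with '#' is commented out and never matches.
        match = re.match(r"\s*([A-Z_]+)\s*:?=\s*1\s*(#.*)?$", line)
        if match:
            flags.add(match.group(1))
    return flags


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]
```

For example, `"WITH_PYTHON_LAYER" in enabled_flags(open("Makefile.config").read())` should be true before Caffe is built, and `missing_modules(["Cython", "cv2", "easydict"])` should return an empty list.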
- Clone the LDDP repository
# Make sure to clone with --recursive
git clone --recursive https://github.com/azadis/LDDP.git
- We'll call the directory that you cloned LDDP into LDDP_ROOT
- Build the Cython modules
cd $LDDP_ROOT/py-faster-rcnn/lib
make
- Build Caffe and pycaffe
cd $LDDP_ROOT/py-faster-rcnn/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
- Download the training, validation, test data and VOCdevkit
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
- Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
- It should have this basic structure
$VOCdevkit/           # development kit
$VOCdevkit/VOCcode/   # VOC utility code
$VOCdevkit/VOC2007    # image sets, annotations, etc.
# ... and several other directories ...
- Create symlinks for the PASCAL VOC dataset
cd $LDDP_ROOT/py-faster-rcnn/data
ln -s $VOCdevkit VOCdevkit2007
Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
- [Optional] Follow similar steps to get PASCAL VOC 2010 and 2012.
- [Optional] If you want to use COCO, please see the notes here.
- Follow the next sections to download pre-trained ImageNet models.
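Once the dataset steps above are done, the layout can be checked before training. This is a minimal sketch under the assumption that only the two sub-directories named in the basic structure above (`VOCcode` and `VOC2007`) need to be present; `missing_voc_dirs` is a hypothetical helper, not part of the repository.

```python
import os


def missing_voc_dirs(devkit_root, year="2007"):
    """Return the expected sub-directories that are absent under devkit_root."""
    expected = ["VOCcode", "VOC" + year]
    # os.path.isdir follows symlinks, so a VOCdevkit2007 symlink works as well.
    return [name for name in expected
            if not os.path.isdir(os.path.join(devkit_root, name))]
```

Calling it on `$LDDP_ROOT/py-faster-rcnn/data/VOCdevkit2007` should return an empty list if the symlink points at a complete VOCdevkit directory.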
Pre-trained ImageNet models can be downloaded for the networks described in the paper, including ZF and VGG16.
cd $LDDP_ROOT/py-faster-rcnn
./data/scripts/fetch_imagenet_models.sh
To train and test the LDDP end-to-end detection framework:
cd $LDDP_ROOT/py-faster-rcnn
./experiments/scripts/LDDP_end2end.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701 TRAIN.SCALES [400,500,600,700]
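When launching several runs, the argument list for the script can be assembled programmatically. The sketch below only builds the list and mirrors the usage shown above; `lddp_command` and its `overrides` parameter are hypothetical names introduced for illustration.

```python
def lddp_command(gpu_id, net, overrides=None):
    """Build the argument list for experiments/scripts/LDDP_end2end.sh."""
    allowed = {"ZF", "VGG_CNN_M_1024", "VGG16"}
    if net not in allowed:
        raise ValueError("NET must be one of %s" % sorted(allowed))
    cmd = ["./experiments/scripts/LDDP_end2end.sh", str(gpu_id), net]
    if overrides:
        # Config overrides follow --set as alternating KEY VALUE arguments.
        cmd.append("--set")
        for key, value in overrides:
            cmd.extend([key, str(value)])
    return cmd
```

Passing the result to `subprocess.call` with `cwd` set to `$LDDP_ROOT/py-faster-rcnn` would start the run, for example `lddp_command(0, "VGG16", [("RNG_SEED", 1701)])`.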
Trained LDDP networks are saved under:
output/<experiment directory>/<dataset name>/
Test outputs are saved under:
output/<experiment directory>/<dataset name>/<network snapshot name>/
The semantic similarity matrices used in the paper are stored as pickle files at:
$LDDP_ROOT/data
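A pickle file of this kind could be inspected as sketched below. This assumes each file holds a single square, 2-D array-like object, which the source does not confirm; `load_similarity` is a hypothetical helper and the file names under `$LDDP_ROOT/data` are not specified here.

```python
import pickle


def load_similarity(path):
    """Load a pickled matrix and return it as a list of row lists."""
    with open(path, "rb") as handle:
        # encoding="latin1" lets Python 3 read pickles written by Python 2.
        obj = pickle.load(handle, encoding="latin1")
    rows = [list(row) for row in obj]
    if any(len(row) != len(rows) for row in rows):
        raise ValueError("expected a square matrix")
    return rows
```

If the files store NumPy arrays, NumPy must be installed for unpickling to succeed.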
An example IPython notebook that generates semantic similarity matrices for the PASCAL VOC and COCO datasets is located at:
$LDDP_ROOT/tools/Semantic_Similarity.ipynb