This project aims to reproduce the DRBox model architecture introduced in the paper "Learning a Rotation Invariant Detector with Rotatable Bounding Box", using Keras with TensorFlow as the backend. The original Caffe implementation of DRBox by Lei Liu can be found here.
DRBox is used for detection tasks where the objects are oriented arbitrarily. This code shows examples in which DRBox detects vehicles, ships and airplanes in remote sensing images, but it can also be used for other tasks.
DRBox is inspired by the architecture of SSD (Single Shot MultiBox Detector). This project started from pierluigiferrari's Keras implementation of the SSD model: SSD: Single-Shot MultiBox Detector implementation in Keras
The main goal of this project is to create a DRBox implementation that is well documented for those who are interested in a low-level understanding of the model. The provided documentation and detailed comments hopefully make it a bit easier to dig into the code and adapt or build upon the model.
Here are the Average Precision (AP) evaluation results of the original implementation and of a model trained using this implementation. The Keras model comes close to the original on airplanes but falls short of it on ships and vehicles.
| Implementation | AP threshold | Ship AP | Vehicle AP | Airplane AP | mAP |
|---|---|---|---|---|---|
| Original DRBox | 0.5 | 94.06 | 89.07 | 99.28 | 94.13 |
| Keras DRBox | 0.5 | 80.0 | 76.2 | 98.9 | 85.0 |
| Keras DRBox | 0.1 | 91.0 | 85.3 | 99.0 | 91.7 |
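The mAP values in the table appear to be the unweighted mean of the three per-class AP values. The short check below recomputes them under that assumption; the function name `mean_ap` is only illustrative and is not part of this repository.

```python
def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    return sum(per_class_ap.values()) / len(per_class_ap)


# Per-class AP values copied from the results table.
original_05 = {"Ship": 94.06, "Vehicle": 89.07, "Airplane": 99.28}
keras_05 = {"Ship": 80.0, "Vehicle": 76.2, "Airplane": 98.9}
keras_01 = {"Ship": 91.0, "Vehicle": 85.3, "Airplane": 99.0}

for name, aps in [
    ("Original DRBox, AP 0.5", original_05),
    ("Keras DRBox, AP 0.5", keras_05),
    ("Keras DRBox, AP 0.1", keras_01),
]:
    print(f"{name}: mAP = {mean_ap(aps):.2f}")
```

The recomputed means agree with the reported mAP values up to rounding or truncation of the last digit.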
This model performs well on the three datasets, but it does not reach state-of-the-art performance. Object localization lacks precision: the mAP 0.1 (91.7) is about 6.7 points above the mAP 0.5 (85.0), which means many detections hit their target but not tightly enough to count at the stricter threshold.
Several factors can explain this:
- Pyramid input strategy is not implemented
- Hyper-parameters tuning could be improved
Below are some prediction examples of the fully trained Keras DRBox model.
- Python 3
- NumPy
- Keras
- TensorFlow
- PIL
- OpenCV
The Theano and CNTK backends are currently not supported.
To use Docker, you need:
- docker
- docker-compose (at least 1.19 for GPU support)
- nvidia-docker2 (if you want to use a GPU)
The original dataset can be found here: https://pan.baidu.com/s/1sliHG09. To extract the files, you can use the following commands:
cat data.tar.gz*>data.tar.gz
tar -zxvf data.tar.gz
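The commands above assume a Unix shell. On systems without `cat` and `tar`, the same two steps can be sketched in Python with the standard library only. The helper name `join_and_extract` is hypothetical, and the naming of the downloaded parts is an assumption: the function simply takes a glob pattern and joins the matches in sorted order. It also leaves the output archive out of the inputs, so that running it a second time does not append the archive to itself.

```python
import glob
import os
import shutil
import tarfile


def join_and_extract(part_pattern, archive_path, target_dir):
    """Concatenate split archive parts, then extract the joined .tar.gz."""
    archive_abs = os.path.abspath(archive_path)
    # Sorted order is assumed to match the order in which the parts were split.
    parts = sorted(
        p for p in glob.glob(part_pattern)
        if os.path.abspath(p) != archive_abs
    )
    with open(archive_path, "wb") as joined:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, joined)
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)
    return parts
```

For example, `join_and_extract("data.tar.gz*", "data.tar.gz", ".")` would mirror the two shell commands, provided the parts sort into the right order.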
Place the data in the data folder, then run the following command to create the csv file containing the bounding box values for the specified objects:
python create_dataset.py -d data/Airplane/train_data/ -v 0.1 -t 0.1 -s -f data/Airplane -o a
DRBox is currently designed as a single-task network, so you should train a separate model for each type of object. To train an airplane detection network, first create the csv files containing the bounding box data for the airplane images, then start training by running:
python Training_DRBox.py
The trained models will be stored in the "trained_models" folder.
To make predictions on the test set using a trained model named "plane.h5", run:
python predict_DRBox.py -i data/Airplane/train_data/ -m trained_models/plane.h5 -l data/Airplane/labelstest.csv
To evaluate a trained model named "plane.h5" on the 600 images of the test set, run:
python evaluation_DRBox.py -i data/Airplane/train_data/ -m trained_models/plane.h5 -l data/Airplane/labelstest.csv -n 600
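For readers unfamiliar with the metric, the sketch below shows one common way to compute Average Precision from a list of scored detections. It is a generic, non-interpolated illustration and not necessarily the procedure used by evaluation_DRBox.py. The function name and its inputs are hypothetical: each detection is assumed to have already been marked as a true or false positive at the chosen threshold (0.5 or 0.1 in the results table).

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Non-interpolated AP: mean of the precision values at each true positive.

    scores            confidence of each detection
    is_true_positive  whether each detection matched a ground-truth box
    num_ground_truth  total number of ground-truth boxes in the test set
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives_seen = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if is_true_positive[idx]:
            true_positives_seen += 1
            precision_sum += true_positives_seen / rank
    # Ground-truth boxes that were never detected contribute zero.
    return precision_sum / num_ground_truth


# Three detections, two ground-truth objects, the second detection is a miss.
print(average_precision([0.9, 0.8, 0.7], [True, False, True], 2))
```

Lowering the matching threshold turns loosely localized detections into true positives, which is why the AP 0.1 values are higher than the AP 0.5 values for the same model.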
In order to train a DRBox model from scratch, you can download the weights of the fully convolutionalized VGG-16 model trained to convergence on ImageNet classification here: VGG_ILSVRC_16_layers_fc_reduced.h5. This is a direct port of the corresponding .caffemodel file that is provided in the repository of the original Caffe implementation. If you wish to use the pre-trained weights, you should put the file in the folder "VGG_weights".
To label new images, you can use this repo: roLabelImg. You will need to convert its output format, but it is a great tool for labeling rotated bounding boxes.
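The conversion could start from something like the sketch below. It rests on two assumptions that should be checked before use: that roLabelImg writes each rotated box as a `robndbox` element with `cx`, `cy`, `w`, `h` and `angle` children inside an `object` element, and that a flat csv with the columns shown here is a useful intermediate. The column order is hypothetical and is not taken from create_dataset.py, so it will likely need adapting to the csv layout this project expects.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_rotated_boxes(xml_text):
    """Return (label, cx, cy, w, h, angle) tuples from an annotation XML string."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        rbox = obj.find("robndbox")  # assumed element name for rotated boxes
        if rbox is None:
            continue  # skip objects without a rotated box
        values = [float(rbox.findtext(tag)) for tag in ("cx", "cy", "w", "h", "angle")]
        boxes.append((obj.findtext("name"), *values))
    return boxes


def boxes_to_csv(boxes):
    """Serialize boxes to csv text; the column order is a hypothetical choice."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "cx", "cy", "w", "h", "angle"])
    writer.writerows(boxes)
    return buffer.getvalue()
```

Reading the file contents with `open(path).read()` and passing them to `read_rotated_boxes` keeps the parsing step independent of where the annotations are stored.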
The following things are on the to-do list, ranked by priority. Contributions are welcome, but please read the contributing guidelines.
- Implement the Pyramid-Input strategy
- Improve tuning of hyper-parameters
- Add model definitions and trained weights for DRBox based on other base networks such as MobileNet, InceptionResNetV2, or DenseNet.
The project is licensed under the Apache License 2.0
- Paul Pontisso : https://github.com/ppontisso