Multi-Process Fusion: Visual Place Recognition Using Multiple Image Processing Methods
Published in IEEE Robotics and Automation Letters (RAL). Citation: S. Hausler, A. Jacobson and M. Milford, "Multi-Process Fusion: Visual Place Recognition Using Multiple Image Processing Methods," in IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1924-1931, April 2019. doi: 10.1109/LRA.2019.2898427
An open-access preprint is also available on arXiv: https://arxiv.org/pdf/1903.03305.pdf
Copyright: Stephen Hausler
ChangeLog:
Currently at Revision 1.1.
191218: Added an option to set the image resizing value for the CNN input. Set it to 227 by 227 for HybridNet and 224 by 224 for VGG-16. Also fixed an out-of-memory bug that occurred when running the previous version of this code in R2018b.
131218: Made a collection of changes in response to reviewer feedback. In particular, the worstID algorithm has changed.
190918: Fixed a bug that caused an error at the end of the dataset, immediately before the precision-recall curves are printed.
Requirements:
- MATLAB 2017 or later.
- MATLAB Neural Network Toolbox (named Deep Learning Toolbox from R2018b onwards) and Image Processing Toolbox.
- Computer with stand-alone graphics hardware.
- A downloaded CNN caffemodel and prototxt (not included in this repository).
Getting Started:
- Obtain the CNN model files. For HybridNet, permission must be obtained from the original author. Other networks will also work, such as VGG-16 trained on Places365.
- To begin, launch MATLAB and open Multi_Process_Fusion.m.
- Edit the adjustable settings for your particular dataset. The dataset must be supplied as a collection of individual images.
- Run the file. Note: the reference traverse will take several minutes with no intermediate feedback. Once the query traverse begins, a figure will display the recognition process.
Detailed instructions for testing on the St Lucia dataset:
- Download the caffemodel and prototxt for the CNN model you wish to use (HybridNet or VGG-16 trained on Places365 is recommended).
- The GPS .mat file for the St Lucia dataset is included in this repository.
- To download the St Lucia dataset, please go to https://wiki.qut.edu.au/display/cyphy/St+Lucia+Multiple+Times+of+Day and download "180809_1545" and "190809_0845".
- Extract individual frames from the downloaded videos, for example using Avconv on Ubuntu (https://libav.org/avconv.html).
- For "180809_1545", extract frames from the video at 15 FPS and keep only the first 4000 frames. Place these extracted images into a new folder containing just these images. This is the query dataset.
- For "190809_0845", extract frames from the video at 15 FPS and keep only the first 3945 frames. Place these extracted images into a new folder containing just these images. This is the reference dataset. The number of images is set such that there is only one query image per location and no double-ups. The code will still work with double-ups; however, performance will drop because the match-scoring algorithm will find these double-ups and assume that severe perceptual aliasing is present.
- Then edit "Multi_Process_Fusion.m" and set "Ref_folder" and "Query_folder" to the folders where you saved the reference and query dataset images. Also set "GT_file" to the location of the GPS .mat file.
- Set "datafile" and "protofile" to the locations of your caffemodel and prototxt files. Then set "actLayer" to the layer you wish to extract features from. The recommended values are 15 for HybridNet and 24 for VGG-16.
- Other settings can be left as-is, but you can experiment with them, for example the minimum and maximum sequence length, the quality rate-of-change threshold, and the Rwindow value. The chosen CNN layer can also be changed for further experimentation.
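The frame-extraction steps above (15 FPS, first 4000 query frames and first 3945 reference frames) can be sketched as a small shell script. This is a minimal sketch and not part of the repository: the function names, the output folder names, the PNG naming pattern, and the video file names (including the .avi extension) are assumptions, so substitute the actual paths of the downloaded videos. It relies on avconv's -r option to set the output frame rate and -vframes to cap the number of frames written.

```shell
#!/bin/sh
# Sketch only: extract frames at 15 FPS with avconv, capped at a frame count.
# Set AVCONV to use a different binary; ffmpeg accepts the same options here.

extract_frames() {
    # $1 = input video, $2 = output folder, $3 = maximum number of frames
    video=$1; outdir=$2; nframes=$3
    mkdir -p "$outdir" || return 1
    # -r 15 sets the output frame rate, -vframes limits the frames written
    "${AVCONV:-avconv}" -i "$video" -r 15 -vframes "$nframes" "$outdir/%05d.png"
}

count_frames() {
    # Print the number of PNG files directly inside folder $1
    find "$1" -maxdepth 1 -type f -name '*.png' | wc -l | tr -d ' '
}

# Example calls (hypothetical file and folder names; uncomment and adjust):
# extract_frames 180809_1545.avi query_images 4000
# extract_frames 190809_0845.avi reference_images 3945
# count_frames query_images
# count_frames reference_images
```

Running count_frames on each folder afterwards is a quick way to confirm that the query and reference folders hold the expected number of images before setting "Query_folder" and "Ref_folder".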
Acknowledgements:
MATLAB Libraries: MATLAB;
patchNormalizeHMM: Niko Sunderhauf copyright 2013;
sort_nat: Douglas M. Schwarz copyright 2008;
HybridNet (not included in this release): Zetao Chen 2017.