Demo code of "Guided Stereo Matching", Matteo Poggi, Davide Pallotti, Fabio Tosi and Stefano Mattoccia, CVPR 2019.
Copyright (c) 2019 University of Bologna. Patent pending. All rights reserved. Licensed under the CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode ).
NOTE: This code is for demonstration purposes. We do not plan to release training code.
[Paper] - [Poster] - [Youtube Video]
@inproceedings{Poggi_CVPR_2019,
title = {Guided Stereo Matching},
author = {Poggi, Matteo and
Pallotti, Davide and
Tosi, Fabio and
Mattoccia, Stefano},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2019}
}
Stereo is a prominent technique to infer dense depth maps from images, and deep learning further pushed forward the state-of-the-art, making end-to-end architectures unrivaled when enough data is available for training. However, deep networks suffer from significant drops in accuracy when dealing with new environments. Therefore, in this paper, we introduce Guided Stereo Matching, a novel paradigm that leverages a small number of sparse, yet reliable depth measurements retrieved from an external source to ameliorate this weakness. The additional sparse cues required by our method can be obtained with any strategy (e.g., a LiDAR) and used to enhance features linked to corresponding disparity hypotheses. Our formulation is general and fully differentiable, making it possible to exploit the additional sparse inputs in pre-trained deep stereo networks as well as when training a new instance from scratch. Extensive experiments on three standard datasets and two state-of-the-art deep architectures show that even with a small set of sparse input cues, i) the proposed paradigm enables significant improvements to pre-trained networks. Moreover, ii) training from scratch notably increases accuracy and robustness to domain shifts. Finally, iii) it is suited and effective even with traditional stereo algorithms such as SGM.
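The abstract says that hints are "used to enhance features linked to corresponding disparity hypotheses" but does not give the formula here. As a rough illustration only, the sketch below assumes a Gaussian-shaped modulation: at a pixel with hint g, the features of hypothesis d are multiplied by k * exp(-(d - g)^2 / (2 c^2)), while pixels without a hint are left unchanged. The function names, the per-pixel list representation, and the values of k and c are assumptions made for this example and are not taken from this repository.

```python
import math

def modulation_weights(hint, valid, max_disp, k=10.0, c=1.0):
    """Per-pixel weights over disparity hypotheses 0..max_disp-1.

    hint  : hinted disparity at this pixel (ignored when valid is False)
    valid : True if a sparse hint is available at this pixel
    k, c  : height and width of the Gaussian (illustrative values, assumed)
    """
    if not valid:
        # No hint: every hypothesis keeps its original features.
        return [1.0] * max_disp
    return [k * math.exp(-((d - hint) ** 2) / (2.0 * c ** 2))
            for d in range(max_disp)]

def apply_hint(features, hint, valid, k=10.0, c=1.0):
    """Multiply one pixel's per-disparity features by the modulation weights."""
    weights = modulation_weights(hint, valid, len(features), k, c)
    return [w * f for w, f in zip(weights, features)]

# One pixel, 8 disparity hypotheses, flat features, hint at disparity 3.
modulated = apply_hint([1.0] * 8, hint=3.0, valid=True)
best = max(range(len(modulated)), key=modulated.__getitem__)
```

With flat input features, the hypothesis closest to the hint is amplified by k and hypotheses far from it are dampened, so `best` ends up at the hinted disparity.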
PyTorch 0.4
(recommended) Python packages such as opencv, PIL, numpy
Download the KITTI demo sequence and pretrained models by running:
sh get_weights_and_data.sh
Launch the following command:
python run.py --datapath [sequence_path] \
--loadmodel [model_path] \
--output_dir [output_path] \
--guided \
--display \
--save \
--verbose
Optional arguments:
--guided: enables guided stereo
--display: shows results on screen
--save: saves results in output_dir
--verbose: prints single stereo pair stats
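For readers who want to see how such a command line maps onto Python, here is a minimal argparse sketch that mirrors the flags listed above. It is not the parser from run.py: the help strings, the choice of which arguments are required, and the default value of --output_dir are assumptions for illustration.

```python
import argparse

def build_parser():
    """Illustrative parser mirroring the documented flags (not the actual run.py)."""
    parser = argparse.ArgumentParser(description="Guided Stereo Matching demo (sketch)")
    parser.add_argument("--datapath", required=True, help="path to the stereo sequence")
    parser.add_argument("--loadmodel", required=True, help="path to the pretrained model")
    # The default below is an assumption; the README does not state one.
    parser.add_argument("--output_dir", default="output", help="where --save writes results")
    parser.add_argument("--guided", action="store_true", help="enable guided stereo")
    parser.add_argument("--display", action="store_true", help="show results on screen")
    parser.add_argument("--save", action="store_true", help="save results in output_dir")
    parser.add_argument("--verbose", action="store_true", help="print per-pair stats")
    return parser

# Parse an explicit list so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["--datapath", "seq", "--loadmodel", "model.tar", "--guided", "--save"]
)
```

Boolean switches use `action="store_true"`, so any flag left off the command line simply evaluates to False.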
Results on the provided sequence 2011_09_26_0011:
Model | bad2-All (%) | bad2-Nog (%) | MAE-All | MAE-Nog | Density (%) |
---|---|---|---|---|---|
PSMnet-ft | 1.71 | 1.73 | 0.72 | 0.72 | - |
PSMnet-ft-gd | 1.13 | 1.15 | 0.60 | 0.61 | 3.68 |
PSMnet-ft-gd-tr | 0.67 | 0.67 | 0.47 | 0.47 | 3.68 |
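The column names are not defined in this README. The sketch below shows one plausible way such numbers could be computed, under these assumptions: bad2 is the percentage of pixels with valid ground truth whose absolute disparity error exceeds 2 pixels, MAE is the mean absolute error, "All" covers every valid pixel, "Nog" covers only valid pixels that received no hint, and Density is the percentage of image pixels carrying a hint. The function name and the convention that 0 marks a missing value are hypothetical.

```python
def evaluate(pred, gt, hints=None, threshold=2.0):
    """Compute bad-threshold rate, MAE and hint density on flat disparity lists.

    A value of 0 in gt or hints marks a missing measurement (assumed convention).
    """
    if hints is None:
        hints = [0.0] * len(gt)
    errs_all, errs_nog = [], []
    for p, g, h in zip(pred, gt, hints):
        if g <= 0:
            continue  # no ground truth at this pixel
        err = abs(p - g)
        errs_all.append(err)
        if h <= 0:
            errs_nog.append(err)  # pixel without guidance

    def summarize(errs):
        if not errs:
            return float("nan"), float("nan")
        bad = 100.0 * sum(e > threshold for e in errs) / len(errs)
        mae = sum(errs) / len(errs)
        return bad, mae

    bad_all, mae_all = summarize(errs_all)
    bad_nog, mae_nog = summarize(errs_nog)
    density = 100.0 * sum(h > 0 for h in hints) / len(hints)
    return {"bad_all": bad_all, "bad_nog": bad_nog,
            "mae_all": mae_all, "mae_nog": mae_nog, "density": density}

# Four pixels: the last has no ground truth, the first carries a hint.
stats = evaluate(pred=[10.0, 20.5, 33.0, 7.0],
                 gt=[10.0, 20.0, 30.0, 0.0],
                 hints=[10.0, 0.0, 0.0, 0.0])
```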
Qualitative results on Middlebury v3, sampling 5% of hints from ground truth. From left to right: reference input image (a), disparity map by PSMNet (b), disparity map by PSMNet-gd-tr (c), and ground truth (d).
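Sampling hints from ground truth can be simulated along the following lines. This is a minimal sketch, assuming that the percentage refers to pixels with valid ground truth, that sampling is uniform at random, and that 0 marks a pixel without a value; the function name and the seed handling are hypothetical and do not come from this repository.

```python
import random

def sample_hints(gt, fraction=0.05, seed=0):
    """Keep a random fraction of valid ground-truth disparities as hints.

    gt is a flat list of disparities where 0 means "no ground truth";
    pixels that are not selected get 0 in the returned list.
    """
    valid_idx = [i for i, g in enumerate(gt) if g > 0]
    n_keep = round(fraction * len(valid_idx))
    rng = random.Random(seed)  # local generator, so results are reproducible
    keep = set(rng.sample(valid_idx, n_keep))
    return [g if i in keep else 0.0 for i, g in enumerate(gt)]

# Synthetic ground truth with 1000 valid pixels.
gt = [float(1 + i % 50) for i in range(1000)]
hints = sample_hints(gt, fraction=0.05)
n_hints = sum(h > 0 for h in hints)
```

Every retained hint equals the ground-truth value at that pixel, which models an ideal sensor; noise could be added on top if a more realistic source is of interest.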
m [dot] poggi [at] unibo [dot] it
Thanks to Jia-Ren Chang for sharing the original implementation of PSMNet: https://github.com/JiaRenChang/PSMNet