Collection of my personal summaries of Computer Vision and Deep Learning papers. The summaries are based solely on the original papers and capture just the key insights for my personal archive. I try to summarize at least one paper per week and organize the publications on each topic in chronological order.
Point clouds are among the most widespread data representations in 3D Computer Vision. Generated by LiDAR sensors or RGB-D scanners, they provide highly accurate depth information. However, because the points live in a continuous space, the standard convolutions known from the image domain cannot be applied directly. Consequently, a variety of methods have been proposed that rely on voxelization, Bird's-Eye-View (BEV) projection, or direct processing of the continuous point cloud. Publications in Section 1 focus on 3D and BEV object detection from point clouds, with an emphasis on autonomous driving applications. Section 2 presents PointNet architectures that process the continuous point cloud directly using MLPs.
- PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection (CVPR 2020)
- PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud (CVPR 2019)
- VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection (CVPR 2018)
- Frustum PointNets for 3D Object Detection from RGB-D Data (CVPR 2018)
- PIXOR: Real-time 3D Object Detection from Point Clouds (CVPR 2018)
- Joint 3D Proposal Generation and Object Detection from View Aggregation (IROS 2018)
- Multi-View 3D Object Detection Network for Autonomous Driving (CVPR 2017)
- Deep Hough Voting for 3D Object Detection in Point Clouds (ICCV 2019)
- PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space (NIPS 2017)
- PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation (CVPR 2017)
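The core trick behind the PointNet family listed above is to make the network invariant to the ordering of the input points: a shared MLP is applied to every point independently, and a symmetric function (max-pooling) aggregates the per-point features into one global descriptor. A minimal numpy sketch of that idea, with a made-up single-layer MLP standing in for the real architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for PointNet's shared per-point MLP: the same weights are
# applied to every point, then a symmetric max-pool aggregates them, so
# the global feature is invariant to the order of the input points.
W1 = rng.standard_normal((3, 16))  # shared weights (illustrative values)
b1 = rng.standard_normal(16)

def pointnet_global_feature(points):
    """points: (N, 3) array -> (16,) order-invariant global feature."""
    per_point = np.maximum(points @ W1 + b1, 0.0)  # shared MLP + ReLU
    return per_point.max(axis=0)                   # symmetric max-pooling

cloud = rng.standard_normal((128, 3))
shuffled = cloud[rng.permutation(128)]

# Permuting the points leaves the global feature unchanged.
assert np.allclose(pointnet_global_feature(cloud),
                   pointnet_global_feature(shuffled))
```

The real PointNet stacks several such shared layers and adds learned input/feature transforms, but the permutation invariance comes entirely from the max-pooling step shown here.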
The area of depth estimation is concerned with extracting depth information from mono or stereo camera images. Since all 3D Computer Vision tasks rely on accurate depth information, algorithms for depth estimation from image data are highly relevant for closing the performance gap between image-based methods and approaches that leverage 3D sensors such as LiDAR or RGB-D scanners. Publications mainly focus on depth estimation in the context of 3D object detection for autonomous driving.
- Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving (ICLR 2020)
- Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving (CVPR 2019)
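The Pseudo-LiDAR papers above rest on a simple geometric step: a predicted depth map is back-projected into a 3D point cloud using the pinhole camera model, after which any LiDAR-based detector can be applied. A minimal sketch of that back-projection, where the intrinsics (fx, fy, cx, cy) are made-up example values:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a (H, W) depth map (meters) into (H*W, 3) camera-frame
    points via the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

depth = np.full((4, 4), 10.0)  # toy depth map: flat wall 10 m away
pts = depth_to_point_cloud(depth, fx=700.0, fy=700.0, cx=2.0, cy=2.0)
# The pixel at the principal point projects straight ahead to (0, 0, 10).
```

Pseudo-LiDAR++ additionally refines the depth estimate (e.g. with a few sparse LiDAR returns) before this conversion, but the representation change itself is just this projection.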
Being able to reliably assess the uncertainty of a prediction made by a deep learning model is highly important for use in safety-critical scenarios. The field of uncertainty estimation focuses on extracting probability distributions rather than single point estimates by combining deep learning with the probabilistic Bayesian framework. Publications include Bayesian Neural Networks as well as approximation strategies and calibration methods that ensure interpretable neural network outputs.
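One widely used approximation strategy of the kind mentioned above is Monte Carlo dropout: dropout is kept active at test time, and the spread over repeated stochastic forward passes is read as predictive uncertainty. A hedged numpy sketch with a made-up two-layer network (the weights and dropout rate are illustrative, not from any of the papers):

```python
import numpy as np

rng = np.random.default_rng(42)

# Tiny illustrative regression network; weights are random for the sketch.
W1, b1 = rng.standard_normal((1, 32)), np.zeros(32)
W2, b2 = rng.standard_normal((32, 1)), np.zeros(1)

def forward(x, p_drop=0.5):
    """One stochastic forward pass with dropout kept ON at test time."""
    h = np.maximum(x @ W1 + b1, 0.0)        # hidden layer + ReLU
    mask = rng.random(h.shape) > p_drop     # random dropout mask per pass
    h = h * mask / (1.0 - p_drop)           # inverted-dropout scaling
    return (h @ W2 + b2).item()

x = np.array([[0.3]])
samples = np.array([forward(x) for _ in range(100)])
mean, std = samples.mean(), samples.std()   # prediction and its uncertainty
```

Each call to `forward` samples a different subnetwork, so `mean` plays the role of the point estimate and `std` of the predictive uncertainty that a full Bayesian Neural Network would provide in closed form.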