This repository contains the implementation of an end-to-end perception architecture for autonomous vehicles. Perception for autonomous vehicles aims to extract semantic representations from multiple sensors and fuse them into a single "bird's-eye-view" coordinate frame for consumption by motion planning algorithms. The proposed architecture directly extracts a bird's-eye-view representation of a scene from the images of an arbitrary number of cameras.
The core idea is to "lift" each image individually into a frustum of features for each camera and then "splat" all the frustums into a rasterized bird's-eye-view grid. By training on the entire camera rig, the model learns how to represent images and fuse predictions from all cameras into a single cohesive representation of the scene, even in the presence of calibration errors.
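The lift and splat steps described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the code in this repository: the function names (`lift`, `frustum_points`, `splat`), the tensor shapes, the pinhole intrinsics `K`, the `cam_to_ego` extrinsics, and the grid extents are all hypothetical, and random arrays stand in for the learned image features and depth distributions.

```python
import numpy as np

def lift(features, depth_probs):
    """Lift image features into a frustum of features.

    features:    (C, H, W) per-pixel feature vectors.
    depth_probs: (D, H, W) per-pixel distribution over D depth bins.
    Returns (D, H, W, C): each pixel's feature weighted by each depth bin.
    """
    return depth_probs[..., None] * np.moveaxis(features, 0, -1)[None]

def frustum_points(depths, H, W, K, cam_to_ego):
    """Ego-frame 3D location of every (depth bin, pixel) cell: (D, H, W, 3)."""
    us, vs = np.meshgrid(np.arange(W), np.arange(H))           # (H, W) each
    pix = np.stack([us, vs, np.ones_like(us)], axis=-1).astype(float)
    rays = pix @ np.linalg.inv(K).T                            # rays with z = 1
    pts_cam = depths[:, None, None, None] * rays[None]         # scale by depth
    R, t = cam_to_ego[:3, :3], cam_to_ego[:3, 3]
    return pts_cam @ R.T + t

def splat(frustum_feats, points, x_range, y_range, cell):
    """Sum-pool frustum features into a rasterized BEV grid: (C, nx, ny)."""
    C = frustum_feats.shape[-1]
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    feats = frustum_feats.reshape(-1, C)
    pts = points.reshape(-1, 3)
    ix = np.floor((pts[:, 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((pts[:, 1] - y_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)       # drop off-grid cells
    bev = np.zeros((nx, ny, C))
    np.add.at(bev, (ix[keep], iy[keep]), feats[keep])          # unbuffered scatter-add
    return np.moveaxis(bev, -1, 0)

# Toy example: two cameras (sharing one pose for brevity) fused into one grid.
rng = np.random.default_rng(0)
C, D, H, W = 4, 8, 6, 10
K = np.array([[8.0, 0.0, W / 2], [0.0, 8.0, H / 2], [0.0, 0.0, 1.0]])
cam_to_ego = np.eye(4)
# Camera axes (x right, y down, z forward) to ego axes (x forward, y left, z up).
cam_to_ego[:3, :3] = [[0, 0, 1], [-1, 0, 0], [0, -1, 0]]
depths = np.linspace(2.0, 16.0, D)

bev = np.zeros((C, 40, 40))
for _ in range(2):
    features = rng.normal(size=(C, H, W))      # stand-in for learned features
    logits = rng.normal(size=(D, H, W))        # stand-in for predicted depth logits
    depth_probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    frustum = lift(features, depth_probs)
    points = frustum_points(depths, H, W, K, cam_to_ego)
    bev += splat(frustum, points, (-20.0, 20.0), (-20.0, 20.0), 1.0)

print(bev.shape)
```

Because every camera is splatted into the same ego-frame grid, adding a camera only adds another term to the sum, which is one way to read "an arbitrary number of cameras". Sum pooling is used here for simplicity.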
Additionally, the representations inferred by the model enable interpretable end-to-end motion planning by "shooting" template trajectories into a bird's-eye-view cost map output by the network.
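To illustrate "shooting", the sketch below scores a small set of template trajectories against a bird's-eye-view cost map and picks the cheapest one. It is an assumption-laden example, not the planner in this repository: the `shoot` function, the grid layout, the hand-made cost map, and the rule of selecting the template with the lowest summed cost are all hypothetical choices made for clarity.

```python
import numpy as np

def shoot(cost_map, templates, x_range, y_range, cell):
    """Score template trajectories against a BEV cost map.

    cost_map:  (nx, ny) cost per BEV cell.
    templates: (K, T, 2) xy waypoints in metres, ego frame.
    Returns (best_index, costs), where costs[k] is the summed cost of template k.
    """
    nx, ny = cost_map.shape
    ix = np.floor((templates[..., 0] - x_range[0]) / cell).astype(int)
    iy = np.floor((templates[..., 1] - y_range[0]) / cell).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    looked_up = cost_map[np.clip(ix, 0, nx - 1), np.clip(iy, 0, ny - 1)]
    # Assumed policy: waypoints that leave the grid are charged the maximum cost.
    per_point = np.where(inside, looked_up, cost_map.max())
    costs = per_point.sum(axis=1)
    return int(np.argmin(costs)), costs

# Toy cost map: x in [0, 40) m ahead, y in [-20, 20) m, 1 m cells.
cost_map = np.zeros((40, 40))
cost_map[10:15, 14:22] = 1.0   # obstacle 10-15 m ahead, spanning y in [-6, 2)

# Three templates: straight, veer left (+y), veer right (-y).
x = np.arange(1.0, 31.0)
templates = np.stack([np.stack([x, s * x], axis=-1) for s in (0.0, 0.3, -0.3)])

best, costs = shoot(cost_map, templates, (0.0, 40.0), (-20.0, 20.0), 1.0)
print(best, costs)
```

Because each candidate's cost is a sum over specific cells of the map, the chosen trajectory can be explained by inspecting which cells contributed to the rejected ones. That is the sense in which this style of planning is interpretable.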
- End-to-end architecture for perception in autonomous vehicles.
- Extraction of bird's-eye-view representation from image data.
- Fusion of semantic representations from multiple camera sensors.
- Robustness to calibration error.
- Outperforms baselines and prior work in object segmentation and map segmentation tasks.
- Interpretable end-to-end motion planning using the inferred representations.
- Benchmarking against lidar-based models.
- Clone this repository:

      git clone https://github.com/ayushgoel24/birdseye-view-segmentation.git

- Install the required dependencies:

      pip install -r requirements.txt
TODO: Update the usage section
[Images: Environment 1 | Environment 2]
This project is licensed under the MIT License.
For inquiries or questions, please contact ayush.goel2427@gmail.com.
Happy coding!