Note: This project was generated with the assistance of artificial intelligence.
This ROS2 project implements monocular depth estimation using the ZoeDepth model. It provides real-time depth estimation from single RGB images.
- Install dependencies:

```bash
rosdep install -i --from-path src --rosdistro humble -y --ignore-src
```

- Build the workspace:

```bash
cd ~/ros2_ws
colcon build --symlink-install
source install/setup.bash
```
- Start the webcam publisher node:

```bash
ros2 run zoedepth webcam_publisher --ros-args -p device_id:=0
```

- In another terminal, start the depth estimator node:

```bash
ros2 run zoedepth depth_estimator --ros-args -p compiler_backend:='aot_eager'
```
Both nodes support various parameters that can be set via the command line:
Webcam Publisher Parameters:
- `device_id` (default: 0): Webcam device ID or path
- `target_width` (default: 256): Target width for resizing
- `target_height` (default: 256): Target height for resizing
- `force_square_crop` (default: false): Force square output by cropping to the shortest dimension before resizing
- `publish_rate` (default: 15.0): Publishing rate in Hz
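The `force_square_crop` geometry (center-crop to the shortest dimension, then resize) can be sketched as a pure function. The helper name below is an assumption for illustration, not taken from the package source:

```python
def square_crop_box(width: int, height: int) -> tuple[int, int, int]:
    """Return (x_offset, y_offset, side) for a centered square crop
    to the shortest dimension, as force_square_crop describes."""
    side = min(width, height)
    return ((width - side) // 2, (height - side) // 2, side)
```

For a 640x480 webcam frame this yields a centered 480x480 crop starting at x=80, y=0, which is then resized to `target_width` x `target_height`.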
Depth Estimator Parameters:
- `model_repo` (default: 'isl-org/ZoeDepth'): Model repository
- `model_type` (default: 'NK'): Model type (N, K, or NK)
- `normalize_depth` (default: false): Whether to normalize depth output to the 0-255 range
- `colorize_output` (default: false): Whether to colorize the depth map using the magma colormap
- `measure_latency` (default: false): Whether to measure and log processing latency
- `use_compiler` (default: true): Whether to use PyTorch's compiler
- `compiler_backend` (default: 'inductor'): Compiler backend to use. Options:
  - 'inductor': Default PyTorch 2.0 compiler
  - 'eager': Traditional PyTorch eager execution
  - 'aot_eager': Ahead-of-time compilation with eager execution
  - 'tensorrt': TensorRT acceleration (requires the torch-tensorrt package)
Note: To use the TensorRT backend, you must first install the torch-tensorrt package:

```bash
pip3 install torch-tensorrt
```
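The README does not specify what happens when 'tensorrt' is requested but torch-tensorrt is not installed. One plausible way the node could validate its `compiler_backend` parameter and fall back gracefully is sketched below; the function and constant names are assumptions, not the package's actual API:

```python
# Backends documented above; 'tensorrt' needs the optional torch-tensorrt package.
SUPPORTED_BACKENDS = {'inductor', 'eager', 'aot_eager', 'tensorrt'}

def resolve_backend(requested: str, tensorrt_available: bool = False) -> str:
    """Return a usable torch.compile backend, falling back to 'eager'
    when the optional TensorRT dependency is missing."""
    if requested not in SUPPORTED_BACKENDS:
        raise ValueError(f"unknown compiler backend: {requested!r}")
    if requested == 'tensorrt' and not tensorrt_available:
        # Degrade gracefully instead of crashing the node at startup.
        return 'eager'
    return requested
```

The resolved name would then be passed to `torch.compile(model, backend=...)` when `use_compiler` is true.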
Performance measurements were conducted on an NVIDIA RTX 4070 GPU. Initial testing shows:
- Average latency of ~120 ms per frame at 8 Hz with no compiler backend
- The Inductor and TensorRT backends showed similar latency in initial tests, i.e. no measurable acceleration yet
- Testing on NVIDIA Jetson Orin platforms is planned
These numbers are preliminary and may vary based on your specific hardware configuration and input resolution.
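A per-frame average like the ~120 ms figure above can be collected with a small running-average helper, roughly what the `measure_latency` option implies. The class name and structure are hypothetical; the node's actual logging may differ:

```python
import time

class LatencyMeter:
    """Running-average latency tracker (hypothetical sketch)."""

    def __init__(self) -> None:
        self.total_s = 0.0
        self.count = 0

    def record(self, seconds: float) -> None:
        self.total_s += seconds
        self.count += 1

    @property
    def average_ms(self) -> float:
        return 1000.0 * self.total_s / self.count if self.count else 0.0

# Typical use around the model call:
#   t0 = time.perf_counter()
#   depth = model.infer(frame)
#   meter.record(time.perf_counter() - t0)
```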
Topics:
- `/image_raw` (sensor_msgs/Image): Raw RGB images from the webcam
- `/depth/image_raw` (sensor_msgs/Image): Estimated depth maps
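When `normalize_depth` is true, the metric depth map is rescaled to the 0-255 range before publishing. A pure-Python stand-in for that array math (the helper name is an assumption; in the node this would be vectorized with NumPy over the full depth image):

```python
def normalize_to_8bit(depth, lo=None, hi=None):
    """Linearly rescale depth values to 0-255, using the frame's own
    min/max unless explicit bounds are given."""
    lo = min(depth) if lo is None else lo
    hi = max(depth) if hi is None else hi
    span = (hi - lo) or 1.0  # avoid division by zero on flat frames
    return [round(255 * (d - lo) / span) for d in depth]
```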
This project is licensed under the MIT License - see the LICENSE file for details.