# DNN Inference Nodes for ROS/ROS2
This package contains DNN inference nodes and camera/video streaming nodes for ROS/ROS2 with support for NVIDIA **[Jetson Nano / TX1 / TX2 / Xavier / Orin](https://developer.nvidia.com/embedded-computing)** devices and TensorRT.

The nodes use the image recognition, object detection, and semantic segmentation DNNs from the [`jetson-inference`](https://github.com/dusty-nv/jetson-inference) library and NVIDIA [Hello AI World](https://github.com/dusty-nv/jetson-inference#hello-ai-world) tutorial, which come with several built-in pretrained networks for classification, detection, and segmentation, as well as the ability to load customized user-trained models.

The camera & video streaming nodes support the following [input/output interfaces](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md) (example stream URIs are shown after the list):

* MIPI CSI cameras
* V4L2 cameras
* RTP / RTSP streams
* WebRTC streams
* Videos & Images
* Image sequences
* OpenGL windows
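
For reference, some example stream URIs in the format these nodes accept are shown below (illustrative values only -- see the aux-streaming documentation linked above for the full list of protocols and options):

```bash
input:=csi://0                             # MIPI CSI camera (sensor 0)
input:=v4l2:///dev/video0                  # V4L2 camera at /dev/video0
input:=rtp://@:1234                        # receive an RTP stream on port 1234
input:=rtsp://192.168.1.2:8554/my_stream   # subscribe to an RTSP feed (example address)
input:=file://my_video.mp4                 # video file (images & image sequences work similarly)
output:=display://0                        # OpenGL window on display 0
```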

Various distributions of ROS are supported, either from source or through containers (including Melodic, Noetic, Foxy, Galactic, Humble, and Iron).  The same branch supports both ROS1 and ROS2.

### Table of Contents

* [Installation](#installation)
* [Testing](#testing)
	* [Video Viewer](#video-viewer)
	* [imagenet Node](#imagenet-node)
	* [detectnet Node](#detectnet-node)
	* [segnet Node](#segnet-node)
* [Topics & Parameters](#topics--parameters)
	* [imagenet Node](#imagenet-node-1)
	* [detectnet Node](#detectnet-node-1)
	* [segnet Node](#segnet-node-1) 
	* [video_source Node](#video_source-node)
	* [video_output Node](#video_output-node)

## Installation

The easiest way to get up and running is by cloning [jetson-inference](https://github.com/dusty-nv/jetson-inference) (which ros_deep_learning is a submodule of) and running the pre-built container, which automatically mounts the required model directories and devices:

``` bash
$ git clone --recursive --depth=1 https://github.com/dusty-nv/jetson-inference
$ cd jetson-inference
$ docker/run.sh --ros=humble  # noetic, foxy, galactic, humble, iron
```

> **note**: the ros_deep_learning nodes rely on data from the jetson-inference tree for storing models, so clone and mount `jetson-inference/data` if you're using your own container or source installation method.
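
If you're using your own container instead, a minimal sketch of such a mount might look like the following (the image name `my-ros-image` is a placeholder and the in-container path is an assumption -- adapt both to your setup):

```bash
# clone the tree on the host so the models have somewhere persistent to live
$ git clone --recursive --depth=1 https://github.com/dusty-nv/jetson-inference

# hypothetical: mount the data directory into your own ROS container
$ docker run -it --rm --runtime nvidia --network host \
      -v $(pwd)/jetson-inference/data:/jetson-inference/data \
      my-ros-image:latest
```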

The `--ros` argument to the [`docker/run.sh`](https://github.com/dusty-nv/jetson-inference/blob/master/docker/run.sh) script selects the ROS distro to use.  Under the hood, it runs the `ros:$ROS_DISTRO-pytorch` container images from [jetson-containers](https://github.com/dusty-nv/jetson-containers), which include jetson-inference and ros_deep_learning.

For instructions on building the ros_deep_learning package against an uncontainerized ROS installation, expand the section below (the parts about installing ROS may require adapting for the particular version of ROS/ROS2 that you want to install):

<details>
<summary>Legacy Install Instructions</summary>

### jetson-inference

These ROS nodes use the DNN objects from the [`jetson-inference`](https://github.com/dusty-nv/jetson-inference) project (aka Hello AI World).  To build and install jetson-inference, see [this page](https://github.com/dusty-nv/jetson-inference/blob/master/docs/building-repo-2.md) or run the commands below:

```bash
$ cd ~
$ sudo apt-get install git cmake
$ git clone --recursive --depth=1 https://github.com/dusty-nv/jetson-inference
$ cd jetson-inference
$ mkdir build
$ cd build
$ cmake ../
$ make -j$(nproc)
$ sudo make install
$ sudo ldconfig
```
Before proceeding, it's worthwhile to test that `jetson-inference` is working properly on your system by following this step of the Hello AI World tutorial:
* [Classifying Images with ImageNet](https://github.com/dusty-nv/jetson-inference/blob/master/docs/imagenet-console-2.md)

### ROS/ROS2

Install the `ros-melodic-ros-base` or `ros-eloquent-ros-base` package on your Jetson following these directions:

* ROS Melodic - [ROS Install Instructions](http://wiki.ros.org/melodic/Installation/Ubuntu)
* ROS2 Eloquent - [ROS2 Install Instructions](https://index.ros.org/doc/ros2/Installation/Eloquent/Linux-Install-Debians/)

Depending on which version of ROS you're using, install some additional dependencies and create a workspace:

#### ROS Melodic
```bash
$ sudo apt-get install ros-melodic-image-transport ros-melodic-vision-msgs
```

For ROS Melodic, create a Catkin workspace (`~/ros_workspace`) using these steps:  
http://wiki.ros.org/ROS/Tutorials/InstallingandConfiguringROSEnvironment#Create_a_ROS_Workspace

#### ROS Eloquent
```bash
$ sudo apt-get install ros-eloquent-vision-msgs \
                       ros-eloquent-launch-xml \
                       ros-eloquent-launch-yaml \
                       python3-colcon-common-extensions
```

For ROS Eloquent, create a workspace (`~/ros_workspace`) to use:

```bash
$ mkdir -p ~/ros_workspace/src
```

### ros_deep_learning

Next, navigate into your ROS workspace's `src` directory and clone `ros_deep_learning`:

```bash
$ cd ~/ros_workspace/src
$ git clone https://github.com/dusty-nv/ros_deep_learning
```

Then build it - if you are using ROS Melodic, use `catkin_make`.  If you are using ROS2 Eloquent, use `colcon build`:

```bash
$ cd ~/ros_workspace/

# ROS Melodic
$ catkin_make
$ source devel/setup.bash 

# ROS2 Eloquent
$ colcon build
$ source install/local_setup.bash 
```

The nodes should now be built and ready to use.  Remember to source the overlay as shown above so that ROS can find the nodes.
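
As a quick sanity check, you can confirm that ROS can locate the package once the overlay is sourced:

```bash
# ROS Melodic
$ rospack find ros_deep_learning

# ROS2 Eloquent
$ ros2 pkg prefix ros_deep_learning
```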

</details>

## Testing

Before proceeding, if you're using ROS Melodic, make sure that `roscore` is running first:

```bash
$ roscore
```

If you're using ROS2, running the core service is no longer required.

### Video Viewer

First, it's recommended to test that you can stream a video feed using the [`video_source`](#video_source-node) and [`video_output`](#video_output-node) nodes.  See [Camera Streaming & Multimedia](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md) for valid input/output streams, and substitute your desired `input` and `output` arguments below.  For example, you can use video files for the input or output, use V4L2 cameras instead of MIPI CSI cameras, or stream RTP/RTSP over the network.

```bash
# ROS
$ roslaunch ros_deep_learning video_viewer.ros1.launch input:=csi://0 output:=display://0

# ROS2
$ ros2 launch ros_deep_learning video_viewer.ros2.launch input:=csi://0 output:=display://0
```
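
Once the viewer is running, you can verify from another terminal that frames are flowing by measuring the publish rate of the image topic (the topic name below assumes the default `video_source` node name from the launch files):

```bash
# ROS
$ rostopic hz /video_source/raw

# ROS2
$ ros2 topic hz /video_source/raw
```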

### imagenet Node

You can launch a classification demo with the following commands - substitute your desired camera or video path for the `input` argument below (see [here](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md) for valid input/output streams).

Note that the `imagenet` node also publishes classification results on the `imagenet/classification` topic in a [`vision_msgs/Classification2D`](http://docs.ros.org/melodic/api/vision_msgs/html/msg/Classification2D.html) message -- see the [Topics & Parameters](#imagenet-node-1) section below for more info.

```bash
# ROS
$ roslaunch ros_deep_learning imagenet.ros1.launch input:=csi://0 output:=display://0

# ROS2
$ ros2 launch ros_deep_learning imagenet.ros2.launch input:=csi://0 output:=display://0
```
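
While the demo is running, you can inspect the classification messages from another terminal:

```bash
# ROS
$ rostopic echo /imagenet/classification

# ROS2
$ ros2 topic echo /imagenet/classification
```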

### detectnet Node

To launch an object detection demo, substitute your desired camera or video path for the `input` argument below (see [here](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md) for valid input/output streams).  Note that the `detectnet` node also publishes the metadata in a `vision_msgs/Detection2DArray` message -- see the [Topics & Parameters](#detectnet-node-1) section below for more info.

```bash
# ROS
$ roslaunch ros_deep_learning detectnet.ros1.launch input:=csi://0 output:=display://0

# ROS2
$ ros2 launch ros_deep_learning detectnet.ros2.launch input:=csi://0 output:=display://0
```
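
Likewise, the detection messages can be inspected from another terminal (assuming the default `detectnet` node name from the launch files):

```bash
# ROS
$ rostopic echo /detectnet/detections

# ROS2
$ ros2 topic echo /detectnet/detections
```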

### segnet Node

To launch a semantic segmentation demo, substitute your desired camera or video path for the `input` argument below (see [here](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md) for valid input/output streams).  Note that the `segnet` node also publishes raw segmentation results to the `segnet/class_mask` topic -- see the [Topics & Parameters](#segnet-node-1) section below for more info.

```bash
# ROS
$ roslaunch ros_deep_learning segnet.ros1.launch input:=csi://0 output:=display://0

# ROS2
$ ros2 launch ros_deep_learning segnet.ros2.launch input:=csi://0 output:=display://0
```
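
Since the class mask is an image message, suppress the raw pixel arrays when echoing it so only the header and dimensions are printed:

```bash
# ROS
$ rostopic echo --noarr /segnet/class_mask

# ROS2
$ ros2 topic echo --no-arr /segnet/class_mask
```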

## Topics & Parameters

Below are the message topics and parameters that each node implements.
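
The `image_in` topics are ordinary ROS image topics, so you can feed the DNN nodes from any existing camera driver via remapping instead of using the bundled `video_source` node.  A hypothetical sketch (this assumes the package installs a `detectnet` executable and that your camera publishes on `/camera/image_raw`):

```bash
# ROS
$ rosrun ros_deep_learning detectnet image_in:=/camera/image_raw

# ROS2
$ ros2 run ros_deep_learning detectnet --ros-args -r image_in:=/camera/image_raw
```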

### imagenet Node

| Topic Name     |   I/O  | Message Type                                                                                                 | Description                                           |
|----------------|:------:|--------------------------------------------------------------------------------------------------------------|-------------------------------------------------------|
| image_in       |  Input | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html)                       | Raw input image                                       |
| classification | Output | [`vision_msgs/Classification2D`](http://docs.ros.org/melodic/api/vision_msgs/html/msg/Classification2D.html) | Classification results (class ID + confidence)        |
| vision_info    | Output | [`vision_msgs/VisionInfo`](http://docs.ros.org/melodic/api/vision_msgs/html/msg/VisionInfo.html)             | Vision metadata (class labels parameter list name)         |
| overlay        | Output | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html)                       | Input image overlaid with the classification results  |

| Parameter Name    |       Type       |    Default    | Description                                                                                                        |
|-------------------|:----------------:|:-------------:|--------------------------------------------------------------------------------------------------------------------|
| model_name        |     `string`     | `"googlenet"` | Built-in model name (see [here](https://github.com/dusty-nv/jetson-inference#image-recognition) for valid values)  |
| model_path        |     `string`     |      `""`     | Path to custom caffe or ONNX model                                                                                 |
| prototxt_path     |     `string`     |      `""`     | Path to custom caffe prototxt file                                                                                 |
| input_blob        |     `string`     |    `"data"`   | Name of DNN input layer                                                                                            |
| output_blob       |     `string`     |    `"prob"`   | Name of DNN output layer                                                                                           |
| class_labels_path |     `string`     |      `""`     | Path to custom class labels file                                                                                   |
| class_labels_HASH | `vector<string>` |  class names  | List of class labels, where HASH is model-specific (actual name of parameter is found via the `vision_info` topic) |
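For example, a hypothetical launch that swaps in a different built-in classifier, assuming the launch file forwards `model_name` as an argument (the `roslaunch` form with the `.ros1.launch` file is analogous):

```bash
$ ros2 launch ros_deep_learning imagenet.ros2.launch \
      model_name:=resnet-18 input:=csi://0 output:=display://0
```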

### detectnet Node

| Topic Name  |   I/O  | Message Type                                                                                                 | Description                                                |
|-------------|:------:|--------------------------------------------------------------------------------------------------------------|------------------------------------------------------------|
| image_in    |  Input | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html)                       | Raw input image                                            |
| detections  | Output | [`vision_msgs/Detection2DArray`](http://docs.ros.org/melodic/api/vision_msgs/html/msg/Detection2DArray.html) | Detection results (bounding boxes, class IDs, confidences) |
| vision_info | Output | [`vision_msgs/VisionInfo`](http://docs.ros.org/melodic/api/vision_msgs/html/msg/VisionInfo.html)             | Vision metadata (class labels parameter list name)         |
| overlay     | Output | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html)                       | Input image overlaid with the detection results            |

| Parameter Name    |       Type       |        Default       | Description                                                                                                        |
|-------------------|:----------------:|:--------------------:|--------------------------------------------------------------------------------------------------------------------|
| model_name        |     `string`     | `"ssd-mobilenet-v2"` | Built-in model name (see [here](https://github.com/dusty-nv/jetson-inference#object-detection) for valid values)   |
| model_path        |     `string`     |         `""`         | Path to custom caffe or ONNX model                                                                                 |
| prototxt_path     |     `string`     |         `""`         | Path to custom caffe prototxt file                                                                                 |
| input_blob        |     `string`     |       `"data"`       | Name of DNN input layer                                                                                            |
| output_cvg        |     `string`     |     `"coverage"`     | Name of DNN output layer (coverage/scores)                                                                         |
| output_bbox       |     `string`     |      `"bboxes"`      | Name of DNN output layer (bounding boxes)                                                                          |
| class_labels_path |     `string`     |         `""`         | Path to custom class labels file                                                                                   |
| class_labels_HASH | `vector<string>` |      class names     | List of class labels, where HASH is model-specific (actual name of parameter is found via the `vision_info` topic) |
| overlay_flags     |     `string`     |  `"box,labels,conf"` | Flags used to generate the overlay (some combination of `none,box,labels,conf`)                                    |
| mean_pixel_value  |      `float`     |          0.0         | Mean pixel subtraction value to be applied to input (normally 0)                                                   |
| threshold         |      `float`     |          0.5         | Minimum confidence value for positive detections (0.0 - 1.0)                                                       |
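For example, a hypothetical launch that raises the confidence threshold and trims the overlay, assuming these parameters are exposed as launch arguments:

```bash
$ ros2 launch ros_deep_learning detectnet.ros2.launch \
      input:=csi://0 output:=display://0 \
      threshold:=0.8 overlay_flags:=box,labels
```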

### segnet Node

| Topic Name  |   I/O  | Message Type                                                                                     | Description                                              |
|-------------|:------:|--------------------------------------------------------------------------------------------------|----------------------------------------------------------|
| image_in    |  Input | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html)           | Raw input image                                          |
| vision_info | Output | [`vision_msgs/VisionInfo`](http://docs.ros.org/melodic/api/vision_msgs/html/msg/VisionInfo.html) | Vision metadata (class labels parameter list name)       |
| overlay     | Output | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html)           | Input image overlaid with the segmentation results        |
| color_mask  | Output | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html)           | Colorized segmentation class mask                          |
| class_mask  | Output | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html)           | 8-bit single-channel image where each pixel is a class ID  |

| Parameter Name    |       Type       |                Default               | Description                                                                                                           |
|-------------------|:----------------:|:------------------------------------:|-----------------------------------------------------------------------------------------------------------------------|
| model_name        |     `string`     | `"fcn-resnet18-cityscapes-1024x512"` | Built-in model name (see [here](https://github.com/dusty-nv/jetson-inference#semantic-segmentation) for valid values) |
| model_path        |     `string`     |                 `""`                 | Path to custom caffe or ONNX model                                                                                    |
| prototxt_path     |     `string`     |                 `""`                 | Path to custom caffe prototxt file                                                                                    |
| input_blob        |     `string`     |               `"data"`               | Name of DNN input layer                                                                                               |
| output_blob       |     `string`     |        `"score_fr_21classes"`        | Name of DNN output layer                                                                                              |
| class_colors_path |     `string`     |                 `""`                 | Path to custom class colors file                                                                                      |
| class_labels_path |     `string`     |                 `""`                 | Path to custom class labels file                                                                                      |
| class_labels_HASH | `vector<string>` |              class names             | List of class labels, where HASH is model-specific (actual name of parameter is found via the `vision_info` topic)    |
| mask_filter       |     `string`     |              `"linear"`              | Filtering to apply to color_mask topic (`linear` or `point`)                                                          |
| overlay_filter    |     `string`     |              `"linear"`              | Filtering to apply to overlay topic (`linear` or `point`)                                                             |
| overlay_alpha     |      `float`     |                `180.0`               | Alpha blending value used by overlay topic (0.0 - 255.0)                                                              |
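As a sketch, loading a custom ONNX segmentation model instead of a built-in one could look like the following (the blob names `input_0`/`output_0` are typical of PyTorch-exported ONNX models but depend on your network, and this assumes the launch file forwards these parameters):

```bash
$ ros2 launch ros_deep_learning segnet.ros2.launch \
      input:=csi://0 output:=display://0 \
      model_path:=/path/to/my_model.onnx \
      class_labels_path:=/path/to/my_labels.txt \
      input_blob:=input_0 output_blob:=output_0
```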

### video_source Node

| Topic Name |   I/O  | Message Type                                                                           | Description             |
|------------|:------:|----------------------------------------------------------------------------------------|-------------------------|
| raw        | Output | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html) | Raw output image (BGR8) |

| Parameter      |   Type   |   Default   | Description                                                                                                                                                               |
|----------------|:--------:|:-----------:|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| resource       | `string` | `"csi://0"` | Input stream URI (see [here](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md#input-streams) for valid protocols)                           |
| codec          | `string` |     `""`    | Manually specify codec for compressed streams (see [here](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md#input-options) for valid values) |
| width          |   `int`  |      0      | Manually specify desired width of stream (0 = stream default)                                                                                                             |
| height         |   `int`  |      0      | Manually specify desired height of stream (0 = stream default)                                                                                                            |
| framerate      |   `int`  |      0      | Manually specify desired framerate of stream (0 = stream default)                                                                                                         |
| loop           |   `int`  |      0      | For video files:  `0` = don't loop, `>0` = # of loops, `-1` = loop forever                                                                                                |
| flip           | `string` |    `""`     | Set the flip method for MIPI CSI cameras (see [here](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md#input-options) for valid values)      |
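
For example, a hypothetical launch that requests a specific mode from a V4L2 camera (the launch-argument names may differ from the node parameter names above -- check the launch file you're using):

```bash
$ ros2 launch ros_deep_learning video_viewer.ros2.launch \
      input:=v4l2:///dev/video0 input_width:=1280 input_height:=720 \
      output:=display://0
```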

### video_output Node

| Topic Name |  I/O  | Message Type                                                                           | Description     |
|------------|:-----:|----------------------------------------------------------------------------------------|-----------------|
| image_in   | Input | [`sensor_msgs/Image`](http://docs.ros.org/melodic/api/sensor_msgs/html/msg/Image.html) | Raw input image |

| Parameter      |   Type   |     Default     | Description                                                                                                                                                   |
|----------------|:--------:|:---------------:|---------------------------------------------------------------------------------------------------------------------------------------------------------------|
| resource       | `string` | `"display://0"` | Output stream URI (see [here](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md#output-streams) for valid protocols)             |
| codec          | `string` |     `"h264"`    | Codec used for compressed streams (see [here](https://github.com/dusty-nv/jetson-inference/blob/master/docs/aux-streaming.md#input-options) for valid values) |
| bitrate        |   `int`  |     4000000     | Target VBR bitrate of encoded streams (in bits per second)                                                                                                    |
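
For example, a hypothetical launch that records the camera feed to a compressed video file instead of a display (see the output-streams documentation linked above for other sinks such as RTP or RTSP):

```bash
$ ros2 launch ros_deep_learning video_viewer.ros2.launch \
      input:=csi://0 output:=file://my_video.mp4
```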