The channel refers to communicator.channel, where the different types of communication channels are implemented. To add a new one, extend the BaseChannel interface.
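As a rough illustration, a new channel could look like the sketch below. The module path and method names are assumptions made for this example, not the repo's actual BaseChannel API:

```python
# Hypothetical sketch only: the real BaseChannel in communicator.channel
# may live at a different path and define different abstract methods.
from communicator.channel.base_channel import BaseChannel  # assumed import path


class SharedMemoryChannel(BaseChannel):
    """Example of a new communication channel added by extending BaseChannel."""

    def connect(self, endpoint):
        # open the transport to the inference server (assumed method)
        ...

    def send(self, request):
        # deliver a serialized inference request (assumed method)
        ...

    def receive(self):
        # return the raw inference response (assumed method)
        ...
```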
The triton_clients refers to triton_clients, where the clients are implemented using the available preprocess and postprocess modules; clients depend on both preprocess and postprocess.
The inference module takes a channel and a triton_clients object and performs inference on the Triton-served model over the given channel.
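Conceptually, the three pieces compose as in the sketch below; all module and class names here are placeholders for illustration and may not match the repo exactly:

```python
# Conceptual wiring only: constructors and names are assumed,
# check the repo for the actual classes.
from communicator.channel import grpc_channel   # assumed module
from triton_clients import yolov5_client        # assumed module

channel = grpc_channel.GRPCChannel("localhost:8001")  # assumed gRPC channel type
client = yolov5_client.Yolov5Client()                 # bundles preprocess/postprocess
# The inference step then drives the loop:
#   preprocess the input -> send the request over the channel -> postprocess the response
```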
This repo is a ROS-gRPC interface for remote inference with an inference server, e.g. the Triton Inference Server. To run this client for inference, follow these steps:
To learn how to set up a custom model repository, see the Triton documentation. Once the model repository is set up, you can run the Triton server with Docker. We tested this setup on Triton server version 21.08; you can pull the Docker image from the NGC catalog and start the server as follows:
docker pull nvcr.io/nvidia/tritonserver:22.04-py3
docker run --gpus 1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/full/path/to/model_repository:/models nvcr.io/nvidia/tritonserver:22.04-py3 tritonserver --model-repository=/models --allow-metrics 1
You can use the --gpus all option to allocate all available GPUs to the Triton server. Once the Triton server is up, it lists the models that are loaded and ready; these model names are what you pass as arguments to the client. For example, in the following image, two models named FCOS_detectron and YOLOv5nCROP are ready at the server. We can pass these model names as arguments to main.py for inference.
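If you want to verify the server before starting the client, Triton's standard HTTP endpoints (exposed on port 8000 with the command above) can be queried; the model name below is just an example from this setup:

```
# Check that the server is live and ready
curl -v localhost:8000/v2/health/ready

# Check that a specific model is loaded and ready
curl -v localhost:8000/v2/models/YOLOv5nCROP/ready
```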
Before you send sensor data to the Triton server, you need to set the following in the ./data/client_parameter.yaml file (a complete example file is sketched after this list):
- the ROS topic you want to subscribe to for sensor input, e.g. sub_topic: '/camera/color/image_raw'
- the ROS topic you want to publish the inference results on, e.g. pub_topic: '/camera/color/detection'
- the IP address of the Triton server, or localhost if running locally, e.g. grpc_channel: '10.249.3.13:8001'
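Putting these keys together, a minimal ./data/client_parameter.yaml could look like the following; the values are examples only, and any additional keys the client may expect are not shown:

```yaml
# Example values only; adjust the topics and server address to your setup.
sub_topic: '/camera/color/image_raw'
pub_topic: '/camera/color/detection'
grpc_channel: '10.249.3.13:8001'
```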
Select the model you want to use for inference and run the following command:
python3 main.py -m YOLOv5nCROP
Now your sensor input is fed to the Triton server as gRPC messages, and the resulting inference image is published as a ROS topic, which you can visualize in the ROS ecosystem with rviz.
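The loop behind this is roughly the following, assuming ROS 1 (rospy); the client.infer() call is an assumed API used only to illustrate the flow, not the repo's exact interface:

```python
# Rough sketch of the subscribe -> infer -> publish loop; the actual
# implementation in main.py may differ.
import rospy
from sensor_msgs.msg import Image


def run(client, sub_topic, pub_topic):
    rospy.init_node('triton_grpc_client')
    pub = rospy.Publisher(pub_topic, Image, queue_size=1)

    def on_image(msg):
        result = client.infer(msg)  # gRPC request to the Triton server (assumed call)
        pub.publish(result)         # annotated image published back into ROS

    rospy.Subscriber(sub_topic, Image, on_image)
    rospy.spin()
```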
To visualize the Triton metrics on a Grafana dashboard, you first need to download and run Prometheus (download link; tested with version 2.35.0). Update its configuration .yaml file to add the Triton backend as a scrape target:
- job_name: 'triton_backend'
  static_configs:
    - targets: ['localhost:8002']
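A minimal prometheus.yml containing just this job could look like the following; the scrape interval is an example value:

```yaml
# Minimal example; the repo also provides a fuller configuration file (see below).
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'triton_backend'
    static_configs:
      - targets: ['localhost:8002']
```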
For reference, a full configuration file is provided here as well: configuration. Now run Prometheus as follows:
./prometheus --config.file=prometheus.yml
You can check if it is receiving the metrics from Triton at http://localhost:9090.
To run the Grafana dashboard, you can use its Docker container. Simply run:
docker run --net=host grafana/grafana:latest
Go to the URL http://localhost:3000 and log in with username admin and password admin. For the first-time setup, you need to add the Prometheus backend as a data source; use the following link for that.
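If you prefer to configure the data source from a file rather than the UI, Grafana's data source provisioning format looks roughly like this; the file path and values are examples for this local setup:

```yaml
# Example provisioning file, e.g. /etc/grafana/provisioning/datasources/prometheus.yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
```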
- Restructured the repo to make it easier to add clients for new models (big thanks to @Saurabh Kumar)
- Added example FCOS model from detectron
- Added an evaluation script based on ROS vision messages.
- Added example configuration files for different triton models.
- Dockerize the whole client repo.
- Refactor the evaluation script.
- Variable gRPC inference message size according to the size of the image.
- Ensemble mode with multiple models.
- Add a MinIO-based model repository (optional).
This repo is heavily based on the work of the following repositories and is only meant as a demo of the ROS-gRPC-Triton ecosystem. If you are interested in their work, give them a thumbs up here: