
onnx example not working with triton #224

Closed
Naegionn opened this issue Jun 29, 2022 · 4 comments · Fixed by #275
@Naegionn

/kind bug

What steps did you take and what happened:
I closely followed https://github.com/kserve/kserve/blob/release-0.8/docs/samples/v1beta1/onnx/README.md
to test KServe with the ONNX model provided (storageUri: "gs://kfserving-examples/onnx/style"), as I want to use KServe with .onnx models.

The problem here is that the style model provided is a single .onnx file, which I suppose was what the old ONNX runtime server needed.
Now that the ONNX runtime has been replaced by Triton, this example no longer works, because Triton will not load a model from a single .onnx file (/mnt/models/model.onnx).
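
For context, the InferenceService applied from that README is essentially the following (a sketch reconstructed from the sample; the exact metadata.name there may differ):

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "style-sample"
spec:
  predictor:
    onnx:
      storageUri: "gs://kfserving-examples/onnx/style"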

Log from the Triton server in the kserve-container:


+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0629 14:04:33.356075 1 server.cc:546]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0629 14:04:33.356082 1 server.cc:589]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I0629 14:04:33.356195 1 tritonserver.cc:1836]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.14.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /mnt/models                                                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| rate_limit                       | OFF                                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Triton requires the .onnx file to be arranged in a model repository layout like this:

my_model/
  -- config.pbtxt
  -- 1/
       -- model.onnx

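Since the log above shows strict_model_config = 1, Triton will not auto-complete the model configuration from the .onnx graph, so the config.pbtxt has to be written by hand. A minimal sketch for this layout (the tensor names and shapes below are placeholders, not taken from the sample model; read the real ones from the graph, e.g. with Netron):

name: "my_model"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "input1"          # placeholder; use the graph's actual input name
    data_type: TYPE_FP32
    dims: [ 1, 3, 224, 224 ]
  }
]
output [
  {
    name: "output1"         # placeholder; use the graph's actual output name
    data_type: TYPE_FP32
    dims: [ 1, 3, 224, 224 ]
  }
]
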
After creating my own Triton-compatible model repository, Triton loaded it correctly:

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "my_model"
spec:
  predictor:
    minReplicas: 0
    onnx:
      storageUri: "http://192.168.4.252:8000/my_model.tar.gz"
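
The tarball is just the repository layout above, archived. A sketch of how it can be put together (assuming the sample object is the single model.onnx seen at /mnt/models/model.onnx in the log, and that config.pbtxt is the file sketched earlier):

# fetch the single-file sample model
gsutil cp gs://kfserving-examples/onnx/style/model.onnx .

# arrange it into the repository layout Triton expects
mkdir -p my_model/1
mv model.onnx my_model/1/model.onnx
cp config.pbtxt my_model/

# archive the repository for the storageUri above
tar -czvf my_model.tar.gz my_model/

With this in place, Triton loaded the model:
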
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0629 13:48:31.691292 1 server.cc:546]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0629 13:48:31.691309 1 server.cc:589]
+------------+---------+--------+
| Model      | Version | Status |
+------------+---------+--------+
| my_model   | 1       | READY  |
+------------+---------+--------+

I0629 13:48:31.691780 1 tritonserver.cc:1836]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.14.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /mnt/models                                                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| rate_limit                       | OFF                                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

So I assume that the example needs to be updated.

Environment:

  • Istio Version:
  • Knative Version:
  • KFServing Version: 0.8
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Kubernetes version: (use kubectl version): 1.22 (microk8s)
  • OS (e.g. from /etc/os-release):
@Naegionn changed the title from "onnx example not wotking with triton" to "onnx example not working with triton" on Jun 29, 2022
@yuzisun (Member) commented Jun 30, 2022

@Naegionn thanks for trying this out! Yes, the example should be updated. Are you interested in contributing the onnx example?

@Naegionn (Author)

Sure, if I find the time to do that next week.

@yuzisun added area/docs and removed kind/bug labels on Jul 3, 2022
@tama-biro

Hi, wondering if there is an update on this. I'm getting 400 errors when trying to run the ONNX example. Thanks!

@Naegionn (Author) commented Aug 2, 2022

Sorry, I did not have time to update the example.

@yuzisun transferred this issue from kserve/kserve on Feb 4, 2023
@yuzisun moved this to In Progress in KServe 0.11 on Jul 2, 2023
github-project-automation bot moved this from In Progress to Done in KServe 0.11 on Jan 27, 2024