ClearML Serving with Triton-GPU Renaming model.onnx to model.graphdef/model.bin #84

InsertNamePls opened this issue Mar 17, 2025 · 1 comment


@InsertNamePls

I'm encountering an issue when deploying a GPT-2 ONNX model using ClearML Serving with Triton. The deployment process renames my model.onnx file to model.graphdef, causing Triton to fail when loading the model since it's expecting model.onnx.

The error prevents Triton from starting properly, causing the Triton container to continuously restart.

Expected behavior:
The model file should retain its original name (model.onnx) when copied to the Triton model repository.
Triton should then be able to find and load the ONNX model without any filename mismatch.

Actual behavior:
ClearML Serving renames model.onnx to model.graphdef before copying it to the Triton model repository (/models/gpt2_onnx/1/).
Triton then fails to locate the expected model.onnx, resulting in an error and continuous container restarts.

E0317 13:21:35.385474 45 model_lifecycle.cc:596] failed to load 'gpt2_onnx' version 1: Internal: failed to stat file /models/gpt2_onnx/1/model.onnx

But earlier, ClearML Serving logs show:

copy model into /models/gpt2_onnx/1/model.graphdef
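
Putting the two log lines together: Triton's ONNX Runtime backend looks for model.onnx by default, so the repository layout it expects (directory names taken from the logs above; the config.pbtxt comment shows the standard Triton platform value, not something copied from this deployment) is roughly:

/models/gpt2_onnx/
├── config.pbtxt        # platform: "onnxruntime_onnx"
└── 1/
    └── model.onnx      # default file name for the ONNX Runtime backend

while the copy step actually produces /models/gpt2_onnx/1/model.graphdef, hence the "failed to stat file" error.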

Checked ClearML Serving Model Upload Process

Used the following command to upload the model with the correct name:

clearml-serving --id 12e416036c4b4cd38b9fd3a46c85a583 model upload \
    --name "GPT2_ONNX" --project "GPT2-Serving" \
    --path ~/gpt/triton_models/gpt2_onnx/1/model.onnx

✅ Successfully uploaded model.onnx, as confirmed in the ClearML UI.
❌ However, during deployment, ClearML renamed it to model.graphdef.

Tried forcing the file name through --aux-config when adding the Triton endpoint:

clearml-serving --id 12e416036c4b4cd38b9fd3a46c85a583 model add \
    --engine triton --endpoint "gpt2_onnx" \
    --model-id 75159e2de62142fb9958e416807e3d1a \
    --preprocess preprocess.py \
    --aux-config platform="onnxruntime_onnx" max_batch_size=8 default_model_filename="model.onnx"

This was rejected with:

ERROR: You have default_model_filename in your config pbtxt, please remove it. It will be added automatically by the system.

Also tried:
- Uploading the entire model directory as a ClearML dataset.
- Debugging the Triton container directly, but it restarts too fast to inspect (see the note below).
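
Note on the last point: a crash-looping container's logs can still be read with docker logs, which works on stopped and restarting containers as well. The container name below is an assumption based on the default clearml-serving docker-compose and may differ in your setup:

docker logs --tail 200 clearml-serving-triton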

Any help on this issue would be appreciated!

@IlyaMescheryakov1402
Contributor

Hi @InsertNamePls

It looks like the Triton engine detects the framework of your model as TensorFlow or Keras (https://github.com/clearml/clearml-serving/blob/main/clearml_serving/engines/triton/triton_helper.py#L172), which is why the file ends up copied under the TensorFlow default name (model.graphdef) instead of model.onnx.
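
A simplified illustration of that convention (not the actual clearml-serving code; the real logic is at the linked line, and the names below are only for illustration):

# Illustrative sketch: the default Triton weight file name depends on the
# framework recorded on the ClearML model entry.
FRAMEWORK_TO_FILENAME = {
    "TensorFlow": "model.graphdef",  # what the model is currently detected as
    "Keras": "model.graphdef",
    "PyTorch": "model.pt",
    "TensorRT": "model.plan",
    "ONNX": "model.onnx",            # what is wanted here
}

def target_filename(framework: str) -> str:
    # Fall back to a generic name when the framework is not recognized
    return FRAMEWORK_TO_FILENAME.get(framework, "model.bin")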

Try using something like

import clearml

# Registering the weights with an explicit framework='ONNX' should make
# clearml-serving keep the .onnx file name in the Triton model repository.
out_model = clearml.OutputModel(
    task=task,            # the ClearML task the model is attached to
    name=<NAME>,
    framework='ONNX'
)
out_model.update_weights_package(...)
clearml.OutputModel.wait_for_uploads()

according to the OutputModel reference: https://clear.ml/docs/latest/docs/references/sdk/model_outputmodel#class-outputmodel

or something like:

clearml-serving --id 12e416036c4b4cd38b9fd3a46c85a583 model upload \
    --name "GPT2_ONNX" --project "GPT2-Serving" \
    --path ~/gpt/triton_models/gpt2_onnx/1/model.onnx \
    --framework onnx

according to the clearml-serving CLI reference: https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving_cli#upload
