ClearML Serving with Triton-GPU Renaming model.onnx to model.graphdef/model.bin #84

InsertNamePls opened this issue Mar 17, 2025 · 1 comment


@InsertNamePls

I'm encountering an issue when deploying a GPT-2 ONNX model using ClearML Serving with Triton. The deployment process renames my model.onnx file to model.graphdef, causing Triton to fail when loading the model since it's expecting model.onnx.

The error prevents Triton from starting properly, causing the Triton container to continuously restart.

Expected behavior:
The model file should retain its original name (model.onnx) when copied to the Triton model repository.
Triton should then be able to find and load the ONNX model without any filename mismatch.

Actual behavior:
ClearML Serving renames model.onnx to model.graphdef before copying it to the Triton model repository (/models/gpt2_onnx/1/).
Triton then fails to locate the expected model.onnx, resulting in an error and continuous container restarts.

E0317 13:21:35.385474 45 model_lifecycle.cc:596] failed to load 'gpt2_onnx' version 1: Internal: failed to stat file /models/gpt2_onnx/1/model.onnx

But earlier, ClearML Serving logs show:

copy model into /models/gpt2_onnx/1/model.graphdef
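
Putting the two log lines together: Triton's ONNX Runtime backend looks for model.onnx by default, so the repository layout it expects (directory names taken from the logs above; the config.pbtxt comment shows the standard Triton platform value, not something copied from this deployment) is roughly:

/models/gpt2_onnx/
├── config.pbtxt        # platform: "onnxruntime_onnx"
└── 1/
    └── model.onnx      # default file name for the ONNX Runtime backend

while the copy step actually produces /models/gpt2_onnx/1/model.graphdef, hence the "failed to stat file" error.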

Checked ClearML Serving Model Upload Process

Used the following command to upload the model with the correct name:

clearml-serving --id 12e416036c4b4cd38b9fd3a46c85a583 model upload \
    --name "GPT2_ONNX" --project "GPT2-Serving" \
    --path ~/gpt/triton_models/gpt2_onnx/1/model.onnx

✅ Successfully uploaded model.onnx, as confirmed in the ClearML UI.
❌ However, during deployment, ClearML renamed it to model.graphdef.

Tried forcing the file name through --aux-config when adding the Triton endpoint:

clearml-serving --id 12e416036c4b4cd38b9fd3a46c85a583 model add \
    --engine triton --endpoint "gpt2_onnx" \
    --model-id 75159e2de62142fb9958e416807e3d1a \
    --preprocess preprocess.py \
    --aux-config platform="onnxruntime_onnx" max_batch_size=8 default_model_filename="model.onnx"

This was rejected with:

ERROR: You have default_model_filename in your config pbtxt, please remove it. It will be added automatically by the system.

Also tried:
- Uploading the entire model directory as a ClearML dataset.
- Debugging the Triton container directly, but it restarts too fast to inspect (see the note below).
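
Note on the last point: a crash-looping container's logs can still be read with docker logs, which works on stopped and restarting containers as well. The container name below is an assumption based on the default clearml-serving docker-compose and may differ in your setup:

docker logs --tail 200 clearml-serving-triton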

Any help on this issue would be appreciated!

@IlyaMescheryakov1402
Contributor

Hi @InsertNamePls

It looks like the Triton engine detects the framework of your model as TensorFlow or Keras (https://github.com/clearml/clearml-serving/blob/main/clearml_serving/engines/triton/triton_helper.py#L172), which is why the file ends up copied under the TensorFlow default name (model.graphdef) instead of model.onnx.
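
A simplified illustration of that convention (not the actual clearml-serving code; the real logic is at the linked line, and the names below are only for illustration):

# Illustrative sketch: the default Triton weight file name depends on the
# framework recorded on the ClearML model entry.
FRAMEWORK_TO_FILENAME = {
    "TensorFlow": "model.graphdef",  # what the model is currently detected as
    "Keras": "model.graphdef",
    "PyTorch": "model.pt",
    "TensorRT": "model.plan",
    "ONNX": "model.onnx",            # what is wanted here
}

def target_filename(framework: str) -> str:
    # Fall back to a generic name when the framework is not recognized
    return FRAMEWORK_TO_FILENAME.get(framework, "model.bin")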

Try using something like

import clearml

# Registering the weights with an explicit framework='ONNX' should make
# clearml-serving keep the .onnx file name in the Triton model repository.
out_model = clearml.OutputModel(
    task=task,            # the ClearML task the model is attached to
    name=<NAME>,
    framework='ONNX'
)
out_model.update_weights_package(...)
clearml.OutputModel.wait_for_uploads()

according to the OutputModel reference: https://clear.ml/docs/latest/docs/references/sdk/model_outputmodel#class-outputmodel

or something like:

clearml-serving --id 12e416036c4b4cd38b9fd3a46c85a583 model upload \
    --name "GPT2_ONNX" --project "GPT2-Serving" \
    --path ~/gpt/triton_models/gpt2_onnx/1/model.onnx \
    --framework onnx

according to the clearml-serving CLI reference: https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving_cli#upload
