[Bug] FP16 conversion yields an unusable model #17447

Open
eddieevt-DXC opened this issue Sep 7, 2023 · 3 comments
Labels
ep:CUDA (issues related to the CUDA execution provider), ep:TensorRT (issues related to the TensorRT execution provider)

Comments

@eddieevt-DXC

eddieevt-DXC commented Sep 7, 2023

Describe the issue

I'm working with a model from SageMaker (ResNet50 640x640, input shape [1, -1, -1, 3]) converted to ONNX. When trying to get more performance out of it by converting it to FP16, the conversion succeeds, but trying to run the model gives this error:

E0907 08:27:25.823138 1379 model_lifecycle.cc:626] failed to load 'sagemaker' version 1: Internal: onnx runtime error 1: 
Load model from /models/sagemaker/1/model.onnx failed:Node (StatefulPartitionedCall/map/while_loop) Op (Loop) [TypeInferenceError] 
Graph attribute inferencing failed: Node (Resize__59) Op (Resize) [ShapeInferenceError] 
Either `sizes` or `scales` must be provided, but not both of them

Trying mixed precision instead fails during shape inference:

Traceback (most recent call last):
  File "/workspace/fp-16-onnx-converter.py", line 15, in <module>
    model_fp16 = auto_mixed_precision.auto_convert_mixed_precision(model, input_feed, rtol=0.01, atol=0.001, keep_io_types=True)
  File "/usr/local/lib/python3.10/dist-packages/onnxconverter_common/auto_mixed_precision.py", line 80, in auto_convert_mixed_precision
    if not run_attempt(node_names):
  File "/usr/local/lib/python3.10/dist-packages/onnxconverter_common/auto_mixed_precision.py", line 72, in run_attempt
    res1 = get_tensor_values_using_ort(model, feed_dict)
  File "/usr/local/lib/python3.10/dist-packages/onnxconverter_common/auto_mixed_precision.py", line 132, in get_tensor_values_using_ort
    sess = ort.InferenceSession(model.SerializeToString(), sess_options, providers=['CUDAExecutionProvider'])
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 383, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 426, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (StatefulPartitionedCall/map/while_loop) Op (Loop) [TypeInferenceError] Graph attribute inferencing failed: Node (Resize__59) Op (Resize) [ShapeInferenceError] Either `sizes` or `scales` must be provided, but not both of them

The latest shape inference script from GitHub gives the same error. I'm not sure where to post this issue, as multiple parts of the ONNX stack seem to be involved and not working.
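For reference, this is roughly how I invoked shape inference. I used the symbolic shape inference helper that ships with onnxruntime (the standalone symbolic_shape_infer.py script from the repo takes the same arguments), so adjust if you use a different script:

# Rough sketch of the shape inference invocation (assumes the
# onnxruntime.tools.symbolic_shape_infer module is available in the
# installed onnxruntime package).
import onnx
from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference

model = onnx.load("./model_fp16.onnx")
# auto_merge resolves conflicting symbolic dims instead of failing on them
inferred = SymbolicShapeInference.infer_shapes(model, auto_merge=True)
onnx.save(inferred, "./model_fp16_shaped.onnx")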

Linking my onnxconverter-common issue here - #266.

To reproduce

FP16:

import onnx
from onnxconverter_common import float16

# Load the FP32 ONNX model and convert all float tensors to FP16
model = onnx.load("./model.onnx")
model_fp16 = float16.convert_float_to_float16(model)
# Validate and save the converted model
onnx.checker.check_model(model_fp16)
onnx.save(model_fp16, "./model_fp16.onnx")

or mixed precision:

from onnxconverter_common import auto_mixed_precision
import onnx
import numpy as np

# Dummy UINT8 image input matching the model's [1, -1, -1, 3] input signature
input_feed = {"input_tensor": np.random.randint(0, 255, size=(1, 230, 150, 3), dtype=np.uint8)}

# Convert as much of the model as possible to FP16 while keeping the outputs
# within rtol/atol of the FP32 results
model = onnx.load("./model.onnx")
model_fp16 = auto_mixed_precision.auto_convert_mixed_precision(model, input_feed, rtol=0.01, atol=0.001, keep_io_types=True)
onnx.save(model_fp16, "./model_mixed.onnx")

Yes, the model inputs are UINT8. I don't know why, but that breaks TensorRT acceleration and conversion too. Considering the difference is mainly that FP32 is used for normalized data while UINT8 is raw pixel data, there shouldn't be that big of a difference. I can understand TRT not working, but the CUDA provider should work and it doesn't.
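One thing I might try next is keeping the offending ops in FP32 during conversion. A rough sketch, assuming convert_float_to_float16 supports the op_block_list and keep_io_types arguments as documented in onnxconverter-common, and assuming Resize/Loop/NonMaxSuppression are the ops that misbehave:

import onnx
from onnxconverter_common import float16

model = onnx.load("./model.onnx")
# Keep ops that tend to break under FP16 in FP32 (assumption: these are the
# culprits here), and leave the UINT8/FP32 graph inputs/outputs untouched.
model_fp16 = float16.convert_float_to_float16(
    model,
    keep_io_types=True,
    op_block_list=["Resize", "Loop", "NonMaxSuppression"],
)
onnx.checker.check_model(model_fp16)
onnx.save(model_fp16, "./model_fp16_blocklist.onnx")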

Urgency

Urgent.

A whole pipeline is built around training these models and deploying them, with all the supporting applications, for a client.
The model is slower than necessary though, and these optimizations would have helped. Model choice is also limited, as the training pipeline is in SageMaker.

Platform

Linux

OS Version

Ubuntu 20.04 LTS

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.15.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

11.8

@skottmckay
Contributor

Can you share the original model? It's not the CUDA execution provider that is failing, it's the ONNX validation of the model during loading.

Graph attribute inferencing failed: Node (Resize__59) Op (Resize) [ShapeInferenceError]
Either sizes or scales must be provided, but not both of them

The Resize spec says the node can provide either the sizes or the scales optional input, but not both, so the model is invalid as it does not conform to the ONNX spec. https://github.com/onnx/onnx/blob/main/docs/Operators.md#Resize
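If you want to narrow it down before sharing the model, something along these lines should list Resize nodes (including ones inside the Loop body) that have both optional inputs populated - just a rough sketch using the onnx python API:

import onnx

def find_bad_resize(graph, path=""):
    # Walk the graph and any subgraphs (e.g. the Loop body) looking for Resize
    # nodes that provide both 'scales' (input 3) and 'sizes' (input 4);
    # an empty input name means the optional input is not provided (opset 11+).
    for node in graph.node:
        if node.op_type == "Resize":
            scales = node.input[2] if len(node.input) > 2 else ""
            sizes = node.input[3] if len(node.input) > 3 else ""
            if scales and sizes:
                print(f"{path}{node.name}: both scales='{scales}' and sizes='{sizes}' are set")
        for attr in node.attribute:
            if attr.type == onnx.AttributeProto.GRAPH:
                find_bad_resize(attr.g, path + node.name + "/")
            elif attr.type == onnx.AttributeProto.GRAPHS:
                for subgraph in attr.graphs:
                    find_bad_resize(subgraph, path + node.name + "/")

model = onnx.load("./model_fp16.onnx")
find_bad_resize(model.graph)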

That makes the next question whether the original model was valid or not. What happens if you try to load the original model in onnxruntime? You don't need to run it - just create an InferenceSession with it and see if that succeeds.

@eddieevt-DXC
Author

eddieevt-DXC commented Sep 7, 2023

Sure, the model is SSD ResNet152 V1 FPN 640x640 (RetinaNet152). SageMaker works with TF Zoo models.

The original model converted to ONNX, with UINT8/FP32 inputs/outputs, works. Loading it in a session is fine, and I've run inference with it too. The FP16-converted one gives the same error when creating the session.

import onnxruntime

session = onnxruntime.InferenceSession('./model_fp16.onnx', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])

This returns nothing for the full-precision model and the following for the FP16 one:

Traceback (most recent call last):
  File "onnx-runtime-test.py", line 3, in <module>
    session = onnxruntime.InferenceSession('./model_fp16.onnx', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
  File "/home/rootlab/triton_learning/yolo_v8/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 383, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/rootlab/triton_learning/yolo_v8/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 424, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from ./model_fp16.onnx failed:Node (StatefulPartitionedCall/map/while_loop) Op (Loop) [TypeInferenceError] Graph attribute inferencing failed: Node (Resize__59) Op (Resize) [ShapeInferenceError] Either `sizes` or `scales` must be provided, but not both of them

@skottmckay
Contributor

Given the original model works, but the converted one is invalid, it appears the issue is with the converter creating an invalid model rather than ONNX Runtime. As such, your onnxconverter-common issue would be the place to follow up.
