
INVALID_ARGUMENT: Non-zero status code returned while running Conv node. Invalid input shape: {0,0} after successful conversion #4071

Closed
zeleniyslonik opened this issue May 28, 2020 · 3 comments


zeleniyslonik commented May 28, 2020

Describe the bug
This bug came up when trying to convert the mask_refine part of the SiamMask network from https://github.com/STVIR/pysot (PyTorch -> ONNX).
I needed to split the network into two parts and convert the mask_refine part into a separate ONNX file. However, during inference I get the following error:

2020-05-28 12:19:08.281828359 [E:onnxruntime:, sequential_executor.cc:183 Execute] Non-zero status code returned while running Conv node. Name:'Conv_149' Status Message: Invalid input shape: {0,0}
Traceback (most recent call last):
  File "/home/zeleniyslonik/PycharmProjects/onnx_siammask_tracker_python/track_on_video_onnx.py", line 230, in <module>
    processor.Execute()
  File "/home/zeleniyslonik/PycharmProjects/onnx_siammask_tracker_python/track_on_video_onnx.py", line 103, in Execute
    outputs = self.tracker.track(roiFrame)
  File "/home/zeleniyslonik/PycharmProjects/onnx_siammask_tracker_python/tracker/siammask_tracker.py", line 325, in track
    mask = self.sess_mask.run(None, {self.input_name_mask_1: xf_refine_1,
  File "/usr/lib/python3.8/site-packages/onnxruntime/capi/session.py", line 142, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Conv node. Name:'Conv_149' Status Message: Invalid input shape: {0,0}

The layer name and shape in the status message can vary. I believe this error is linked to the TracerWarnings I get during conversion (see the minimal sketch after the warnings):

/home/zeleniyslonik/.local/lib/python3.8/site-packages/torch/tensor.py:464: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
warnings.warn('Iterating over a tensor might cause the trace to be incorrect. '

/home/zeleniyslonik/PycharmProjects/pysot_tracker_standalone/pysot/models/head/mask.py:77: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  p0 = F.pad(f[0], [16, 16, 16, 16])[:, :, 4*pos[0]:4*pos[0]+61, 4*pos[1]:4*pos[1]+61]
/home/zeleniyslonik/PycharmProjects/pysot_tracker_standalone/pysot/models/head/mask.py:78: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  p1 = F.pad(f[1], [8, 8, 8, 8])[:, :, 2*pos[0]:2*pos[0]+31, 2*pos[1]:2*pos[1]+31]
/home/zeleniyslonik/PycharmProjects/pysot_tracker_standalone/pysot/models/head/mask.py:79: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  p2 = F.pad(f[2], [4, 4, 4, 4])[:, :, pos[0]:pos[0]+15, pos[1]:pos[1]+15]
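
Here is a minimal sketch (not the PySOT code; the module and file names are made up) of what those warnings mean in practice: when values pulled out of a tensor are used as Python slice indices, the tracer bakes the concrete bounds seen during export into the graph as constants. A different pos at inference time no longer moves the crop window, and if the baked bounds fall outside the actual feature map the slice comes out empty, which could explain the {0,0} Conv input.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Crop(nn.Module):
    def forward(self, f, pos):
        # pos[0] / pos[1] are converted to Python ints here, so the traced graph
        # records the slice bounds used during export, not a data-dependent slice.
        return F.pad(f, [4, 4, 4, 4])[:, :, pos[0]:pos[0] + 15, pos[1]:pos[1] + 15]

# Export with one position, run with another: the exported Slice still uses the
# export-time bounds (and emits the same TracerWarning as above).
torch.onnx.export(Crop(), (torch.randn(1, 8, 31, 31), torch.tensor([20, 20])),
                  "crop_constant_slice.onnx", opset_version=11)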

Urgency
No deadlines, but it would be nice to solve this ASAP.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux PC 5.4.40-1-MANJARO #1 SMP PREEMPT Sun May 10 14:17:40 UTC 2020 x86_64 GNU/Linux
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: 1.2.0-5
  • Python version: 3.8
  • Visual Studio version (if applicable): None
  • GCC/Compiler version (if compiling from source): 9.3.0 (Arch Linux 9.3.0-1)
  • CUDA/cuDNN version: 10.2.89-5/7.6.5.32-4
  • GPU model and memory: NVidia GeForce GTX 1050 Ti

To Reproduce
I am attaching the model along with the code to convert it and to reproduce the issue.
Model: https://drive.google.com/open?id=1dWgAsTu6ivHVMMcIbF_KQ8wZRXApP8fe
Put it into the PySOT project main folder.
Code to convert:

import torch.nn as nn
import torch.onnx

from pysot.core.config import cfg
from pysot.models.model_builder import ModelBuilder

config = 'siammask_r50_l3/config.yaml'
snapshot = 'siammask_r50_l3/model.pth'

cfg.merge_from_file(config)
cfg.CUDA = torch.cuda.is_available() and cfg.CUDA
device = torch.device('cuda' if cfg.CUDA else 'cpu')

# Thin wrapper so that only the mask refine head is traced and exported.
class ConvertModel(nn.Module):
    def __init__(self, model):
        super(ConvertModel, self).__init__()
        self.model = model

    def forward(self, x1, x2, x3, mask_corr, pos):
        mask_refine = self.model.refine_head([x1, x2, x3], mask_corr, pos)
        return mask_refine

model0 = ModelBuilder()
model0.load_state_dict(torch.load(snapshot, map_location=lambda storage, loc: storage.cpu()))

model0.eval()
# print(model0)
model = ConvertModel(model0)

# Dummy inputs used to trace the refine head.
self_xf_dummy_0 = torch.randn(1, 64, 125, 125)
self_xf_dummy_1 = torch.randn(1, 256, 63, 63)
self_xf_dummy_2 = torch.randn(1, 512, 31, 31)
mask_corr_feature_dummy = torch.randn(1, 256, 25, 25)
pos_dummy = torch.randint(low=0, high=20, size=(2,))

torch_out = torch.onnx._export(model, (self_xf_dummy_0, self_xf_dummy_1, self_xf_dummy_2, mask_corr_feature_dummy, pos_dummy),
                                       "siammask_r50_l3/siammask_mask_refine.onnx",
                                       export_params=True, opset_version=11, verbose=True)
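
One way to check whether the baked-in constants already produce zero-sized tensors at graph level (this is not part of the original report, just a diagnostic sketch) is to run ONNX shape inference on the exported file and look for dimensions of 0 in the intermediate values feeding the Conv nodes:

import onnx
from onnx import shape_inference

model = onnx.load("siammask_r50_l3/siammask_mask_refine.onnx")
inferred = shape_inference.infer_shapes(model)

# Print every intermediate value whose inferred shape contains a 0.
# Unknown dimensions are shown as "?", so treat this as a hint, not a proof.
for vi in inferred.graph.value_info:
    dims = [d.dim_value if d.HasField("dim_value") else "?"
            for d in vi.type.tensor_type.shape.dim]
    if 0 in dims:
        print(vi.name, dims)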

Code to reproduce:

import numpy as np
import onnxruntime as rt

# The exported graph expects float32 inputs; NumPy's randn returns float64 by default.
self_xf_dummy_0 = np.random.randn(1, 64, 125, 125).astype(np.float32)
self_xf_dummy_1 = np.random.randn(1, 256, 63, 63).astype(np.float32)
self_xf_dummy_2 = np.random.randn(1, 512, 31, 31).astype(np.float32)
mask_corr_feature_dummy = np.random.randn(1, 256, 25, 25).astype(np.float32)

sess_mask = rt.InferenceSession("siammask_r50_l3/siammask_mask_refine.onnx")

input_name_mask_1 = sess_mask.get_inputs()[0].name
input_name_mask_2 = sess_mask.get_inputs()[1].name
input_name_mask_3 = sess_mask.get_inputs()[2].name
input_name_mask_4 = sess_mask.get_inputs()[3].name
input_name_mask_5 = sess_mask.get_inputs()[4].name

outs_mask = sess_mask.run(None, {input_name_mask_1: self_xf_dummy_0,
                                 input_name_mask_2: self_xf_dummy_1,
                                 input_name_mask_3: self_xf_dummy_2,
                                 input_name_mask_4: mask_corr_feature_dummy,
                                 # pos was exported from an int64 tensor, so pass int64 explicitly
                                 input_name_mask_5: np.asarray((20, 20), dtype=np.int64)})

# run() returns a list of outputs; the refined mask is the first (and only) one.
mask = outs_mask[0]
print(mask.shape)

Launch those snippets from the PySOT main folder.
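
As a sanity check (hypothetical, not part of the original report), it also helps to print what the exported model actually expects, to confirm that the pos input survived the trace at all and that the dummy shapes and dtypes above match:

import onnxruntime as rt

sess = rt.InferenceSession("siammask_r50_l3/siammask_mask_refine.onnx")
for inp in sess.get_inputs():
    # Name, shape, and element type that ONNX Runtime expects for each input.
    print(inp.name, inp.shape, inp.type)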

Expected behavior
I expect the model to return a NumPy array with shape (1, 16129).

Screenshots
None

Additional context
None

stale bot commented Jul 27, 2020

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

stale bot added the wontfix label on Jul 27, 2020
stale bot commented Aug 8, 2020

This issue has been automatically closed due to inactivity. Please reactivate if further support is needed.

stale bot closed this as completed on Aug 8, 2020
@isgursoy

following
