Describe the bug
Not stably reproducible; it seems to depend on the specific environment of the docker instance in question.
As suggested in microsoft/onnxruntime#8313, it can be worked around by modifying
https://github.com/daquexian/onnx-simplifier/blob/master/onnxsim/onnx_simplifier.py#L188
to set:
sess_options.intra_op_num_threads = 1
sess_options.inter_op_num_threads = 1
Model
Unrelated to the model; it depends on the environment. My pip list is as follows:
absl-py 1.0.0 addict 2.4.0 appdirs 1.4.4 asn1crypto 1.2.0 astunparse 1.6.3 backcall 0.2.0 cachetools 4.2.4 certifi 2019.9.11 cffi 1.13.0 chardet 3.0.4 charset-normalizer 2.0.10 click 8.0.3 conda 4.7.12 conda-package-handling 1.6.0 contextlib2 21.6.0 cryptography 2.8 cycler 0.11.0 Cython 0.29.26 decorator 5.1.1 dnspython 2.1.0 filelock 3.4.2 flatbuffers 1.12 fonttools 4.28.5 fvcore 0.1.5.post20210924 gast 0.3.3 gevent 21.8.0 google-auth 2.3.3 google-auth-oauthlib 0.4.6 google-pasta 0.2.0 graphviz 0.8.4 greenlet 1.1.2 grpcio 1.43.0 gunicorn 20.1.0 h5py 2.10.0 hiddenlayer 0.3 idna 2.8 imageio 2.13.5 importlib-metadata 4.10.0 iopath 0.1.9 ipaddress 1.0.23 ipdb 0.13.9 ipython 7.31.0 jedi 0.18.1 joblib 1.1.0 Keras-Preprocessing 1.1.2 kiwisolver 1.3.2 Mako 1.1.6 Markdown 3.3.6 MarkupSafe 2.0.1 matplotlib 3.5.1 matplotlib-inline 0.1.3 ml-collections 0.1.0 mmcv 1.4.2 mxnet 1.8.0 mypy-protobuf 3.0.0 networkx 2.6.3 numpy 1.21.4 oauthlib 3.1.1 onnx 1.8.0 onnx-simplifier 0.3.6 onnxoptimizer 0.2.6 onnxruntime 1.10.0 opencv-python 4.5.5.62 opt-einsum 3.3.0 packaging 20.9 parso 0.8.3 pexpect 4.8.0 pickleshare 0.7.5 Pillow 8.0.1 pip 21.3.1 ply 3.11 portalocker 2.3.2 prompt-toolkit 3.0.24 protobuf 3.19.3 psutil 5.9.0 ptyprocess 0.7.0 pyasn1 0.4.8 pyasn1-modules 0.2.8 pycosat 0.6.3 pycparser 2.19 pycryptodome 3.9.8 pycuda 2021.1 Pygments 2.11.2 PyJWT 1.7.1 pyOpenSSL 19.0.0 pyparsing 3.0.6 PySocks 1.7.1 pytest-runner 5.3.1 python-dateutil 2.8.2 python-etcd 0.4.5 pytools 2021.2.9 PyWavelets 1.2.0 PyYAML 5.4.1 redis 3.5.3 regex 2021.11.10 requests 2.27.1 requests-oauthlib 1.3.0 rsa 4.8 ruamel_yaml 0.15.46 sacremoses 0.0.47 schedule 0.6.0 scikit-image 0.15.0 scipy 1.7.3 sentencepiece 0.1.91 setuptools 60.5.0 simplejson 3.17.6 six 1.16.0 tabulate 0.8.9 tensorboard 2.7.0 tensorboard-data-server 0.6.1 tensorboard-plugin-wit 1.8.1 tensorflow-estimator 2.3.0 tensorflow-gpu 2.3.1 tensorrt 7.2.1.6 termcolor 1.1.0 terminaltables 3.1.10 tf2onnx 1.8.5 thriftpy2 0.4.14 timm 0.4.12 tokenizers 0.9.3 
toml 0.10.2 torch 1.7.1+cu110 torchvision 0.8.2 tqdm 4.36.1 traitlets 5.1.1 transformers 3.5.1 types-futures 3.3.2 types-protobuf 3.19.0 typing_extensions 4.0.1 urllib3 1.26.8 wcwidth 0.2.5 Werkzeug 2.0.2 wheel 0.37.1 wrapt 1.13.3 yacs 0.1.8 yapf 0.29.0 zipp 3.7.0 zope.event 4.5.0 zope.interface 5.4.0
Considering that forward is only called once, could the hack above simply be merged in? That would spare others from hitting problems caused by their environment configuration, as I did.
Got bitten by this again today... Hoping the author can push an update. The related discussion under onnxruntime: microsoft/onnxruntime#8313
OK, thanks! Sorry I missed this issue earlier.
QAQ Would you consider joining the ONNX QQ group or WeChat group? You can add me as a friend (my QQ and WeChat IDs are both daquexian).
Is your container environment running in cpuset mode? I looked at the source: among the versions supporting Python 3.6, onnxruntime 1.7 through 1.10 reproduce this reliably. The root cause seems to be that the CPU-affinity code is not compatible with cpuset mode. The affinity logic queries the number of CPU cores and then binds affinity starting from CPU 0, but in a container constrained by a cpuset, the actually online CPU ids do not necessarily start from 0, which causes the code to exit abnormally.
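The mismatch described above can be observed with the standard library alone. This is a hedged illustration of the failure mode, not onnxruntime's actual affinity code (Linux-only, since `os.sched_getaffinity` is a Linux API):

```python
# Illustration of the cpuset pitfall: online CPU ids need not start at 0.
import os

allowed = sorted(os.sched_getaffinity(0))  # CPU ids this process may actually use
naive = list(range(len(allowed)))          # what "count cores, bind from CPU 0" assumes

print("allowed ids:", allowed)
print("naive ids:  ", naive)
# If the container's cpuset is e.g. {4, 5, 6, 7}, then allowed != naive, and
# binding affinity to the naive ids targets CPUs outside the cpuset, so the
# affinity call fails and the process exits abnormally.
```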
Thanks for the analysis! Let me take a look and follow up.
I will look into this issue and fall back to the workaround above when a runtime error occurs.