Segmentation fault on import of openBLAS-based numpy #60

Closed
knedlsepp opened this issue Sep 1, 2017 · 10 comments

Comments

@knedlsepp

knedlsepp commented Sep 1, 2017

I'm currently having problems with numpy.
This started only after I updated numpy to a version that uses openblas. Importing numpy now leads to a segmentation fault in about 1 out of 10 attempts.

#!/usr/bin/env python
import numpy as np
print("Hello world")

The numpy version is numpy-1.13.1 (py27_blas_openblas_200). Sometimes it prints something like the following, but most of the time it just crashes.

OpenBLAS blas_thread_init: pthread_create: Cannot allocate memory
OpenBLAS blas_thread_init: RLIMIT_NPROC 257835 current, 257835 max
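
The message above reports RLIMIT_NPROC, which at 257835 does not look like the limiting factor. Other per-process limits might matter for thread creation as well. A minimal sketch for printing them from Python follows. Checking RLIMIT_AS and RLIMIT_STACK is an assumption on my part and is not something this thread confirms, and `describe_limits` and `fmt` are hypothetical helper names.

```python
import resource


def describe_limits():
    """Return {name: (soft, hard)} for limits that may affect thread creation."""
    # Assumption: these three are worth checking; only RLIMIT_NPROC
    # appears in the OpenBLAS message itself.
    names = ("RLIMIT_NPROC", "RLIMIT_AS", "RLIMIT_STACK")
    limits = {}
    for name in names:
        const = getattr(resource, name, None)  # not defined on every platform
        if const is not None:
            limits[name] = resource.getrlimit(const)
    return limits


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


for name, (soft, hard) in sorted(describe_limits().items()):
    print("%s: %s current, %s max" % (name, fmt(soft), fmt(hard)))
```

Running this inside the same shell or batch job that launches Python shows the limits the crashing process would inherit.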

Here are stack traces of two different crashes:

#0  0x00007fb0c737102b in pthread_join () from /lib64/libpthread.so.0
#1  0x00007fb0c3ea5787 in blas_thread_shutdown_ () from /home/jkemet/miniconda/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0
#2  0x00007fb0c3ea482b in blas_shutdown () from /home/jkemet/miniconda/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0
#3  0x00007fb0c3c86011 in gotoblas_quit () from /home/jkemet/miniconda/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0
#4  0x00007fb0c79b32f6 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#5  0x00007fb0c69b98b1 in __run_exit_handlers () from /lib64/libc.so.6
#6  0x00007fb0c69b9935 in exit () from /lib64/libc.so.6
#7  0x00007fb0c69a3034 in __libc_start_main () from /lib64/libc.so.6
#8  0x00000000004007b1 in _start ()

#0  0x00007f55de72c9ec in pthread_create@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
#1  0x00007f55db2610b8 in blas_thread_init () from /home/jkemet/miniconda/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0
#2  0x00007f55db042085 in gotoblas_init () from /home/jkemet/miniconda/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0
#3  0x00007f55ded6ed8e in call_init () from /lib64/ld-linux-x86-64.so.2
#4  0x00007f55ded6ee76 in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#5  0x00007f55ded72e5e in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#6  0x00007f55ded6ebb6 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#7  0x00007f55ded727ba in _dl_open () from /lib64/ld-linux-x86-64.so.2
#8  0x00007f55de520f26 in dlopen_doit () from /lib64/libdl.so.2
#9  0x00007f55ded6ebb6 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#10 0x00007f55de5214cf in _dlerror_run () from /lib64/libdl.so.2
#11 0x00007f55de520fc1 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#12 0x00007f55dea8c0ae in _PyImport_GetDynLoadFunc (fqname=fqname@entry=0x717980 "numpy.core.multiarray", shortname=shortname@entry=0x71798b "multiarray", 
...
...

Setting OMP_NUM_THREADS=1 seems to get rid of the problem, but is a pretty ugly workaround.
Related issue: OpenMathLib/OpenBLAS#888
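
For anyone who wants to apply the OMP_NUM_THREADS workaround from inside a script rather than the shell, here is a minimal sketch. The second stack trace shows the threads being created in gotoblas_init while the library is loaded, so the variable has to be set before numpy is imported. `limit_blas_threads` is a hypothetical helper name, and also setting OPENBLAS_NUM_THREADS is an assumption that goes beyond what was tested in this report.

```python
import os


def limit_blas_threads(count=1):
    """Set thread-count variables; must run before numpy is first imported."""
    # OMP_NUM_THREADS is the variable from the report above.
    # OPENBLAS_NUM_THREADS is an extra, assumed to be honoured by the build.
    for name in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
        os.environ[name] = str(count)


limit_blas_threads(1)

# Import numpy only after the variables are set, e.g.:
# import numpy as np
```

This only hides the problem by keeping OpenBLAS from starting worker threads, so it remains a workaround rather than a fix.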

@knedlsepp knedlsepp changed the title Segmentation fault on import of numpy.multiarray Segmentation fault on import of openBLAS-based numpy Sep 1, 2017
@rgommers
Contributor

rgommers commented Sep 2, 2017

Thanks for the report @knedlsepp. That looks pretty bad. Hopefully we can get some movement on the OpenBLAS issue, and then we can rebuild numpy.

This must be fairly system-specific, otherwise we would be getting loads more bug reports.

@jakirkham
Member

Could you please share the details of your environment as well, @knedlsepp?
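
One possible way to collect the kind of environment details asked for here, as a rough sketch. It deliberately avoids importing numpy, since that is the step that crashes. `environment_report` and `THREAD_VARS` are hypothetical names, and the choice of variables to report is an assumption.

```python
import multiprocessing
import os
import platform
import sys

# Assumption: these thread-related variables are the interesting ones.
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def environment_report():
    """Collect basic system details without importing numpy."""
    report = {
        "platform": platform.platform(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
        "executable": sys.executable,
        "cpu_count": multiprocessing.cpu_count(),
    }
    for name in THREAD_VARS:
        report[name] = os.environ.get(name, "<unset>")
    return report


for key, value in sorted(environment_report().items()):
    print("%s = %s" % (key, value))
```

The output of `conda list` alongside this would cover the exact package builds.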

@knedlsepp
Author

knedlsepp commented Nov 16, 2017

My environment is a clean miniconda environment with numpy from conda-forge, on openSUSE 11.
After some conversation with one of the openBLAS maintainers, he fixed some (hopefully all) of the problems uncovered, via OpenMathLib/OpenBLAS#1299.
I guess we'll see in the next release. In the meantime I work around it by using a non-openBLAS numpy.

@jakirkham
Member

I am asking about the environment because this isn't something I have encountered on Linux, and I would like to see if I can reproduce it. Exact environment details would be very helpful in this regard. Also, are you normally setting OMP_NUM_THREADS to something other than 1?

As far as the fix goes, we could always patch openblas if you want to give it a try. Doesn't look like that breaks anything API/ABI-wise. So it should be pretty easy to try. If things worsen, we can always pull the package with that patch.

@ocefpaf
Member

ocefpaf commented Feb 21, 2018

Closing due to lack of activity. @knedlsepp please re-open if you are still experiencing this. I could not reproduce that with our latest numpy BTW.
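
Since the original report describes the crash as intermittent (roughly 1 in 10 imports), a single clean import is weak evidence either way. A minimal sketch for estimating the crash rate is to run the import repeatedly in fresh interpreters and count how many are killed by a signal. `count_crashes` is a hypothetical helper, and the negative-return-code convention it relies on applies to POSIX systems.

```python
import subprocess
import sys


def count_crashes(snippet="import numpy", runs=50):
    """Run `snippet` in fresh interpreters; count runs killed by a signal."""
    crashes = 0
    for _ in range(runs):
        returncode = subprocess.call([sys.executable, "-c", snippet])
        # On POSIX, a negative return code means the child was terminated
        # by signal number -returncode (e.g. -11 for SIGSEGV).
        if returncode < 0:
            crashes += 1
    return crashes
```

Calling `count_crashes(runs=100)` in the affected environment gives a rough rate to compare before and after a change. An ordinary Python exception such as a missing module exits with a positive code and is not counted as a crash.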

@ocefpaf ocefpaf closed this as completed Feb 21, 2018
@knedlsepp
Author

@ocefpaf I'm actually still experiencing this on my opensuse11 (2.6.37.6-0.5-desktop) machines. I'm currently mitigating this by using the intel-mkl based builds of numpy. I'm still hoping for the next openBLAS release to fix this, as the commits are already in master. Sadly I have no idea how to locally build and test with a pre-release openBLAS.

@ocefpaf ocefpaf reopened this Feb 21, 2018
@ibebio

ibebio commented Mar 14, 2018

Same problem here when using Ubuntu 16.04.4 LTS on AMD nodes under Sun Grid Engine 6.2u5. `import numpy` reliably causes a segfault if the openBLAS variant of numpy is used. Tested in a fresh anaconda environment with python 3.6 and 2.7.

@jakirkham
Member

Thanks for the info.

Sounds like there was a patch upstream that might help (OpenMathLib/OpenBLAS#1299). If someone would like to backport that patch and make a PR here to include it, I think that would be a viable path forward in the short term.

@jakirkham
Member

Raised an issue (conda-forge/openblas-feedstock#42) on patching openblas. If anyone has time/interest, please feel free to submit a PR over there. If you think we've missed anything, please feel free to let us know in that issue.

ocefpaf pushed a commit that referenced this issue Aug 31, 2018
@isuruf
Member

isuruf commented Nov 5, 2020

This is too old now. Please open a new issue if this is still a problem.

@isuruf isuruf closed this as completed Nov 5, 2020