Race condition in ssyrk causing deadlock #1117
Comments
Probably similar to #1071. You could check whether "make USE_SIMPLE_THREADED_LEVEL3=1" works around it here as well. (If it does, the problem goes back to GotoBLAS2-1.08 or even earlier.)
I tried the |
Can you attach gdb to the deadlocked process and upload a backtrace?
I noticed that wernsaar has a possible bugfix for the stride calculation in syrk_thread.c in his development fork. If you have time for experimenting, you could try to cherry-pick that.
@brada4 I will get all the threads; however, all of the threads had a similar stack trace, which I copied in the original post: https://gist.github.com/matthewfl/455d86676034fccce0dc2a41766602a2 @martin-frbg I will try out @wernsaar's branch and see if that works.
@brada4 http://sprunge.us/ZMWK I have also been running wernsaar's patch overnight and it still hasn't crashed.
Thanks for checking, sounds like good news (assuming that you did that build without the USE_SIMPLE_THREADED_LEVEL3=1 workaround still in place :-) )
@martin-frbg With the v0.2.19 release I tried that flag and it worked at least once. With wernsaar's code I did not use that flag, and it hasn't crashed after running in a loop overnight.
Thanks for the clarification. So really looking good (though maybe I should have a bad conscience for picking fruit in the neighbor's garden).
Thank you for the backtrace:
@brada4 There is only one Python thread here. The second Python thread is Python's GC background thread or IPython's autocomplete, but these are never going to call BLAS. FYI, the arguments that I compiled OpenBLAS with were:
Filtering redundant options (see Makefile.rule for defaults) |
Using ssyrk with multiple threads eventually causes OpenBLAS to deadlock and never return.
Sample stack traces:
https://gist.github.com/matthewfl/455d86676034fccce0dc2a41766602a2
This seems to be related to one of these lines, but I haven't narrowed it down further yet:
https://github.com/xianyi/OpenBLAS/blob/develop/driver/level3/level3_syrk_threaded.c#L361
https://github.com/xianyi/OpenBLAS/blob/develop/driver/level3/level3_syrk_threaded.c#L469
https://github.com/xianyi/OpenBLAS/blob/develop/driver/level3/level3_syrk_threaded.c#L267
It takes my program about 5 hours to hit this race condition. I am guessing that it requires at least 10,000 calls of this method with a matrix of roughly 300x300 elements to cause the deadlock.
This is also on the v0.2.19 release.
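For anyone trying to reproduce the hang described above, here is a minimal stress-harness sketch. It is not the reporter's program. The helper names `stress` and `make_ssyrk_call`, the log path, and the 60-second timeout are all hypothetical. It also assumes that NumPy and SciPy are installed and linked against a multithreaded OpenBLAS build, which the thread does not confirm. The harness uses the standard-library `faulthandler` watchdog, which can dump the Python stack and terminate the process even when a native call never returns.

```python
import faulthandler
import os
import tempfile

# Hypothetical default location for the traceback written when a call hangs.
DEFAULT_LOG = os.path.join(tempfile.gettempdir(), "ssyrk_hang_traceback.log")


def stress(call, iterations=10_000, timeout_s=60.0, log_path=DEFAULT_LOG):
    """Run call() repeatedly and abort the process if one call hangs.

    Before each call a faulthandler watchdog is armed; if the call has not
    returned after timeout_s seconds, the Python tracebacks of all threads
    are written to log_path and the process exits. Returns the number of
    completed iterations when nothing hangs.
    """
    with open(log_path, "w") as log:
        for _ in range(iterations):
            faulthandler.dump_traceback_later(timeout_s, exit=True, file=log)
            try:
                call()
            finally:
                # Disarm the watchdog once the call has returned.
                faulthandler.cancel_dump_traceback_later()
    return iterations


def make_ssyrk_call(n=300):
    """Build a zero-argument callable that performs one ssyrk on an n x n matrix.

    Assumption: SciPy's BLAS wrappers are backed by OpenBLAS in this environment.
    """
    import numpy as np
    from scipy.linalg.blas import ssyrk

    a = np.random.rand(n, n).astype(np.float32)
    return lambda: ssyrk(1.0, a)


# Example (needs NumPy/SciPy): stress(make_ssyrk_call(), iterations=10_000)
```

The watchdog only shows where the Python side was waiting. The native OpenBLAS thread stacks still have to come from gdb, as requested in the comments.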