OpenBLAS hangs in triangular matrix multiplication on Excavator #841
Comments
I only have a Steamroller machine. I think it can run Excavator code. |
Two observations: with the build of OpenBLAS shipped with Julia, the following small Fortran program hangs on an Excavator machine
program test
double precision :: a(2,2), b(2,1)
a(1,1) = 1
a(1,2) = 2
a(2,2) = 3
b(1,1) = 1
b(2,1) = 1
call dtrmm64_('L', 'U', 'N', 'N', 2_8, 1_8, 1.0d0, a, 2_8, b, 2_8)
write(*,*) 'done'
end program
when executed with the default coretype. If I set
Next, I tried to build OpenBLAS from source with
Update: The test program is written for Julia's special OpenBLAS build, which uses symbol renaming and 64-bit integers. When I link against the freshly built OpenBLAS I have, of course, adjusted the symbol names and integer sizes. |
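For reference, the result a correct `dtrmm` should produce for this input can be worked out independently of OpenBLAS. The following is only a minimal sketch in plain C of the 'L', 'U', 'N', 'N' case (B := alpha * A * B with A upper triangular, not transposed, non-unit diagonal). The name `ref_trmm_lunn` is made up for illustration and is not part of any BLAS library.

```c
#include <assert.h>
#include <stddef.h>

/* Reference B := alpha * A * B for an m-by-m upper triangular A and an
   m-by-n matrix B, both in column-major storage as in Fortran.
   ref_trmm_lunn is a hypothetical helper, not an OpenBLAS function. */
void ref_trmm_lunn(size_t m, size_t n, double alpha,
                   const double *a, size_t lda,
                   double *b, size_t ldb)
{
    assert(lda >= m && ldb >= m);
    for (size_t j = 0; j < n; j++) {
        /* Rows are processed top to bottom, so row i only reads entries
           b[k] with k >= i, none of which has been overwritten yet. */
        for (size_t i = 0; i < m; i++) {
            double sum = 0.0;
            /* Only the upper triangle (k >= i) of A is referenced. */
            for (size_t k = i; k < m; k++)
                sum += a[i + k * lda] * b[k + j * ldb];
            b[i + j * ldb] = alpha * sum;
        }
    }
}
```

Called with the matrix from the report in column-major order (`a = {1, 0, 2, 3}`, where the second entry stands in for the unset `a(2,1)` and is never read) and `b = {1, 1}`, this leaves `b` equal to (3, 3), so a run that returns anything else, or never returns, points at the library kernel rather than the test program.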
At the risk of stating the obvious, it looks like your test program needs -lpthread to link. |
Again based on my still limited understanding of the codebase, would it be worth trying to force OPENBLAS_CORETYPE to "Steamroller", the generation directly preceding the Excavator CPU, in the hope of finding a more modern "working" baseline than Prescott/Piledriver? |
|
Sure. But knowing that the Steamroller kernel works should help to track down the problem, since there seem to be far fewer differences in configuration between these two than between Excavator and Piledriver. My next step, if I had the hardware, would be to simply copy KERNEL.STEAMROLLER over KERNEL.EXCAVATOR and add "|| defined(EXCAVATOR)" to those optimized function codes in kernel/x86_64 that already have "#if ... defined (STEAMROLLER)" - about 20 of them, but a small change in each. |
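The per-file change suggested here amounts to widening a preprocessor condition. A minimal sketch of the pattern, assuming a made-up guarded code path: the macro `USE_STEAMROLLER_PATH` and both function names are hypothetical and do not come from kernel/x86_64, and `EXCAVATOR` is defined by hand only to stand in for what the build system would set.

```c
#include <assert.h>

/* Stand-in for the target macro the build system would define. */
#define EXCAVATOR

/* Before the change the condition would list STEAMROLLER only, so an
   Excavator build would take the generic branch. */
#if defined(STEAMROLLER) || defined(EXCAVATOR)
#define USE_STEAMROLLER_PATH 1
#else
#define USE_STEAMROLLER_PATH 0
#endif

/* Report which branch the preprocessor selected (hypothetical helper). */
int uses_steamroller_path(void)
{
    return USE_STEAMROLLER_PATH;
}

/* Simple sanity check a test driver could call. */
void check_excavator_build(void)
{
    assert(uses_steamroller_path() == 1);
}
```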
Hi, I have built the latest OpenBLAS develop branch on our STEAMROLLER machine. Then I modified Makefile.rule. The current file KERNEL.EXCAVATOR is identical to KERNEL.PILEDRIVER. All tests succeeded. Then I copied KERNEL.STEAMROLLER to KERNEL.EXCAVATOR. All tests succeeded. The best configuration is to copy KERNEL.STEAMROLLER to KERNEL.EXCAVATOR. But as Martin says, there are some files ( #if ... defined(STEAMROLLER) I recommend not using gcc 5.2.x. Best regards On 04/23/2016 03:06 PM, Martin Kroeker wrote:
|
The program doesn't hang if I build with |
I do not see much difference between DYNAMIC_ARCH with and without OPENBLAS_CORETYPE set - in either case, gotoblas_dynamic_init() from driver/others/dynamic.c will try to read the environment variable first. Only when it is not set, get_coretype() is called (which from what you posted above apparently identifies the Excavator cpu correctly). So would your test code hang as well once the loop in force_coretype() is fixed to return the correct number "22" for Excavator instead of falling back to Prescott ? |
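The lookup order described here (environment variable first, detection only as a fallback) and the kind of loop-bound problem being discussed can be sketched as follows. This is an illustration under stated assumptions, not the code in driver/others/dynamic.c: the table is shortened and its indices are invented, `lookup_coretype` and `select_coretype` are hypothetical names, the comparison is case-sensitive for simplicity, and an unrecognized name falls back to the detected value here rather than to Prescott.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, shortened table; index 0 means "unknown". */
static const char *const core_names[] = {
    "Unknown", "Prescott", "Piledriver", "Steamroller", "Excavator",
};
#define NUM_CORES (sizeof core_names / sizeof core_names[0])

/* Return the table index for a core name, or 0 if it is not listed.
   Deriving the bound from the table keeps the last entry reachable; a
   hard-coded bound that is not updated when a name is appended would
   silently skip the new entry. */
int lookup_coretype(const char *name)
{
    if (name == NULL)
        return 0;
    for (size_t i = 1; i < NUM_CORES; i++) {
        if (strcmp(name, core_names[i]) == 0)
            return (int)i;
    }
    return 0;
}

/* Prefer OPENBLAS_CORETYPE when it names a known core; otherwise use the
   index obtained from CPU detection, passed in by the caller. */
int select_coretype(int detected)
{
    const char *env = getenv("OPENBLAS_CORETYPE");
    int forced = lookup_coretype(env);
    assert(detected >= 0 && (size_t)detected < NUM_CORES);
    return forced != 0 ? forced : detected;
}
```

With this table, `lookup_coretype("Excavator")` returns 4, the last index, which is exactly the entry a stale fixed bound would miss.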
Yes. I've extended the loop but the error is still there for |
No, I simply built for TARGET=EXCAVATOR. On 04/25/2016 04:28 AM, Andreas Noack wrote:
|
Could you get a backtrace at the time of the hang (perhaps something along the lines of #716, though it does not segfault in your case) ? Or just try running single-threaded first (which I suspect will work fine even with DYNAMIC_ARCH=1). Somewhat unrelated, glancing through the changes made to fix #716 by ensuring that getenv is called not more than once (which would appear to still be violated in the dynamic case ?), I notice that driver/others/parameter.c has a few cases where an #ifdef with a long list of cpu models does contain STEAMROLLER but not EXCAVATOR, but this probably only affects the efficiency with which cpu cache sizes are determined. |
Hi, I have solved the problem. I will push the changes soon. Best regards On 04/25/2016 09:10 AM, Martin Kroeker wrote:
|
Hi, I pushed the bug fixes and enhancements for EXCAVATOR. Please check out the latest develop branch and test on a real EXCAVATOR machine. Best regards |
Great. I've just tried it and I can confirm that it fixed the issue. Thanks. |
I'm sorry for the incomplete bug report but I don't have access to an Excavator machine. The issue was reported in https://groups.google.com/forum/#!forum/julia-stats and forcing OpenBLAS to use Piledriver kernels fixed the issue.
If you have an Excavator machine I can get access to, I can offer to prepare a better bug report.