Incorrect results on s390x (-march=zEC12 -mtune=z13, ZARCH_GENERIC) #1743
Comments
Could you please attach the linked build log before it disappears? This looks like the same TLS problem being addressed in #1742.
I am not yet convinced that the test failures are caused by the new thread-local storage allocator, but could you try the current "develop" branch, where I just merged the aforementioned PR #1742 (which reverts to the old allocation code unless OpenBLAS is built with -DUSE_TLS)?
It's from real hardware (z13), but zEC12 gives the same error messages. It looks like a "feature" of the generic kernel. When the z13 kernel is used, there are no such errors (https://koji.fedoraproject.org/koji/buildinfo?buildID=1133326 built the z13 kernel by mistake). Fedora needs to stick to the generic kernel, as we support running the distro on zEC12 or newer hardware.
Thanks - as the generic kernel is pure C it may be possible to reproduce this on more mundane hardware.
No issue building on x86_64 with KERNEL.ZARCH_GENERIC in place of its usual KERNEL.generic file, so it is not that simple, unfortunately. Did earlier Fedora builds all use z13, or is this a recent failure with the generic kernel (which would indeed implicate the new memory.c as the most likely recent change)?
AFAIK this is a long-standing issue.
So the issue has indeed been around for a while. This is a 0.2.20 build log from January, when the builders were still z13: https://kojipkgs.fedoraproject.org//packages/openblas/0.2.20/4.fc28/data/logs/s390x/build.log
Interesting, thanks. As far as I can tell, much the same setup is used for GEMM/TRMM on ARMV8, but with USE_TRMM=1 defined in kernel/Makefile.L3 (This is also set when CORE is Z13, might be worthwhile to add it for
with
I no longer see those "half accurate" errors.
I maintain the fflas-ffpack package for the Fedora Linux distribution. There is currently a push in Fedora to migrate from atlas and the reference blas implementation to openblas. However, the fflas-ffpack test suite failed on s390x when built with openblas. The issue is tracked here: https://bugzilla.redhat.com/show_bug.cgi?id=1619074.
I found that the openblas test suite itself reported multiple failures when built on s390x, but did not return a nonzero exit code; the issue was therefore overlooked as the openblas build did not fail. The openblas test failures can be seen here: https://kojipkgs.fedoraproject.org//packages/openblas/0.3.2/3.fc29/data/logs/s390x/build.log. Here is the first error in the logs:
There are over 100 such failures. I have only looked at the first dozen or so, but in each case the computed result appears to be exactly double the expected result. Fedora packages built for s390x are built with gcc -march=zEC12 -mtune=z13. The openblas package selects TARGET=ZARCH_GENERIC.
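The two observations above (failures that do not affect the exit code, and results that are exactly double the expected value) can be checked mechanically. The following is only a minimal sketch, not part of OpenBLAS or the Fedora packaging: the function names are hypothetical, and the failure marker is an assumption based on the "half accurate" wording quoted in this thread, so it may need adjusting to the actual log text.

```python
import math

# Assumption: failing tests print a line containing this phrase. The thread
# quotes "half accurate" errors, but the exact log wording is not confirmed.
DEFAULT_MARKER = "half accurate"


def count_failures(log_text, marker=DEFAULT_MARKER):
    """Count log lines that contain the marker, ignoring case."""
    needle = marker.lower()
    return sum(1 for line in log_text.splitlines() if needle in line.lower())


def exit_status(log_text, marker=DEFAULT_MARKER):
    """Return 1 if any failure line is present, else 0.

    A build script could pass this value to sys.exit() so that reported
    test failures actually fail the build.
    """
    return 1 if count_failures(log_text, marker) > 0 else 0


def looks_doubled(expected, computed, rel_tol=1e-6):
    """Return True when computed is (nearly) exactly twice expected."""
    if expected == 0:
        # Doubling zero is indistinguishable from a correct result.
        return False
    return math.isclose(computed, 2.0 * expected, rel_tol=rel_tol)
```

A packager could, for example, read build.log into a string, call `exit_status` on it, and pass the result to `sys.exit()`; `looks_doubled` can then be applied to the expected/computed pairs of individual failures to see how many fit the factor-of-two pattern.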