OpenBLAS AVX2 errors in Docker Desktop #4576

Open
2 tasks done
1fish2 opened this issue May 16, 2020 · 3 comments

1fish2 commented May 16, 2020

  • I have tried with the latest version of my channel (Stable or Edge)
  • I have uploaded Diagnostics
  • Diagnostics ID: EB872B58-4E4C-4B98-8450-029E23A8ACF0/20200516012221

Expected behavior

OpenBLAS under Mac Docker Desktop would pass its self-tests like it does everywhere else.

Actual behavior

While building the Docker image, OpenBLAS compiles and then runs its math accuracy self-tests, which print 25 errors, starting with:

 ******* FATAL ERROR - COMPUTED RESULT IS LESS THAN HALF ACCURATE *******
           EXPECTED RESULT   COMPUTED RESULT
       1      0.319347E-03     -0.337344
       2      0.176986         -0.212464
       3      0.194371           1.19455
       4      0.946107          0.108973
       5     -0.444610E-01      0.160310
       6     -0.130439          0.287645
       7      0.123124          0.615582
       8     -0.265688          0.633582
       9     -0.244990         -0.197097
      10     -0.514870         -0.120181
      11      0.447784         -0.140769
      12     -0.209580E-01      0.142175
      13     -0.108383         -0.262506
      14      0.221397         -0.142646
      15     -0.185416         -0.113833
      16      0.538072          0.699690E-01
      17      0.245747          0.245747
      18      0.814760E-01      0.814760E-01
      19     -0.215679         -0.215679
      20      0.408573E-01      0.408573E-01
      21     -0.196518         -0.196518
      22      0.825976          0.825976
      23      0.597224          0.597224
      24      0.218731          0.218731
      25      -1.25434          -1.25434
      26      0.128572          0.128572
      27     -0.178624         -0.178624
      28      0.897403          0.897403
      29      0.465717          0.465717
      30      0.120119          0.120119
      31      0.342683          0.342683
 ******* SSYMM  FAILED ON CALL NUMBER:
   1120: SSYMM ('L','U', 31,  1, 1.0, A, 32, B, 32, 0.0, C, 32)    .

and ending with this one:

 ******* FATAL ERROR - COMPUTED RESULT IS LESS THAN HALF ACCURATE *******
                       EXPECTED RESULT                    COMPUTED RESULT
       1  (    1.13123    ,  -0.239697    )  (  -0.704381    ,  -0.298337    )
      THESE ARE THE RESULTS FOR COLUMN   1
 ******* cblas_zhemm  FAILED ON CALL NUMBER:
    490: cblas_zhemm ( CblasColMajor,    CblasRight,    CblasUpper,
            1, 35, ( 1.0, 0.0), A, 36, B,  2, ( 0.0, 0.0), C,  2).
 ******* cblas_zhemm  FAILED ON CALL NUMBER:
    289: cblas_zhemm ( CblasRowMajor,     CblasLeft,    CblasUpper,
            1,  1, ( 0.0, 0.0), A,  2, B,  2, ( 0.0, 0.0), C,  2).

 ******* FATAL ERROR - TESTS ABANDONED *******
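To see which elements of the first SSYMM failure are actually wrong, the expected/computed table can be compared row by row. The sketch below is only an illustration: the `mismatched_rows` helper and its tolerance are assumptions made here, not part of the OpenBLAS test suite, and the data is the table quoted above.

```python
import re

# First SSYMM failure, copied from the self-test output quoted in this issue.
# Columns: row index, expected result, computed result.
REPORT = """\
 1      0.319347E-03     -0.337344
 2      0.176986         -0.212464
 3      0.194371           1.19455
 4      0.946107          0.108973
 5     -0.444610E-01      0.160310
 6     -0.130439          0.287645
 7      0.123124          0.615582
 8     -0.265688          0.633582
 9     -0.244990         -0.197097
10     -0.514870         -0.120181
11      0.447784         -0.140769
12     -0.209580E-01      0.142175
13     -0.108383         -0.262506
14      0.221397         -0.142646
15     -0.185416         -0.113833
16      0.538072          0.699690E-01
17      0.245747          0.245747
18      0.814760E-01      0.814760E-01
19     -0.215679         -0.215679
20      0.408573E-01      0.408573E-01
21     -0.196518         -0.196518
22      0.825976          0.825976
23      0.597224          0.597224
24      0.218731          0.218731
25      -1.25434          -1.25434
26      0.128572          0.128572
27     -0.178624         -0.178624
28      0.897403          0.897403
29      0.465717          0.465717
30      0.120119          0.120119
31      0.342683          0.342683
"""

ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s*$")


def mismatched_rows(report, rel_tol=1e-5):
    """Return the row indices whose computed value differs from the expected one.

    rel_tol is an arbitrary tolerance chosen for this sketch.
    """
    bad = []
    for line in report.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # skip anything that is not an "index expected computed" row
        index = int(match.group(1))
        expected = float(match.group(2))  # float() accepts the Fortran-style E notation
        computed = float(match.group(3))
        scale = max(abs(expected), abs(computed), 1e-30)
        if abs(expected - computed) > rel_tol * scale:
            bad.append(index)
    return bad


bad = mismatched_rows(REPORT)
print(bad)  # rows 1 through 16
```

For this table, only rows 1 to 16 differ and rows 17 to 31 match exactly. Sixteen single-precision values happen to fill two 256-bit registers, which is at least compatible with a vector-register problem, though it does not prove one.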

Information

  • Is it reproducible? Yes.
  • Is the problem new? No. The problem was first observed with OpenBLAS 0.3.6, which started using AVX2 instructions due to a fix in the CPU feature detection code.
  • Docker Desktop version: 2.3.0.2
  • macOS Version: 10.14.6 Mojave on MacBook Pro 2018 with 2.9 GHz Intel Core i9
  • This problem does not occur in Docker on Linux.
  • See 18 "FATAL ERROR" messages compiling v0.3.7 in a Dockerfile OpenMathLib/OpenBLAS#2244
  • See AVX2 bug in Docker Desktop on macOS machyve/xhyve#171 which the OpenBLAS team figured is the most likely component containing the bug.
  • Hypothesis: Within Docker Desktop on macOS, the virtualization layer is not properly saving and restoring AVX2 registers, causing the OpenBLAS math code to produce incorrect results and thus fail its math accuracy self-tests. If this hypothesis is correct, video coders and other math apps should fail the same way within Docker Desktop on Mac.
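Before chasing the register hypothesis, it may help to confirm that the Linux guest inside Docker Desktop advertises AVX2 at all, since that is what leads OpenBLAS to pick its AVX2 kernels. The following is a minimal sketch, assuming an x86 Linux guest whose `/proc/cpuinfo` has a `flags` line; the helper names and the `SAMPLE` text are hypothetical, made up for illustration. It only reports what the guest advertises and does not test whether registers are saved and restored correctly.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found


def advertises_avx2(cpuinfo_text):
    """True if the flags line lists avx2."""
    return "avx2" in cpu_flags(cpuinfo_text)


# Hypothetical, shortened sample; a real flags line is much longer.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 fma\n"
print(advertises_avx2(SAMPLE))
```

Inside the container, the same check would be `advertises_avx2(open("/proc/cpuinfo").read())`.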

Diagnostic logs

Docker Desktop does not seem to notice that anything went wrong. OpenBLAS's self-tests did.

Steps to reproduce the behavior

Dockerfile:

FROM python:2.7.16

# If you build this Dockerfile with the option `--build-arg NO_AVX2=1`,
# OpenBLAS won't use the AVX2 hardware.
ARG NO_AVX2=0

RUN apt-get update \
    && apt-get install -y swig gfortran llvm cmake

RUN (mkdir -p openblas && cd openblas \
    && curl -SL https://github.com/xianyi/OpenBLAS/archive/v0.3.9.tar.gz | tar -xz \
    && cd OpenBLAS* \
    && make "NO_AVX2=${NO_AVX2}" FC=gfortran \
    && make "NO_AVX2=${NO_AVX2}" PREFIX=/usr install) \
    && rm -r openblas

  1. docker build -t issue2244 .

Workaround [although this OpenBLAS will run 20-30% slower]:

  1. docker build --build-arg NO_AVX2=1 -t issue2244 .
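
To compare the two builds side by side, the commands above can be wrapped in a small script that counts the "FATAL ERROR" lines in each build log. This is a rough sketch, not a tested harness: the function names are invented here, the count includes the final "TESTS ABANDONED" line, and it assumes `docker` is on the PATH and the Dockerfile is in the current directory.

```python
import subprocess


def build_command(no_avx2, tag="issue2244"):
    """Build the docker command line used in the steps above."""
    cmd = ["docker", "build"]
    if no_avx2:
        cmd += ["--build-arg", "NO_AVX2=1"]
    cmd += ["-t", tag, "."]
    return cmd


def count_fatal_errors(log_text):
    """Count log lines containing the self-test failure marker."""
    return sum("FATAL ERROR" in line for line in log_text.splitlines())


def build_and_count(no_avx2):
    """Run the build and count failure markers in its combined output.

    Depending on the builder, progress output may go to stdout or stderr,
    so both streams are scanned.
    """
    result = subprocess.run(build_command(no_avx2), capture_output=True, text=True)
    return count_fatal_errors(result.stdout + result.stderr)
```

Calling `build_and_count(False)` and then `build_and_count(True)` would give the two counts. Docker's layer cache can skip the OpenBLAS step on a rebuild, in which case the self-tests do not run again and nothing is counted.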
1fish2 added a commit to CovertLab/wcEcoli that referenced this issue May 18, 2020
In case the bug fixes matter, update OpenBLAS to 0.3.9 inside the Docker container and in the create-pyenv instructions.

I retested the problem with AVX2 instructions in Docker Desktop for Mac (OpenMathLib/OpenBLAS#2244) with the latest OpenBLAS and Docker Desktop and filed the Issue in the Docker repo this time, docker/for-mac#4576

On macOS outside of Docker, `brew install openblas` now installs 0.3.9. We no longer need to compile it from source.

Use Python 3.8.3 in the test workflows.
@docker-robott
Collaborator

Issues go stale after 90 days of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30 days of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

1fish2 commented Aug 14, 2020

/lifecycle frozen

This bug is still present in Docker Desktop for Mac 2.3.0.4 (46911).

1fish2 commented Feb 9, 2021

/remove-lifecycle stale
