
CI: ICC compiler package name change #1566

Merged: 1 commit merged into ECP-WarpX:development on Dec 10, 2020

Conversation

@rscohn2 (Contributor) commented Dec 10, 2020

No description provided.

@ax3l ax3l self-requested a review December 10, 2020 21:58
@ax3l ax3l self-assigned this Dec 10, 2020
@ax3l ax3l added the backend: sycl Specific to DPC++/SYCL execution (CPUs/GPUs) label Dec 10, 2020
@ax3l (Member) left a comment

Thank you for the update @rscohn2!

@ax3l ax3l merged commit a7ba409 into ECP-WarpX:development Dec 10, 2020
@ax3l ax3l added the component: tests Tests and CI label Dec 10, 2020
@ax3l ax3l changed the title compiler package name change CI: ICC compiler package name change Dec 10, 2020
@ax3l (Member) commented Feb 25, 2021

@rscohn2 looking at the latest GitHub Actions runs, it looks like the high-level DPC++ packages we currently use are getting too large to run on public CI.
For a few days now, we have been hitting "no space left on device" errors quite regularly.

Would you recommend selecting a smaller sub-package in oneAPI that pulls in fewer dependencies?
These are the two packages we currently install:
https://github.com/ECP-WarpX/WarpX/blob/development/.github/workflows/dependencies/dpcpp.sh

  • intel-oneapi-dpcpp-cpp-compiler
  • intel-oneapi-mkl-devel
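For reference, a minimal sketch of the apt-based setup such a script typically performs (the repository URL follows Intel's documented apt instructions; the key URL may differ by release, and the exact contents of dpcpp.sh are not reproduced here):

```
# add Intel's oneAPI apt repository (per Intel's documented setup), then install
wget -qO - https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
  | sudo apt-key add -
echo "deb https://apt.repos.intel.com/oneapi all main" \
  | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt-get update
sudo apt-get install -y intel-oneapi-dpcpp-cpp-compiler intel-oneapi-mkl-devel
```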

Thank you for your help!
Axel

ax3l added a commit that referenced this pull request Feb 25, 2021
```
/usr/bin/ld: final link failed: No space left on device
LLVM ERROR: IO failure on output stream: No space left on device
```

#1566 (comment)
@ax3l ax3l mentioned this pull request Feb 25, 2021
@rscohn2 (Contributor, Author) commented Feb 26, 2021

I don't think there are smaller packages. @mmzakhar: Any ideas?

You are probably only using a few libraries from:
/opt/intel/oneapi/mkl/latest/lib/intel64
Could you delete the ones you don't use?

@ax3l (Member) commented Feb 26, 2021

Thank you @rscohn2! I just deleted intel-oneapi-mkl-devel, which is also huge (~1 GB IIRC): #1743

But it turns out that the MKL package is the only one shipping oneapi/mkl/rng/device.hpp (+ libs), which we need for random number generation.

Potentially, creating more sub-packages that provide individual compilers and sub-aspects of MKL could be a way to reduce the binary size.

You are probably only using a few libraries from:
/opt/intel/oneapi/mkl/latest/lib/intel64
Could you delete the ones you don't use?

You mean to apt install and immediately remove a couple of libraries in the intel install path?

In case it is helpful, Nvidia's CUDA apt repo provides similarly large packages for their toolkit as oneAPI does. There, the facade packages mainly declare dependencies on very fine-grained sub-packages that one can also select manually. I found this super helpful when building containers and pulling dependencies in resource-constrained environments, such as continuous integration:
https://github.com/ComputationalRadiationPhysics/picongpu/blob/0.5.0/share/picongpu/dockerfiles/ubuntu-1604/Dockerfile#L25-L32

For AMD, we also pull libraries like rocFFT manually to keep the install footprint small:

```
rocm-dev \
rocfft \
```

@rscohn2 (Contributor, Author) commented Feb 26, 2021

You mean to apt install and immediately remove a couple of libraries in the intel install path?

Yes. I think it is the only short-term solution.

I will share a link to this topic with the package people. The assumption that installs are done once and go to very large disks no longer holds with virtualization/containerization.

@mmzakhar commented Feb 26, 2021

Here is a workaround for the sporadic "no space left on device" issue:

  1. Conditionally install intel-oneapi-mkl-devel package if there isn't a CI cache hit.
  2. Remove unnecessary files after the install.
  3. Cache the install directory.

With this, the pipeline only needs to succeed once; the required files are then restored from the cache in subsequent runs, skipping the large install (see the sketch below). The cache can be auto-updated when MKL changes in the APT repo.

An example of such usage of CI cache can be found in https://github.com/oneapi-src/oneapi-ci/blob/master/.github/workflows/build_all.yml#L239
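A minimal bash-side sketch of steps 1 and 2, assuming an earlier actions/cache step restores /opt/intel/oneapi on a cache hit (the marker path and pruning target are illustrative):

```
# if the cache step already restored the install, skip the large download
if [ ! -d /opt/intel/oneapi/mkl ]; then
  sudo apt-get install -y intel-oneapi-mkl-devel
  # prune files the build does not need before the directory gets cached
  sudo rm -rf /opt/intel/oneapi/mkl/latest/lib/intel64/*.a
fi
```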

@ax3l (Member) commented Mar 4, 2021

@rscohn2 @mmzakhar
Thanks, I'll try to remove unnecessary files after the install in #1743. Do you have hints on the package install layout and on what is structurally safe to try removing without breaking internal dependencies or activation scripts?

The caching is a good hint, but I think it is an orthogonal step to speed up our CI install (and it otherwise faces the same challenges in terms of temporary size).

@rscohn2 (Contributor, Author) commented Mar 4, 2021

I could not find the link line in your logs to see what you use. I was expecting to see -lmkl_core and similar.

If you are not using the SYCL interfaces to MKL, then these are good candidates:
```
350M /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_sycl.a
0    /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_sycl.so
617M /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_sycl.so.1
```

And/or remove either the *.a or the *.so files in that directory, depending on which you use.
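A sketch of those removals, assuming a default /opt/intel/oneapi layout and that neither the SYCL interfaces nor the static archives are linked:

```
# drop the (very large) SYCL interface libraries if nothing links against them
sudo rm -f /opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_sycl*
# and/or keep only one linkage flavor, e.g. drop static archives when linking shared
sudo rm -f /opt/intel/oneapi/mkl/latest/lib/intel64/*.a
```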

@ax3l (Member) commented Mar 8, 2021

Thank you for the hint! Yes, the link line is not shown since the build aborts before that step.

I tried this in #1743 and got close - now we only run out of disk space in the linking step itself.

Is there anything else we can remove in the compiler or MKL package?

@rscohn2 (Contributor, Author) commented Mar 8, 2021

Maybe these FPGA files:
```
1.4G /opt/intel/oneapi/compiler/latest/linux/lib/oclfpga
252M /opt/intel/oneapi/compiler/latest/linux/lib/emu
```
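A sketch of removing those, assuming no FPGA targets are built in CI:

```
# FPGA offload support is not needed for CPU/GPU-only CI builds
sudo rm -rf /opt/intel/oneapi/compiler/latest/linux/lib/oclfpga
sudo rm -rf /opt/intel/oneapi/compiler/latest/linux/lib/emu
```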

@ax3l (Member) commented Mar 8, 2021

Oh nice, those are great chunks to slice off, and they now got us over the hump. Thank you! 🎉

@ax3l (Member) commented Mar 11, 2021

Oh no, this is still too large and is hitting us again now.
We have to disable it again until we find more we can remove. @rscohn2, do you have more suggestions? Or maybe a new release just rolled out and one of the above paths changed?

@ax3l ax3l mentioned this pull request Mar 11, 2021
@rscohn2 (Contributor, Author) commented Mar 11, 2021

I opened a PR against the branch for #1783 to remove the MKL .a files. It frees up 1 GB, but I am still waiting for it to finish.

We have not released anything new, and I verified that your deletes work as expected. Maybe not all GitHub VMs are the same size and you got lucky before.

@ax3l (Member) commented Mar 12, 2021

Oh, great idea! Thank you, that works! 🎉

We have not released anything new, and I verified that your deletes work as expected. Maybe not all GitHub VMs are the same size and you got lucky before.

Thanks for the info! Yes, I think we were pretty close and small fluctuations caused this.

@rscohn2 (Contributor, Author) commented Mar 25, 2021

@ax3l: You can free up 1-2 GB by cleaning packages from the apt cache: https://github.com/oneapi-src/oneapi-ci/pull/43/files
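For example (a sketch; the linked PR shows the exact change):

```
# remove the cached .deb archives left behind by the install
sudo apt-get clean
```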

@ax3l ax3l mentioned this pull request Mar 26, 2021
@ax3l (Member) commented Mar 26, 2021

Thank you for the hint! Added in #1841

@ax3l (Member) commented Mar 29, 2021

@rscohn2 I noticed that since about a day or two ago, and still ongoing today, CI jobs crash in apt with messages of the form:

```
Failed to fetch https://apt.repos.intel.com/oneapi/dists/all/main/binary-all/Packages.bz2  File has unexpected size (10383 != 6571). Mirror sync in progress? [IP: 104.127.244.160 443]
   Hashes of expected file:
    - Filesize:6571 [weak]
    - SHA512:16f21c4f8c2b6ce59434685a5f0598a8a5328f321528e565ab0bba9c773d67011a27922832205e8303857520dd7678c17d7ea4fced3efcee432701c2a33404ae
    - SHA256:b3204b9762e33c522c5a2c50160cd87227eee8607c8efcab76557324e7678eb7
    - SHA1:050ca40f7cf212295f82df347bf9ad253024b8d5 [weak]
    - MD5Sum:e45e36ac6473d85cce567d5f1bf9cdc8 [weak]
   Release file created at: Wed, 24 Mar 2021 09:16:21 +0000
E: Failed to fetch https://apt.repos.intel.com/oneapi/dists/all/main/binary-amd64/Packages.bz2  
E: Some index files failed to download. They have been ignored, or old ones used instead.
Error: Process completed with exit code 100.
```

Is there perhaps a oneAPI release in progress, with the CDN servers not updated in a single transaction? It's certainly a bit disruptive on 4 of the repos that I co-maintain; do you have a hint how we can make this more robust? :)

@rscohn2 (Contributor, Author) commented Mar 29, 2021 via email

@ax3l (Member) commented Jun 30, 2021

@rscohn2 With the latest DPC++ release that rolled out to the apt packages yesterday (2021.3.0), it looks like compile time increased significantly: our DPC++ build now takes >5 hrs on CI, where it previously took 34 min (ICC and ICX are not affected).

I wonder if that's a DPC++ compiler performance regression or some oversubscription of resources, e.g. more compile parallelism in the latest compiler release? We compile with -j 2, matching the available GitHub Actions resources.

@rscohn2 (Contributor, Author) commented Jun 30, 2021

They would notice if there were a broad regression, so it probably needs a specific source file to trigger. If you can provide a source file (preprocessed with -E) that shows the 20x slowdown, I can report it. Here is a suggestion for determining which file is slow: https://stackoverflow.com/a/5973540/2525421
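A rough way to find the slow file, in the spirit of the linked answer (paths and flags here are illustrative, not the project's real ones):

```
# time each translation unit with a cutoff; anything that hits the cutoff
# is a candidate for the hang/slowdown
for src in Source/*.cpp; do
  echo "== $src"
  timeout 120 bash -c "time dpcpp -fsycl -c $src -o /dev/null" \
    || echo "   exceeded 120 s"
done
```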

@ax3l (Member) commented Jun 30, 2021

@rscohn2, thank you.
I tried it now locally on my Ubuntu 20.04 (x86_64) machine as well, and it reproducibly hangs on our development branch (f428f5a). The first translation unit, from WarpX.cpp, does not return after tens of minutes in the build step.

$ cmake -S . -B build -DWarpX_COMPUTE=SYCL -DCMAKE_CXX_COMPILER=dpcpp -DCCACHE_PROGRAM=NOTFOUND -DCMAKE_VERBOSE_MAKEFILE=ON -DWarpX_MPI=OFF -DWarpX_QED=OFF
...
# all ok

$ cmake --build build
...
[ 52%] Building CXX object CMakeFiles/WarpX.dir/Source/WarpX.cpp.o
/opt/intel/oneapi/compiler/2021.3.0/linux/bin/dpcpp -DWARPX_DIM_3D -DWARPX_GIT_VERSION=\"21.06-35-gf428f5a26f11\" -DWARPX_PARSER_DEPTH=24 -I/home/axel/src/warpx/Source -I/home/axel/src/warpx/build/_deps/fetchedamrex-src/Tools/C_scripts -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Base -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Base/Parser -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Boundary -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/AmrCore -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/LinearSolvers/MLMG -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/LinearSolvers/Projections -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Particle -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-build -O3 -g -DNDEBUG -pthread -Wno-error=sycl-strict -Wno-pass-failed -fsycl -fsycl-device-code-split=per_kernel -mlong-double-64 -Xclang -mlong-double-64 -fno-sycl-early-optimizations -MD -MT CMakeFiles/WarpX.dir/Source/WarpX.cpp.o -MF CMakeFiles/WarpX.dir/Source/WarpX.cpp.o.d -o CMakeFiles/WarpX.dir/Source/WarpX.cpp.o -c /home/axel/src/warpx/Source/WarpX.cpp

It looks like this already hangs in the preprocessor. When I try to reduce this line to a -E run, it also hangs:

/opt/intel/oneapi/compiler/2021.3.0/linux/bin/dpcpp -E -DWARPX_DIM_3D -DWARPX_GIT_VERSION=\"21.06-35-gf428f5a26f11\" -DWARPX_PARSER_DEPTH=24 -I/home/axel/src/warpx/Source -I/home/axel/src/warpx/build/_deps/fetchedamrex-src/Tools/C_scripts -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Base -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Base/Parser -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Boundary -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/AmrCore -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/LinearSolvers/MLMG -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/LinearSolvers/Projections -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Particle -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-build -O3 -g -DNDEBUG -pthread -Wno-error=sycl-strict -Wno-pass-failed -fsycl -fsycl-device-code-split=per_kernel -mlong-double-64 -Xclang -mlong-double-64 -fno-sycl-early-optimizations -c /home/axel/src/warpx/Source/WarpX.cpp

According to my process manager, the process that hangs (full CPU load, regular memory usage) is:

/opt/intel/oneapi/compiler/2021.3.0/linux/bin/clang++ -cc1 -triple spir64-unknown-unknown-sycldevice -aux-triple x86_64-unknown-linux-gnu -fsycl-is-device -fdeclare-spirv-builtins -mllvm -sycl-opt -Wno-sycl-strict -fsycl-int-header=/tmp/WarpX-header-613abb.h -sycl-std=2020 -fsycl-unnamed-lambda -Wspir-compat -fsyntax-only -disable-free -disable-llvm-verifier -discard-value-names -main-file-name WarpX.cpp -mrelocation-model static -fveclib=SVML -mframe-pointer=all -menable-no-infs -menable-no-nans -menable-unsafe-fp-math -fno-signed-zeros -mreassociate -freciprocal-math -fdenormal-fp-math=preserve-sign,preserve-sign -ffp-contract=fast -fno-rounding-math -ffast-math -ffinite-math-only -fno-verbose-asm -mconstructor-aliases -aux-target-cpu x86-64 -debug-info-kind=limited -dwarf-version=4 -debugger-tuning=gdb -fcoverage-compilation-dir=/home/axel/src/warpx -resource-dir /opt/intel/oneapi/compiler/2021.3.0/linux/lib/clang/13.0.0 -internal-isystem /opt/intel/oneapi/compiler/2021.3.0/linux/bin/../include/sycl -internal-isystem /opt/intel/oneapi/compiler/2021.3.0/linux/bin/../include -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Base -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Base/Parser -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Boundary -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/AmrCore -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/LinearSolvers/MLMG -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/LinearSolvers/Projections -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-src/Src/Particle -isystem /home/axel/src/warpx/build/_deps/fetchedamrex-build -D WARPX_DIM_3D -D WARPX_GIT_VERSION="21.06-35-gf428f5a26f11" -D WARPX_PARSER_DEPTH=24 -I /home/axel/src/warpx/Source -I /home/axel/src/warpx/build/_deps/fetchedamrex-src/Tools/C_scripts -D NDEBUG -I/opt/intel/oneapi/tbb/2021.3.0/env/../include -I/opt/intel/oneapi/mkl/2021.3.0/include -I/opt/intel/oneapi/dpl/2021.4.0/linux/include -I/opt/intel/oneapi/dev-utilities/2021.3.0/include -I/opt/intel/oneapi/compiler/2021.3.0/linux/include -internal-isystem /opt/intel/oneapi/compiler/2021.3.0/linux/bin/../compiler/include -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/x86_64-linux-gnu/c++/10 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/x86_64-linux-gnu/c++/10 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/backward -internal-isystem /opt/intel/oneapi/compiler/2021.3.0/linux/lib/clang/13.0.0/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../x86_64-linux-gnu/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /opt/intel/oneapi/compiler/2021.3.0/linux/lib/clang/13.0.0/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/10/../../../../x86_64-linux-gnu/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -std=c++17 -fdeprecated-macro -fdebug-compilation-dir=/home/axel/src/warpx -ferror-limit 19 -fheinous-gnu-extensions -fgnuc-version=4.2.1 
-fcxx-exceptions -fexceptions -mllvm -enable-gvn-hoist -fcolor-diagnostics -D__GCC_HAVE_DWARF2_CFI_ASM=1 -mllvm -disable-hir-generate-mkl-call -x c++ /home/axel/src/warpx/Source/WarpX.cpp

`sudo strace -p <pid>` and `sudo strace -s 99 -ffp <pid>` give no output.

I attached gdb to the process and received the following backtrace:

```
(gdb) attach 21530
Attaching to process 21530
Reading symbols from /opt/intel/oneapi/compiler/2021.3.0/linux/bin/clang++...
(No debugging symbols found in /opt/intel/oneapi/compiler/2021.3.0/linux/bin/clang++)
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...
Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/librt-2.31.so...
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...
Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.31.so...
Reading symbols from /opt/intel/oneapi/compiler/2021.3.0/linux/compiler/lib/intel64_lin/libimf.so...
(No debugging symbols found in /opt/intel/oneapi/compiler/2021.3.0/linux/compiler/lib/intel64_lin/libimf.so)
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...
Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libm-2.31.so...
Reading symbols from /opt/intel/oneapi/compiler/2021.3.0/linux/lib/oclfpga/host/linux64/lib/libz.so.1...
Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...
(No debugging symbols found in /lib/x86_64-linux-gnu/libgcc_s.so.1)
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...
Reading symbols from /usr/lib/debug/.build-id/e5/4761f7b554d0fcc1562959665d93dffbebdaf0.debug...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
--Type <RET> for more, q to quit, c to continue without paging--
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.31.so...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.31.so...
Reading symbols from /opt/intel/oneapi/compiler/2021.3.0/linux/compiler/lib/intel64_lin/libintlc.so.5...
(No debugging symbols found in /opt/intel/oneapi/compiler/2021.3.0/linux/compiler/lib/intel64_lin/libintlc.so.5)
0x0000561dfc6c2f62 in collectSYCLAttributes(clang::Sema&, clang::FunctionDecl*, clang::FunctionDecl const*, llvm::SmallVectorImpl<clang::Attr*>&, clang::Expr const*, bool) ()
(gdb) bt
#0  0x0000561dfc6c2f62 in collectSYCLAttributes(clang::Sema&, clang::FunctionDecl*, clang::FunctionDecl const*, llvm::SmallVectorImpl<clang::Attr*>&, clang::Expr const*, bool) ()
#1  0x0000561dfc72354f in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#2  0x0000561dfc7236ab in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#3  0x0000561dfc723621 in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#4  0x0000561dfc72360c in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#5  0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#6  0x0000561dfc723660 in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#7  0x0000561dfc723636 in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#8  0x0000561dfc7235f7 in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#9  0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#10 0x0000561dfc7236ab in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#11 0x0000561dfc723636 in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#12 0x0000561dfc7235f7 in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#13 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#14 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#15 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#16 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#17 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#18 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#19 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#20 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#21 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#22 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#23 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#24 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#25 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#26 0x0000561dfc7235cd in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#27 0x0000561dfc7236ab in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#28 0x0000561dfc7236ab in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#29 0x0000561dfc7236ab in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#30 0x0000561dfc6c6fb2 in SingleDeviceFunctionTracker::Init() ()
#31 0x0000561df9ddcbe8 in clang::Sema::MarkDevices() ()
#32 0x0000561df9dd1ebf in clang::Sema::ActOnEndOfTranslationUnitFragment(clang::Sema::TUFragmentKind) ()
#33 0x0000561df9dcebfb in clang::Sema::ActOnEndOfTranslationUnit() ()
#34 0x0000561df9dc9b94 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, bool) ()
#35 0x0000561df9dc930a in clang::ParseAST(clang::Sema&, bool, bool) ()
#36 0x0000561df8c4af4e in clang::FrontendAction::Execute() ()
#37 0x0000561df8c4a4b5 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) ()
#38 0x0000561df9f07427 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) ()
#39 0x0000561df9f03676 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) ()
#40 0x0000561df9fad9e8 in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) ()
#41 0x0000561df9b9fad6 in main ()
```

Re-attaching a couple of times also shows these stack tops:

```
#0  0x0000561df8ced051 in clang::Decl::getAttrs() const ()
#1  0x0000561dfc6c2e65 in collectSYCLAttributes(clang::Sema&, clang::FunctionDecl*, clang::FunctionDecl const*, llvm::SmallVectorImpl<clang::Attr*>&, clang::Expr const*, bool) ()
#2  0x0000561dfc72354f in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
...
#0  0x0000561dfc72369d in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
#1  0x0000561dfc7236ab in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
...
#0  0x0000561dfc6cdae3 in bool llvm::isa<clang::IntelReqdSubGroupSizeAttr, clang::ReqdWorkGroupSizeAttr, clang::SYCLIntelKernelArgsRestrictAttr, clang::SYCLIntelNumSimdWorkItemsAttr, clang::SYCLIntelSchedulerTargetFmaxMhzAttr, clang::SYCLIntelMaxWorkGroupSizeAttr, clang::SYCLIntelMaxGlobalWorkDimAttr, clang::SYCLIntelNoGlobalWorkOffsetAttr, clang::SYCLSimdAttr, clang::Attr*>(clang::Attr* const&) ()
#1  0x0000561dfc6c2fe5 in std::__1::back_insert_iterator<llvm::SmallVectorImpl<clang::Attr*> > std::__1::copy_if<clang::Attr**, std::__1::back_insert_iterator<llvm::SmallVectorImpl<clang::Attr*> >, collectSYCLAttributes(clang::Sema&, clang::FunctionDecl*, clang::FunctionDecl const*, llvm::SmallVectorImpl<clang::Attr*>&, clang::Expr const*, bool)::$_5>(clang::Attr**, clang::Attr**, std::__1::back_insert_iterator<llvm::SmallVectorImpl<clang::Attr*> >, collectSYCLAttributes(clang::Sema&, clang::FunctionDecl*, clang::FunctionDecl const*, llvm::SmallVectorImpl<clang::Attr*>&, clang::Expr const*, bool)::$_5) ()
#2  0x0000561dfc6c2e77 in collectSYCLAttributes(clang::Sema&, clang::FunctionDecl*, clang::FunctionDecl const*, llvm::SmallVectorImpl<clang::Attr*>&, clang::Expr const*, bool) ()
#3  0x0000561dfc72354f in SingleDeviceFunctionTracker::VisitCallNode(clang::CallGraphNode*, clang::Expr const*, llvm::SmallVectorImpl<clang::FunctionDecl*>&) ()
...
```

@rscohn2 (Contributor, Author) commented Jul 1, 2021

I reproduced the problem and filed a ticket.

@ax3l (Member) commented Jul 1, 2021

Thanks a lot! 👍

@rscohn2 (Contributor, Author) commented Jul 6, 2021

@ax3l: Here is the fix for the slowdown: intel/llvm#4065

@ax3l (Member) commented Jul 7, 2021

@rscohn2 that's fantastic, thanks a lot for triaging this! 🚀 ✨

@rscohn2 (Contributor, Author) commented Jul 20, 2021

@ax3l: I verified that it is working with a compiler release from GitHub: https://github.com/intel/llvm/releases/tag/sycl-nightly%2F20210718

It is likely to be in update 4 because it was fixed relatively early in the release cycle.

@ax3l (Member) commented Jul 20, 2021

@rscohn2 thank you, that's fantastic!

@WeiqunZhang also found a way to reduce our use of recursive functions at the end of last week via #2063. So for testing, definitely use the same commit as before, since we just merged a commit that changed the behavior of the code in that routine.

@ax3l (Member) commented Aug 5, 2021

@rscohn2 Since about yesterday, we quite often see the error

```
E: Failed to fetch https://apt.repos.intel.com/oneapi/dists/all/main/binary-all/Packages.bz2  File has unexpected size (14108 != 14031). Mirror sync in progress? [IP: 23.67.98.161 443]
   Hashes of expected file:
    - Filesize:14031 [weak]
    - SHA512:ad515304fdb583aa0f1dc222ff20a07729312f4823517f4e76bdb8c56215a62400422f568e534cb6c02da97773342b2d6ad80ccba5de87ab720c55181458fd69
    - SHA256:967eafcdbc4920aa3f527460b0167b01927d09db9a924ee6b71c9361a145a5c8
    - SHA1:33a7ce12d8c2914321b1afcc426de3cd6db8b6b0 [weak]
    - MD5Sum:106531c8237b370084f8522f5bcc1250 [weak]
   Release file created at: Fri, 30 Jul 2021 18:30:25 +0000
E: Some index files failed to download. They have been ignored, or old ones used instead.
```

when starting up our Ubuntu CI on GitHub Actions and downloading the oneAPI deb packages.

Is it possible the Intel CDN servers are not in a consistent state, or that a release is in progress?
I am restarting quite a lot of jobs per day because of this.

Our current setup looks like this:
https://github.com/ECP-WarpX/WarpX/blob/development/.github/workflows/intel.yml
https://github.com/ECP-WarpX/WarpX/blob/development/.github/workflows/dependencies/dpcpp.sh

@rscohn2 (Contributor, Author) commented Aug 6, 2021

I asked about the failures and will let you know.

GitHub Actions has a feature where you can cache an install. It will speed up the install, and I suspect it will also make it more reliable, since it saves a tarball in Azure storage. Here is an example:
Add a cache action: https://github.com/KhronosGroup/SYCL_Reference/blob/1b68fb78c263f55ad2ef02d65e1e732210e43357/.github/workflows/checks.yml#L30

And then make the install conditional on the cache restore failing:
https://github.com/KhronosGroup/SYCL_Reference/blob/1b68fb78c263f55ad2ef02d65e1e732210e43357/.github/workflows/checks.yml#L37

Here is a successful restore:
https://github.com/KhronosGroup/SYCL_Reference/runs/3243653747?check_suite_focus=true

@rscohn2 (Contributor, Author) commented Aug 6, 2021

No updates are going on. Could it be the disk filling up? It may be that you sometimes get a VM with less disk space. Can you run `df -h` immediately before and after?
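For example, a quick check at the top and bottom of the install script:

```
df -h /   # free space before the oneAPI install
# ... install steps ...
df -h /   # and after, to see how close the runner gets to full
```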

@rscohn2 (Contributor, Author) commented Aug 6, 2021

It turns out some packages (not the ones you are using) were updated, and the index file can be out of sync. They are trying to improve the reliability. If GitHub caching works for you, it would probably avoid the problem.

@ax3l (Member) commented Aug 6, 2021

@rscohn2 thank you, that's good to know! Thank you also for looking into the sync/reliability on updates of the index.

I was not using caching yet, so we get a new release as soon as it drops, but that's a good idea to consider.

@rscohn2 (Contributor, Author) commented Nov 15, 2022

@ax3l: I published a GitHub Action that installs oneAPI with caching and pruning of unnecessary files. Unless you request a specific version, you will get the latest: https://github.com/marketplace/actions/setup-oneapi. It caches /opt/intel/oneapi at the end of the action, so you can manually prune as well.
