
C++17: Work-Around NVCC gatherParticles #2596

Merged — 1 commit merged into ECP-WarpX:development on Nov 24, 2021

Conversation


@ax3l ax3l commented Nov 23, 2021

The `noexcept` lambda does not compile in C++17 mode (#2300) due to an NVCC compiler bug, at least in NVCC 11.3.109.

The same code compiles in C++14 mode with the same compiler.

Error:
```
nvcc_internal_extended_lambda_implementation:732:77: error: ‘operator()’ is not a member of ‘int (ParticleBoundaryBuffer::gatherParticles(MultiParticleContainer&, const amrex::Vector<const amrex::MultiFab*>&)::<lambda(const SrcData&, int)>::*)(const amrex::ConstParticleTileData<0, 0, 4, 0>&, int) const noexcept’
Source/Particles/ParticleBoundaryBuffer.cpp: In member function ‘void ParticleBoundaryBuffer::gatherParticles(MultiParticleContainer&, const amrex::Vector<const amrex::MultiFab*>&)’:
Source/Particles/ParticleBoundaryBuffer.cpp:245:34: error: no matching function for call to ‘__nv_hdl_create_wrapper_t<false, false, __nv_dl_tag<void (ParticleBoundaryBuffer::*)(MultiParticleContainer&, const amrex::Vector<const amrex::MultiFab*>&), &ParticleBoundaryBuffer::gatherParticles, 1>, const GetParticlePosition, amrex::Array4<const float>, amrex::GpuArray<float, 3>, amrex::GpuArray<float, 3> >::__nv_hdl_create_wrapper(ParticleBoundaryBuffer::gatherParticles(MultiParticleContainer&, const amrex::Vector<const amrex::MultiFab*>&)::<lambda(const SrcData&, int)>, const GetParticlePosition&, amrex::Array4<const float>&, amrex::GpuArray<float, 3>&, amrex::GpuArray<float, 3>&)’
  245 |                     },
      |                                  ^
```

Thx to @atmyers for the hint :)

To Do

  • file an Nvidia bug report (no.: 3447924) and ping @maxpkatz on it :)
Build configuration used:
```shell
cmake -S . -B build -DWarpX_COMPUTE=CUDA -DWarpX_OPENPMD=ON -DAMReX_CUDA_ARCH=6.0 -DWarpX_EB=ON -DWarpX_PRECISION=SINGLE -DWarpX_PSATD=ON -DAMReX_CUDA_ERROR_CROSS_EXECUTION_SPACE_CALL=ON -DAMReX_CUDA_ERROR_CAPTURE_THIS=ON -DCMAKE_VERBOSE_MAKEFILE=ON
cmake --build build
```

@ax3l ax3l added backend: cuda Specific to CUDA execution (GPUs) workaround labels Nov 23, 2021
@ax3l ax3l requested a review from atmyers November 23, 2021 18:16
@EZoni EZoni merged commit 7a3283c into ECP-WarpX:development Nov 24, 2021
@ax3l ax3l deleted the workaround-nvccNoExceptCXX17 branch November 24, 2021 16:40
@ax3l ax3l mentioned this pull request Nov 24, 2021
25 tasks
```diff
@@ -232,7 +232,8 @@ void ParticleBoundaryBuffer::gatherParticles (MultiParticleContainer& mypc,
     int timestep = warpx_instance.getistep(0);
     using SrcData = WarpXParticleContainer::ParticleTileType::ConstParticleTileDataType;
     auto count = amrex::filterAndTransformParticles(ptile_buffer, ptile,
-                [=] AMREX_GPU_HOST_DEVICE (const SrcData& /*src*/, const int ip) noexcept
+                [=] AMREX_GPU_HOST_DEVICE (const SrcData& /*src*/, const int ip)
+                /* NVCC 11.3.109 chokes in C++17 on this: noexcept */
```
ax3l (Member Author)

Submitted as Nvidia developer bug no. 3447924

dpgrote pushed a commit to dpgrote/WarpX that referenced this pull request Nov 29, 2021
@ax3l ax3l mentioned this pull request Nov 30, 2021
roelof-groenewald added a commit to ModernElectron/WarpX that referenced this pull request Dec 3, 2021
* Fix Init of Vector Members (ECP-WarpX#2595)

Fix default init of `Vector` member variables. The old construct
is not valid C++.

https://stackoverflow.com/a/11491003/2719194

* C++17: Work-Around NVCC gatherParticles (ECP-WarpX#2596)

The `noexcept` lambda does not compile in C++17 mode due to an NVCC compiler bug, at least in NVCC 11.3.109. Compiles in C++14 mode with the same compiler.

* requirements.txt - PICMI development version (ECP-WarpX#2588)

Document in `requirements.txt` on how to install a pre-release
version of PICMI.

* CONTRIBUTING: Update/Modernize (ECP-WarpX#2600)

* CONTRIBUTING: Update/Modernize

- Add GitHub account setup
- Add local git setup
- Modernize for CMake workflows

* Apply suggestions by Edoardo

Co-authored-by: Edoardo Zoni <59625522+EZoni@users.noreply.github.com>

* Replaced duplicated current deposition documentation (ECP-WarpX#2604)

* Throwing a warning if particle_shape>1 with EB (ECP-WarpX#2592)

* Aborting if particle_shape!=1 with EB

* Throw warning instead of aborting

* Checking at runtime if EB is initialized

* Added missing preprocessor directive

* Ignoring an unused variable

* Fix typo

* Improve style

Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>

* Fix segfault when importing _libwarpx without initializing WarpX (ECP-WarpX#2580)

* Added check for if warpx was initialized when calling finalize

* Renamed to be warpx_initialized

* Fixed reference to global variable

Co-authored-by: Peter Scherpelz <31747262+peterscherpelz@users.noreply.github.com>

* Changed global variable to member of libwarpx

* Fixed syntax errors

* Remove custom arg from argv to avoid parmparse error

Co-authored-by: Peter Scherpelz <31747262+peterscherpelz@users.noreply.github.com>

* Added parallel pragma to ApplyBoundaryConditions (ECP-WarpX#2612)

* Note that CCache 4.2 introduced large CUDA improvements (ECP-WarpX#2606)

* Dimensionality Docs: Default (ECP-WarpX#2609)

Just adds the note that 3D is the default geometry.

* AMReX: Weekly Update (ECP-WarpX#2613)

* MergeBuffersForPlotfile: Barrier (ECP-WarpX#2608)

Make sure that all MPI ranks are in sync, i.e., have closed the
files that they wrote, before trying to merge them.

* Fix installation location for libraries (ECP-WarpX#2583)

During configuration the installation location for libraries is given by
dumping the cmake variable `CMAKE_INSTALL_LIBDIR`.
This commit adjusts the installation of WarpX libraries (WarpX_LIB=ON)
to respect this setting.

Co-authored-by: Rolf Pfeffertal <tropf@users.noreply.github.com>

* Release 21.12 (ECP-WarpX#2614)

* AMReX: 21.12

* PICSAR: 21.12

* WarpX: 21.12

* div(E,B) Cleaning Options for PSATD (ECP-WarpX#2403)

* Implement div(E)/div(B) Cleaning with Standard PSATD

* Cleaning

* Update Benchmark

* Add Nodal Synchronization of F,G

* OneStep_multiJ: Nodal Syncs, Damp PML

* OneStep_multiJ: Push PSATD Fields in PML

* div Cleaning Defaults (Domain v. PML)

* Include Fix of ECP-WarpX#2429 until Merged

* Reset Benchmark of Langmuir_multi_psatd_div_cleaning

* Multi-J: Remove PML Support

* Include Fix of ECP-WarpX#2474 Until Merged

* Exchange All Guard Cells for F,G

* Fix Defaults

* Update Test, Reset Benchmark

* Fix Defaults

* Cleaning

* Default update_with_rho=1 if do_dive_cleaning=1

* Update CI Test pml_psatd_dive_divb_cleaning

* Replace Warning with Abort

* Add 2D Langmuir Test w/ MR & PSATD (ECP-WarpX#2605)

* Add 2D Langmuir Test w/ MR & PSATD

* Add Missing Compile String

* Fix out-of-bound in Inverse FFT of F,G (ECP-WarpX#2619)

* Mention that the potential should be constant inside EB (ECP-WarpX#2618)

* Mention that the potential should be constant inside EB

* Update text

* Replace AMREX_SPACEDIM: Boundary & Parallelization (ECP-WarpX#2620)

* AMREX_SPACEDIM : Boundary Conditions
* AMREX_SPACEDIM : Parallelization
* Fix compilation
* Update Source/Parallelization/WarpXComm_K.H

* Fix out-of-bound in the initialization of EB (ECP-WarpX#2607)

* Call FillBoundary when initializing EB

* Avoid out-of-bound

* Bug fix

* Apply suggestions from code review

* update version number

Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
Co-authored-by: Edoardo Zoni <59625522+EZoni@users.noreply.github.com>
Co-authored-by: Remi Lehe <remi.lehe@normalesup.org>
Co-authored-by: Lorenzo Giacomel <47607756+lgiacome@users.noreply.github.com>
Co-authored-by: Kevin Z. Zhu <86268612+KZhu-ME@users.noreply.github.com>
Co-authored-by: Peter Scherpelz <31747262+peterscherpelz@users.noreply.github.com>
Co-authored-by: David Grote <grote1@llnl.gov>
Co-authored-by: Phil Miller <phil@intensecomputing.com>
Co-authored-by: s9105947 <80697868+s9105947@users.noreply.github.com>
Co-authored-by: Rolf Pfeffertal <tropf@users.noreply.github.com>
Co-authored-by: Prabhat Kumar <89051199+prkkumar@users.noreply.github.com>
@ax3l (Member Author) commented Feb 2, 2023

Good news! Nvidia reported on May 16, 2022 that they fixed the issue (bug no. 3447924) in a "future release".

So this workaround should no longer be needed with CUDA 12+ 🎉
