Workaround to bypass issue observed at very large scale with Fujitsu MPI (AMReX-Codes#2874)

We have observed MPI issues at very large scale when WarpX is compiled with Fujitsu MPI (i.e., with the Fujitsu compiler). These issues seem to be related to the use of MPI_Gatherv with MPI_Datatype. This PR implements a possible workaround, initially proposed by @WeiqunZhang. The idea is that, when WarpX is compiled with the Fujitsu compiler, the routine where the issue was observed exchanges simpler integer arrays instead of using MPI_Datatype.
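The idea described above can be sketched without MPI. The following is only an illustration under stated assumptions, not the code of this commit: `kDim` stands in for `AMREX_SPACEDIM`, `Cell` stands in for `amrex::IntVect`, and `scale_to_ints` and `emulate_gatherv_ints` are hypothetical helpers. The serial `emulate_gatherv_ints` copies raw ints the way a gatherv of plain `int` would place them on the root rank, with counts and offsets scaled from units of cells to units of ints.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumption: kDim stands in for AMREX_SPACEDIM and Cell for amrex::IntVect.
constexpr int kDim = 3;
using Cell = std::array<int, kDim>;
static_assert(sizeof(Cell) == kDim * sizeof(int),
              "Cell must be exactly kDim packed ints");

// Convert per-rank counts (or offsets) from units of Cell to units of int.
std::vector<int> scale_to_ints(const std::vector<int>& in_cells)
{
    std::vector<int> in_ints(in_cells.size());
    std::transform(in_cells.begin(), in_cells.end(), in_ints.begin(),
                   [](int n) { return n * kDim; });
    return in_ints;
}

// Hypothetical serial stand-in for a gatherv of plain ints: the data of
// "rank" r lands at offsets_int[r] (measured in ints) in the global buffer.
void emulate_gatherv_ints(const std::vector<std::vector<Cell>>& per_rank,
                          const std::vector<int>& counts_int,
                          const std::vector<int>& offsets_int,
                          std::vector<Cell>& global)
{
    assert(counts_int.size() == per_rank.size());
    assert(offsets_int.size() == per_rank.size());
    auto* dst = reinterpret_cast<unsigned char*>(global.data());
    for (std::size_t r = 0; r < per_rank.size(); ++r) {
        if (counts_int[r] == 0) { continue; }  // nothing to send from this rank
        std::memcpy(dst + offsets_int[r] * sizeof(int),
                    per_rank[r].data(),
                    counts_int[r] * sizeof(int));
    }
}
```

Note that the scaled counts and offsets are still stored as `int`, so this approach assumes that the number of cells multiplied by the dimension fits in an `int`.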
lucafedeli88 authored Jul 8, 2022
1 parent 7660c88 commit a633d2b
Showing 1 changed file with 17 additions and 0 deletions.
17 changes: 17 additions & 0 deletions Src/AmrCore/AMReX_TagBox.cpp
@@ -649,7 +649,24 @@ TagBoxArray::collate (Gpu::PinnedVector<IntVect>& TheGlobalCollateSpace) const
//
const IntVect* psend = (count > 0) ? TheLocalCollateSpace.data() : nullptr;
IntVect* precv = TheGlobalCollateSpace.data();

//Issues have been observed with the following call at very large scale when using
//Fujitsu MPI. The issue seems to be related to the use of MPI_Datatype. We can
//bypass the issue by exchanging simpler integer arrays.
#ifndef __FUJITSU
ParallelDescriptor::Gatherv(psend, count, precv, countvec, offset, IOProcNumber);
#else
const int* psend_int = psend->begin();
int* precv_int = precv->begin();
Long count_int = count * AMREX_SPACEDIM;
auto countvec_int = std::vector<int>(countvec.size());
auto offset_int = std::vector<int>(offset.size());
const auto mul_funct = [](const auto el){return el*AMREX_SPACEDIM;};
std::transform(countvec.begin(), countvec.end(), countvec_int.begin(), mul_funct);
std::transform(offset.begin(), offset.end(), offset_int.begin(), mul_funct);
ParallelDescriptor::Gatherv(
psend_int, count_int, precv_int, countvec_int, offset_int, IOProcNumber);
#endif

#else
TheGlobalCollateSpace = std::move(TheLocalCollateSpace);