Extend the number of unique particles per cpu we can have at once. #1315

atmyers · 2020-08-26T23:01:18Z

Currently, we use two signed 32-bit integers to store each particle's id number and the rank it was generated on. This allows unique combinations of 'id' and 'cpu' numbers to be generated without any communication between ranks. However, this wastes some space: it's unlikely that 2**31-1 MPI ranks will be used any time soon, while the same limit for the id has actually been overflowed in real-world WarpX simulations. To address this, in this PR, we still use 64 bits to represent the combination of (id, cpu), but we devote 40 bits to the id and only 24 to the cpu. This allows ~0.5 trillion unique particles on each of 16.7 million MPI ranks, which should be good enough for the foreseeable future.
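As a rough illustration of the 40/24 split described above, here is a minimal sketch of packing and unpacking a single 64-bit word. The exact bit positions, the sign-magnitude encoding of the id, and all names (`pack_idcpu`, `unpack_id`, `unpack_cpu`) are assumptions made for this example and are not taken from the AMReX source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout for illustration only (not the actual AMReX code):
//   bit  63     : sign of the id (1 = negative)
//   bits 24..62 : magnitude of the id (39 bits)
//   bits 0..23  : cpu (MPI rank)
constexpr int cpu_bits = 24;
constexpr int id_bits  = 39;
constexpr std::uint64_t cpu_mask = (std::uint64_t{1} << cpu_bits) - 1;  // 16,777,215
constexpr std::uint64_t id_mask  = (std::uint64_t{1} << id_bits) - 1;   // 549,755,813,887
constexpr std::uint64_t sign_bit = std::uint64_t{1} << 63;

// Pack a signed id and a non-negative cpu into one 64-bit word.
inline std::uint64_t pack_idcpu(std::int64_t id, int cpu) noexcept
{
    const std::uint64_t mag = static_cast<std::uint64_t>(id < 0 ? -id : id);
    assert(mag <= id_mask);                                        // id magnitude must fit in 39 bits
    assert(cpu >= 0 && static_cast<std::uint64_t>(cpu) <= cpu_mask);  // cpu must fit in 24 bits
    return (id < 0 ? sign_bit : 0) | (mag << cpu_bits) | static_cast<std::uint64_t>(cpu);
}

// Recover the signed id from the packed word.
inline std::int64_t unpack_id(std::uint64_t w) noexcept
{
    const std::int64_t mag = static_cast<std::int64_t>((w >> cpu_bits) & id_mask);
    return (w & sign_bit) ? -mag : mag;
}

// Recover the cpu from the low 24 bits.
inline int unpack_cpu(std::uint64_t w) noexcept
{
    return static_cast<int>(w & cpu_mask);
}
```

Under these assumptions, one of the 40 id bits carries the sign, which is why the usable range is about 2**39 (roughly 0.55 trillion) ids per rank rather than 2**40.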

The proposed changes:

- [ ] fix a bug or incorrect behavior in AMReX
- [x] add new capabilities to AMReX
- [ ] changes answers in the test suite to more than roundoff level
- [ ] are likely to significantly affect the results of downstream AMReX users
- [ ] are described in the proposed changes to the AMReX documentation, if appropriate

MaxThevenet · 2020-08-27T07:10:43Z

In my test on this branch (the standard output contains AMReX (20.08-97-ge99860c8e34c) initialized), I still get the same error at the same time step:

STEP 13749 starts ...
amrex::Abort::1::ERROR: overflow on particle id numbers !!!
SIGABRT

See the Backtrace.

Currently this test crashes after 30 min on 2 V100 GPUs. I can make a reproducer that crashes faster.

MaxThevenet · 2020-08-27T08:02:39Z

@atmyers This input file is a single-GPU WarpX reproducer where the issue appears after 1802 iterations (2 minutes on 1 V100). Here the number of particles injected should be roughly 256 * 256 * 32 * 1802 ≈ 3.8 billion.
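To put the estimate above next to the two id limits discussed in this PR, here is a small compile-time check. The helper `fits_in_bits` is a hypothetical name introduced for this example, and the 39-bit figure assumes that one of the 40 id bits is used for the sign.

```cpp
#include <cassert>
#include <cstdint>

// True if the non-negative count n fits in `bits` magnitude bits.
// (Hypothetical helper, written for this example.)
constexpr bool fits_in_bits(std::int64_t n, int bits) noexcept
{
    return n >= 0 && (static_cast<std::uint64_t>(n) >> bits) == 0;
}

// Particle count estimate from the comment: 256 * 256 * 32 * 1802.
constexpr std::int64_t injected = std::int64_t{256} * 256 * 32 * 1802;

static_assert(injected == 3779067904LL, "product of the four factors");
static_assert(!fits_in_bits(injected, 31), "exceeds 2**31-1, the signed 32-bit id limit");
static_assert(fits_in_bits(injected, 39), "well within a 39-bit id magnitude");
```

The estimate is about 1.8 times the signed 32-bit limit of 2,147,483,647, and uses less than 1% of the extended range.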

If you want to make it even faster, you can increase the number of ppc for plasma_e. If you encounter memory issues, you can decrease the number of cells longitudinally AND decrease the physical size of the domain in the longitudinal direction accordingly (so dz remains small, so you still inject ~1 cell per time step).

atmyers · 2020-08-27T15:28:07Z

Yes, changes are needed on the WarpX side as well, which still uses int for the pid. This PR just removes the restriction from AMReX.

atmyers · 2020-08-27T15:45:09Z

I believe this PR to WarpX should do it: ECP-WarpX/WarpX#1266

WeiqunZhang · 2020-08-27T16:16:36Z

Src/Particle/AMReX_Particle.H

+        // zero out the first 24 bits, which are used to store the cpu number
+        m_idata &= (~ 0x00FFFFFF);
+
+        AMREX_ASSERT(cpu > 0);
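For context, the masking step in the snippet above can be sketched as a standalone setter that clears the low 24 bits and then writes the new cpu into them. The function name `set_cpu`, the constant name, and the choice to accept rank 0 in the assertion are assumptions of this sketch, not the code under review.

```cpp
#include <cassert>
#include <cstdint>

// Mask covering the low 24 bits, assumed here to hold the cpu number.
constexpr std::uint64_t cpu_field_mask = 0x00FFFFFFu;

// Overwrite the cpu field of a packed (id, cpu) word, leaving the id bits untouched.
// (Hypothetical free function, written for this example.)
inline void set_cpu(std::uint64_t& idata, int cpu) noexcept
{
    // This sketch treats rank 0 as valid and rejects values wider than 24 bits.
    assert(cpu >= 0 && static_cast<std::uint64_t>(cpu) <= cpu_field_mask);
    idata &= ~cpu_field_mask;                  // zero out the low 24 bits
    idata |= static_cast<std::uint64_t>(cpu);  // store the new cpu number
}
```

Declaring the mask as a 64-bit unsigned constant keeps the complement a 64-bit value with all upper bits set, so the id bits survive the `&=`.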


MaxThevenet · 2020-08-27T16:18:46Z

Oh yes, of course, I just resubmitted the test on both branches. Thanks!

WeiqunZhang · 2020-08-27T16:52:39Z

Src/Particle/AMReX_Particle.H

+    }
+
+    AMREX_GPU_HOST_DEVICE
+    operator long () noexcept


This could be a const function.

WeiqunZhang · 2020-08-27T16:53:28Z

Src/Particle/AMReX_Particle.H

+    }
+
+    AMREX_GPU_HOST_DEVICE
+    operator int () noexcept


Could be a const function.
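The two review comments above can be illustrated with a small sketch: a conversion operator that only reads the packed word can be marked const, which lets it be used through const references. The type `IdView`, the helper `read_id`, and the simplified unsigned layout are assumptions for this example; the `AMREX_GPU_HOST_DEVICE` annotation from the snippet is left out so the code stands alone.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical read-only view of a packed (id, cpu) word.
// The id is assumed to sit above the low 24 cpu bits; the sign is ignored for brevity.
struct IdView
{
    std::uint64_t m_idata;

    // const: converting to an integer does not modify the packed word.
    operator std::int64_t () const noexcept
    {
        return static_cast<std::int64_t>(m_idata >> 24);
    }
};

// Takes the view by const reference and relies on the implicit conversion.
inline std::int64_t read_id(const IdView& v) noexcept
{
    return v;  // would not compile if the conversion operator were non-const
}
```

Without the const qualifier, any caller holding a const particle or a const reference to the wrapper could not read the id.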

MaxThevenet · 2020-08-27T17:12:50Z

Yes, this fixes the issue for me. Thanks!

atmyers requested a review from WeiqunZhang August 26, 2020 23:01

atmyers added 4 commits August 26, 2020 16:16

some int -> Long

b90f15a

remove extra ;

c60f9b9

some AMREX_GPU_HOST_DEVICE and noexcept

caba883

int64_t -> uint64_t

e99860c

WeiqunZhang reviewed Aug 27, 2020

atmyers added 2 commits August 27, 2020 09:37

use operator Long insead of long

042017e

> -> >=

7872ec0

atmyers added 3 commits August 27, 2020 10:00

make the operator Long () functions const

0511fdf

add ()

70e7d7e

cast to long before taking negative

91e560e

WeiqunZhang approved these changes Aug 27, 2020

WeiqunZhang merged commit 5e9dbc1 into AMReX-Codes:development Aug 27, 2020

ax3l mentioned this pull request Aug 31, 2020

Only tag particles for splitting when we change levels if splitting is on. ECP-WarpX/WarpX#1276

Merged

atmyers mentioned this pull request Aug 31, 2020

Update benchmarks for id changes ECP-WarpX/WarpX#1278

Merged

kweide added a commit to ECP-Astro/amrex that referenced this pull request Sep 28, 2020

Revert "Extend the number of unique particles per cpu we can have at …

174b6ab

…once. (AMReX-Codes#1315)" This reverts commit d91e0ae.

sayerhs mentioned this pull request Nov 14, 2020

d/fcompare fix sayerhs/amrex#1

Closed

atmyers mentioned this pull request Sep 30, 2022

Python Interface needs to be updated for expanded particle range ECP-WarpX/WarpX#3443

Closed
