Update Particle Container to Pure SoA #3850
*/
struct PIdx
{
enum {
w = 0, ///< weight
#if !defined (WARPX_DIM_1D)
x,
We could even call this `r` for RZ sims :)
y,
#endif
z,
w, ///< weight
ux, uy, uz,
#ifdef WARPX_DIM_RZ
theta, ///< RZ needs all three position components
We could move this forward in the enum to `r,z,theta`.
And even better, we should make `theta` a runtime parameter (see reason in AMReX-Codes/pyamrex#243)
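The ordering suggested in this review thread can be sketched as follows. This is only an illustration under stated assumptions: `PIdxSketch` and the `SKETCH_DIM_RZ` macro are hypothetical stand-ins (WarpX selects the geometry with macros such as `WARPX_DIM_RZ`), the 1D case is omitted, and `theta` is kept as a compile-time component here rather than the runtime parameter proposed above.

```cpp
// Hypothetical switch standing in for WarpX's WARPX_DIM_RZ macro.
#define SKETCH_DIM_RZ

// Illustrative compile-time index table for particle components.
// Not the actual WarpX PIdx definition.
struct PIdxSketch
{
    enum {
#ifdef SKETCH_DIM_RZ
        r = 0, z, theta,  // position components grouped first, as suggested
#else
        x = 0, y, z,      // Cartesian 3D ordering
#endif
        w,                // weight
        ux, uy, uz,       // momentum components
        nattribs          // total number of compile-time components
    };
};

// The position components come first, so the weight follows at index 3.
static_assert(PIdxSketch::w == 3, "weight follows the three positions");
static_assert(PIdxSketch::nattribs == 7, "seven components in total");
```

Grouping the position components at the front keeps their indices contiguous regardless of geometry, which is the point of the reordering suggestion.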
427aaa5 to 11adef6
53725c5 to 5055fef
103a487 to 08b3ff2
Transition particle containers to pure SoA layouts. Co-authored-by: Andrew Myers <atmyers@lbl.gov>
08b3ff2 to 2ba8ec0
2ba8ec0 to eb6dbe9
Congrats to all who helped @Thierry992, @atmyers, @AlexanderSinn et al. 🎉 👏
@atmyers @roelof-groenewald feel free to give this a final review and then we will merge :)
Congrats! This is awesome! Besides the (upcoming) performance improvements, it will make writing particle routines much easier going forward.
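For readers unfamiliar with the term, the difference between an array-of-structs (AoS) and a pure struct-of-arrays (SoA) layout can be sketched in plain C++. This is a minimal toy under stated assumptions, not the AMReX or WarpX container API: `ParticleAoS`, `ParticlesSoA`, and `push_positions` are hypothetical names, and the update is a simplified non-relativistic position push.

```cpp
#include <vector>

// AoS: one record per particle, all components interleaved in memory.
struct ParticleAoS { double x, y, z, w, ux, uy, uz; };

// Pure SoA: one contiguous array per component, no per-particle struct.
struct ParticlesSoA
{
    std::vector<double> x, y, z, w, ux, uy, uz;

    explicit ParticlesSoA(int n)
        : x(n), y(n), z(n), w(n), ux(n), uy(n), uz(n) {}

    int size() const { return static_cast<int>(x.size()); }
};

// Toy position update: each loop iteration touches only the arrays it
// needs, and neighbouring particles are adjacent in memory.
inline void push_positions(ParticlesSoA& p, double dt)
{
    const int n = p.size();
    double* x = p.x.data();
    double* y = p.y.data();
    double* z = p.z.data();
    const double* ux = p.ux.data();
    const double* uy = p.uy.data();
    const double* uz = p.uz.data();
    for (int i = 0; i < n; ++i) {
        x[i] += ux[i] * dt;
        y[i] += uy[i] * dt;
        z[i] += uz[i] * dt;
    }
}
```

A routine written against the SoA form addresses every component the same way (by array and index), which is one reason particle routines become simpler to write.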
Yes, and as nice Python GPU goodies we can now:
This reverts commit 94ae119.
Transition to new, purely SoA particle containers. This was originally merged in ECP-WarpX#3850 and reverted in ECP-WarpX#4652, since we discovered issues with losing particles and laser particles on GPU.
Merge coming for the 24.03 release series in #4653
* AMReX & pyAMReX: Latest `development`

  More pure SoA and id handling goodness.

* Particle Container to Pure SoA Again

  Transition to new, purely SoA particle containers. This was originally merged in #3850 and reverted in #4652, since we discovered issues with losing particles and laser particles on GPU.

* Modernize `idcpu` Treatment

  - faster: fewer emitted operations, no jumps
  - cheaper: fewer registers used
  - safer: no read-before-write warnings
  - cooler: no explanation needed
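The "no jumps" point in the `idcpu` item can be illustrated with a branch-free packing of two integers into one 64-bit word. This is only a sketch under loud assumptions: it assumes `idcpu` denotes a single 64-bit value combining a particle id and a CPU number, and the 24-bit split, the function names, and the absence of any sign or validity bit are all hypothetical choices for illustration, not AMReX's actual encoding.

```cpp
#include <cstdint>

// Hypothetical layout: low 24 bits hold the cpu, the remaining high bits the id.
constexpr int cpu_bits = 24;
constexpr std::uint64_t cpu_mask = (std::uint64_t{1} << cpu_bits) - 1;

// Pack with a shift and a mask only: no conditionals, hence no jumps.
constexpr std::uint64_t pack_idcpu(std::uint64_t id, std::uint64_t cpu)
{
    return (id << cpu_bits) | (cpu & cpu_mask);
}

// Unpack each field with a single shift or mask.
constexpr std::uint64_t unpack_id(std::uint64_t idcpu)
{
    return idcpu >> cpu_bits;
}

constexpr std::uint64_t unpack_cpu(std::uint64_t idcpu)
{
    return idcpu & cpu_mask;
}

// Round trip checked at compile time.
static_assert(unpack_id(pack_idcpu(42, 7)) == 42, "id survives the round trip");
static_assert(unpack_cpu(pack_idcpu(42, 7)) == 7, "cpu survives the round trip");
```

Because the packed word is always written as a whole, no field is read before it has been written, which matches the spirit of the "safer" item above.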
## Summary

For performance reasons, `int`/`long` are better index types: unlike unsigned types, the compiler does not have to account for over/underflow wrap-around, so loops vectorize better.

Also, I see narrowing warnings when casting from `int` to AMReX's `unsigned int` in WarpX.

## Additional background

Seen with clang-tidy in ECP-WarpX/WarpX#3850

## Checklist

The proposed changes:
- [ ] fix a bug or incorrect behavior in AMReX
- [ ] add new capabilities to AMReX
- [ ] changes answers in the test suite to more than roundoff level
- [ ] are likely to significantly affect the results of downstream AMReX users
- [ ] include documentation in the code and/or rst files, if appropriate
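The index-type argument in the summary above can be made concrete with a small sketch. The function names `scale` and `get_unsigned` are hypothetical and do not correspond to any AMReX interface; the second function merely stands in for an API that takes an `unsigned int` index.

```cpp
// Loop with a signed index. Signed overflow is undefined behaviour in C++,
// so the compiler may assume `i` never wraps around when it vectorizes.
inline void scale(double* data, int n, double a)
{
    for (int i = 0; i < n; ++i) {
        data[i] *= a;
    }
}

// Hypothetical accessor with an unsigned index parameter.
inline double get_unsigned(const double* data, unsigned int i)
{
    return data[i];
}

// Calling the unsigned accessor with an `int` converts implicitly, which
// tools such as clang-tidy can flag; an explicit cast makes the intent visible.
inline double get_from_int(const double* data, int i)
{
    return get_unsigned(data, static_cast<unsigned int>(i));
}
```

If the accessor took an `int` instead, the cast in `get_from_int` would disappear, which is the change the summary argues for.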
Transition particle containers to pure SoA layouts.
- `Increment`, `IncrementWithTotal`, `Checkpoint`, `WritePlotFile`, `Restart`; SoA Particle `atomicSetID`
- `Python_restart_runtime_components` (in AMReX restarts)

Fun Mini-Benchmarks on CPU, DP & SP
Same before/after PR on my laptop
We expect to need to do more work for vectorizing in the coming months.
Fun Mini-Benchmarks on GPU, DP
~1-2% faster total runtime (RT) on a Perlmutter A100.
Current deposition: same
GatherAndPush: 1-2%
Redistribute_partition: 12%
AddPlasma: same
PushP: same
SortParticlesForDeposition: 230%