CI: Trap NaNs, Divisions by Zero, Overflows #2205
CI tests that crashed or failed:
Regarding the CI test collisionXZ, I ran it locally, on this branch, with WarpX/Source/Particles/Collision/ElasticCollisionPerez.H Lines 97 to 98 in 51c42e3
I think the same is happening with the CI test collisionXYZ. @RemiLehe @Yin-YinjianZhao
Update: Bug fix in #2225.
Regarding the CI test momentum-conserving-gather, there are some NaNs in the arrays. For example, looking at WarpX/Source/Parallelization/WarpXComm.cpp Line 213 in 263d962
and bx_c is computed from Btmp here: WarpX/Source/Parallelization/WarpXComm.cpp Line 205 in 263d962
which, in turn, is computed here WarpX/Source/Parallelization/WarpXComm.cpp Lines 181 to 189 in 263d962
So I think these NaNs have something to do with the recent changes in #2144. I can confirm that the NaNs disappear if we revert the changes in #2144 (namely if we replace |
@EZoni Thanks for adding this PR, that's a great idea!
@NeilZaim Thank you for your comments. Your solution seems like a logical and robust one, even though it might come at a small performance price, as you say. One alternative, since as you noted this likely does not affect the valid cells, is simply to make sure that the |
Regarding the CI test LaserIonAcc2d, the value
and computed shortly above as GetPosition(i, x, y, z); seems to be NaN, already for i = 0 and right when these diags are called at initialization, within WarpX::InitData.
@ax3l @atmyers I think it might have something to do with what happens inside |
Regarding the CI tests ionization_lab and ionization_boost, there is a division by zero here: WarpX/Source/Particles/ElementaryProcess/Ionization.H Lines 127 to 129 in 263d962
Does anybody know how this formula should be modified when
Update: Bug fix in #2214.
Awesome, love it! 🤩
I think that's a general 2D issue with filters: In
but maybe the parser still touches/copies the scalar |
Yes, in Parser's operator(), all arguments are copied into an internal array. That will touch |
Regarding the CI tests embedded_boundary_cube and particle_absorption, the arrays WarpX/Source/FieldSolver/FiniteDifferenceSolver/EvolveB.cpp Lines 118 to 120 in 263d962
have NaNs in certain slices. For example, for embedded_boundary_cube I see Sx(48,:,:) = nan, Sy(:,48,:) = nan and Sz(:,:,48) = nan, and for particle_absorption I see Sx(64,:,:) = nan, Sy(:,64,:) = nan and Sz(:,:,64) = nan.
@lgiacome Would you be able to understand why this happens (namely why those last slices in the arrays inside Some more hints: WarpX/Source/EmbeddedBoundary/WarpXInitEB.cpp Lines 188 to 191 in 263d962
is missing the last points that correspond to the uninitialized slices above. Is it possible that amrex::convert(box, amrex::Box(face).ixType()) is not the right box (with the right index type) that we want to use here?
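To illustrate the kind of check that exposes slices such as Sx(48,:,:) = nan, here is a minimal standalone sketch. It does not use the AMReX or WarpX data structures: the function name `nan_slices_x`, the flat `std::vector` storage, and the x-fastest index ordering are all assumptions made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of WarpX or AMReX): returns the x-indices i
// for which every value in the slice a(i,:,:) is NaN.
// Assumed layout: flat storage with x as the fastest-varying index,
// i.e. a(i,j,k) is stored at position i + nx*(j + ny*k).
std::vector<std::size_t> nan_slices_x(const std::vector<double>& a,
                                      std::size_t nx, std::size_t ny, std::size_t nz)
{
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < nx; ++i) {
        bool all_nan = true;
        for (std::size_t k = 0; k < nz && all_nan; ++k) {
            for (std::size_t j = 0; j < ny; ++j) {
                if (!std::isnan(a[i + nx * (j + ny * k)])) {
                    all_nan = false;  // found a regular value, slice is not fully NaN
                    break;
                }
            }
        }
        if (all_nan) {
            result.push_back(i);
        }
    }
    return result;
}
```

Applied to an array whose last x-slice was never initialized (and therefore filled with NaN), this would report only that last index, which matches the pattern described above.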
@EZoni Thanks for pointing this out. I remember we set them to be -1 on purpose before. But apparently it is not good even if it does not affect the algorithm later on. I will try to modify something to avoid NaNs.
Hi @EZoni!
Thank you, @lgiacome! I just tested the PR for the ECT solver for this particular issue, by running the CI tests embedded_boundary_cube and particle_absorption with the runtime option |
I am working on two fixes for LaserIonAcc2d:
Thank you for proposing and triaging this, that's fantastic! 🎉
Thank you to everyone involved in the rapid bug fixes 🚀 ✨
Trap NaNs, divisions by zero, and overflows when running our CI tests.