-
Notifications
You must be signed in to change notification settings - Fork 0
Threading optimization: bfb framework changes by Abhinav #1237
Threading optimization: bfb framework changes by Abhinav #1237
Conversation
@akturner @mgduda @matthewhoffman Could each of you please do a test merge (or cherry pick) of these changes and make sure they pass your standard tests? It was bit-for-bit on the ocean model nightly regression suite using gnu. |
This passed the MPAS-Seaice tests for bfb. |
This passes MPASLI bfb. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look fine to me. I'm able to get bit-identical results with these changes, and there are apparently no bad interactions with the OpenMP strategy that we're using in MPAS-Atmosphere. I haven't found a measurable speedup with these changes, but this is likely because MPAS-Atmosphere is not using any scratch fields or the "new" halo exchange methods at present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have any threaded code in the LI core, so the fact that this passes the LI regression suite is sufficient reason for me to approve this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests bfb with the sea ice core. Sea ice has no threading so otherwise doesnt affect us.
This commit DID pass the ACME PET test, which means a bit-for-bit comparison with and without threading. Horray! process: I started with the ACME-Dev/master, it does pass PET. Added this single commit to the MPAS-O submodule, and it again passes PET Here are the locations:
WAIT - I did not update all three cores. Rerunning test, we shall see... |
This only updates to: Threading optimization: bfb framework changes by Abhinav #1237 MPAS-Dev/MPAS#1237
8019269
to
62e69d7
Compare
|
@@ -265,25 +266,26 @@ subroutine mpas_reconstruct_2d(meshPool, u, uReconstructX, uReconstructY, uRecon | |||
enddo ! iCell | |||
!$omp end do | |||
|
|||
call mpas_threading_barrier() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philipwjones From my git diff command, it looks like mpas_vector_reconstruction.F
was the only file you changed in #1285. Specifically, this threading barrier is the only substantive line, others are just spacing. Please check that these changes look like your intended changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct @mark-petersen . Sorry for the cosmetic changes - guess I'm a little obsessive-compulsive and can't resist lining some things up. Looking at the other ocean-related bfb mods, I suspect your previous non-passing of PET ACME tests were possibly related to your branch not having some of the updated mods from the previous non-deterministic fixes. At least one of those mods was in that commit. Anyway, have moved on to testing the mods for the ocean bfb threading PR next.
… ocean/develop With the addition of a new threaded reconstruction routine in #1237, openMP directives surrounding calls to this routine in the ocean core needed to be removed for consistency.
@philipwjones just to verify, this updated version, along with #1286, passed the ACME PET test on all the machines? Are you done testing? |
Still trying to get a working configuration on Titan - CIME5.2 still has a few bugs. Hope to have something running pretty soon. I'm sufficiently confident though, so if you need to go ahead with this, I think it'll be fine. |
We group our MPAS updates into ACME, which makes rolling back a single PR difficult. I will wait until you've completed the Titan tests. That has the additional advantage that if you have some minor updates to this PR required by Titan, you can simply push it to my fork as an extra commit. |
@mark-petersen The previous commits in MPAS (framework + consistency mods) pass the PET tests on Titan so you can go ahead with those. However, one of the routines in the second round of ocean b4b mods is failing. Have it narrowed down to somewhere in the barotropic solve mods, so will hopefully have that second round of threading improvements ready to go early next week. |
Great, thanks a lot @philipwjones. The log module should be merged onto ACME Monday. Then I will rebase and do a few more quick tests. Then I will rebase the other two PRs as well. Anything fixes you find before then can just be pushed to my branches with an additional commit. |
This adds an mpas_threading_barrier. Originally added in PR #1285
62e69d7
to
28487f7
Compare
Rebased and tested again, due to substantial changes with new log module since this was created. |
merged into ACME via PR #1630 |
This PR replaces #1151, and includes only bfb changes to framework. This includes: