scalability issue in MCT when called from CPL:RUN #1107

worleyph · 2016-10-20T15:15:58Z

This issue is to document MCT PR number 38.

For high-resolution ACME runs using large MPI process counts, the time spent in CPL:RUN can be very large (larger than the cost of ATM on Mira, for example). Investigation showed that the existing MPI algorithms in the rearrange_ routine can be inefficient. The swapm variant of the MPI_Alltoallv operator was ported from PIO1 and modified to work in the MCT environment, and an option to call it was added to rearrange_ for calls originating from sMatAvMult_SMPlus_. Experiments indicate that applying this change to all calls to rearrange_ is not as efficient, but other call sites may also benefit from the new algorithm (TBD). In at least one high-resolution case, using the swapm algorithm decreases CPL:RUN cost by a factor of 5 at high process counts. This will be critical for near-term production runs.
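For readers unfamiliar with the idea, here is a minimal Python sketch of a pairwise-exchange all-to-all, assuming swapm follows the usual pattern of replacing one collective call with a sequence of point-to-point swaps between pairs of ranks. It is an illustration only, not the MCT or PIO1 code: the XOR partner ordering, the function names (`ceil_pow2`, `swap_schedule`, `simulate_alltoallv`), and the in-memory simulation are all hypothetical, and no real MPI calls are made.

```python
from typing import List


def ceil_pow2(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def swap_schedule(nprocs: int, rank: int) -> List[int]:
    """Return the ordered list of partners that `rank` swaps with.

    Assumed ordering: at step k the partner is rank XOR k. The pairing is
    symmetric (the partner's partner at step k is `rank` again), so each
    step is a set of disjoint two-way swaps. Steps whose partner falls
    outside 0..nprocs-1 are skipped.
    """
    partners = []
    for k in range(1, ceil_pow2(nprocs)):
        partner = rank ^ k
        if partner < nprocs:
            partners.append(partner)
    return partners


def simulate_alltoallv(send: List[List[list]]) -> List[List[list]]:
    """Simulate an all-to-all with variable-sized messages.

    send[src][dst] is the (possibly empty) list that src sends to dst.
    Returns recv, where recv[dst][src] is what dst received from src.
    """
    n = len(send)
    recv = [[[] for _ in range(n)] for _ in range(n)]
    for rank in range(n):
        # Data a rank sends to itself is a local copy, not a message.
        recv[rank][rank] = list(send[rank][rank])
        for partner in swap_schedule(n, rank):
            # In a real MPI code this would be one paired send/receive.
            recv[rank][partner] = list(send[partner][rank])
    return recv


if __name__ == "__main__":
    # With 5 ranks, rank 3 swaps with 3^1, 3^2, 3^3, then 3^7 (others skipped).
    print(swap_schedule(5, 3))  # prints [2, 1, 0, 4]
```

Because every rank meets every other rank exactly once, the result matches what an all-to-all would deliver. Any flow control, handshaking, or request limits that a production routine might apply are deliberately left out of this sketch.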

Long term, the routines calling rearrange_ should be modified to allow the user to specify an MPI algorithm and communication protocol, but this will require more extensive modifications to MCT.

The MCT PR is a bit rough, and MCT developers may need to rework this a bit to make it conform to MCT coding style and conventions. Hopefully it is a high priority for @rljacob :-).

bishtgautam · 2016-11-30T17:24:56Z

Updated the comment to include the URL of the MCT PR.

worleyph · 2017-07-04T00:40:01Z

I believe that this optimization has been merged into ACME from MCT.

worleyph added the enhancement, Critical, BFB (PR leaves answers BFB), Coupled Model, and CIME labels Oct 20, 2016

worleyph assigned rljacob Oct 20, 2016

worleyph mentioned this issue Oct 20, 2016

scalability issue in MCT when initializing ROF #1108

Closed

worleyph closed this as completed Jul 4, 2017

rljacob mentioned this issue Jul 4, 2017

Update2 MCT subtree to MCT_2.10.beta #1452

Merged
