Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add command-line args to the Step class and a ModelStep base class #430

Closed
wants to merge 14 commits into from

Conversation

xylar
Copy link
Collaborator

@xylar xylar commented Aug 17, 2022

For task parallelism, it is necessary for steps to use a consistent number of resources. A compass step can no longer have a python piece followed by an MPI-parallel piece.

Currently, combination python and MPI steps are common in compass. Many such steps involve setting up a model run in some way in python before running the model with MPI. More complex steps may run the model multiple times, performing python operations between each model run. Other such steps include those that generate mapping files (using ESMF_RegridWeightGen run with MPI).

To support task parallelism, it will be necessary to break these steps into smaller steps, each either running python code or MPI code. This is necessary because each step will be called with srun and the specified resources (e.g. ntasks, cpus_per_task, or sometimes both). The python code must be run with ntasks = 1, while the MPI code typically is run with ntasks > 1. @altheaden and I took a good deal of time to investigate breaking steps in to "substeps" to support different numbers of resources. Ultimately, the complexity of both the concepts and the code itself was worse than if we just break current steps into multiple steps with unique resources.

As a significant step toward this goal, this PR:

  1. Adds an args attribute to the Step class. If args is defined, the step's run() method isn't called (and shouldn't be overridden). Instead, the contents of args should be a list of command-line arguments to be called with srun (or mpirun on a laptop) and an appropriate set of resources. Running the model is an obvious example of this, as we discuss next.
  2. Adds a ModelStep class for reducing the amount of redundancy involved in making steps that run the MPAS model. This class descends from Step and takes care of work like creating graph and partition files, updating PIO namelist options, etc. that are needed before running the model. It also automatically creates the args attribute used to run the model itself.
  3. Migrates many (but by no means all) of the ocean test groups to using ModelStep

Migrated test groups:

  • baroclinic_channel
  • drying_slope
  • global_convergence
  • gotm
  • hurricane
  • internal_waves
  • planar_convergence
  • soma
  • sphere_transport
  • spherical_harmonic_transform
  • ziso

Not migrated (typically because of the complexity of SSH adjustment for ice-shelf cavities):

  • global_ocean
  • ice_shelf_2d
  • isomip_plus

@pep8speaks
Copy link

pep8speaks commented Aug 17, 2022

Hello @xylar! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-08-30 13:54:23 UTC

@xylar xylar added enhancement New feature or request ocean in progress This PR is not ready for review or merging framework labels Aug 17, 2022
@xylar xylar force-pushed the add_commandline_steps branch from c617478 to f642129 Compare August 17, 2022 12:46
@xylar xylar self-assigned this Aug 17, 2022
@xylar
Copy link
Collaborator Author

xylar commented Aug 17, 2022

Testing

The pr suite is bit-for-bit with master using this branch on Anvil with Intel and Intel-MPI.

Further testing is needed to make sure the following test groups that are not in pr are still working:
Migrated test groups:

  • drying_slope
  • gotm
  • hurricane
  • internal_waves
  • planar_convergence
  • sphere_transport
  • spherical_harmonic_transform

@xylar xylar force-pushed the add_commandline_steps branch from f642129 to 4311d89 Compare August 30, 2022 13:50
xylar and others added 13 commits August 30, 2022 15:53
This class defines a step for running the MPAS model as part of
a test case.

Eventually, each step that currently calls `run_model()` should
instead descend from `ModelStep`
Also separate initial_state into initial_state and mesh
This involves breaking the `initial_state` step into 4 steps:
* convert base mesh
* compute initial condition (including land mask) on base mesh
* cull mesh
* compute initial condition on culled mesh
@xylar xylar force-pushed the add_commandline_steps branch from 4311d89 to 5c66ffc Compare August 30, 2022 13:54
@xylar xylar mentioned this pull request Dec 13, 2022
@xylar
Copy link
Collaborator Author

xylar commented Jan 6, 2023

I'm going to close this PR and move this work to Polaris once we have a repo for it.

@xylar xylar closed this Jan 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request framework in progress This PR is not ready for review or merging ocean
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants