Add a new MachEnv class to hold various machine and messaging parameters (#26)

add MachEnv code for MPI and other machine environment variables

  - includes source code and a unit test
  - includes sections for User and Developer guides
philipwjones authored Aug 2, 2023
1 parent 31af54c commit 6004902
Showing 6 changed files with 1,228 additions and 0 deletions.
69 changes: 69 additions & 0 deletions components/omega/doc/devGuide/MachEnv.md
@@ -0,0 +1,69 @@
(omega-dev-mach-env)=

## Machine Environment (MachEnv)

The MachEnv class maintains a number of parameters associated with
the parallel machine environment. These include message-passing parameters
like MPI communicators and task information, the number of threads
if threaded, vector length for CPUs, and GPU and node information as
needed. Multiple environments can be defined to support running portions
of the model on subsets of tasks.

At model initialization, the routine `OMEGA::MachEnv::init()`
is called to set up the default MachEnv, which can be retrieved at
any time using:
```c++
OMEGA::MachEnv *DefEnv = OMEGA::MachEnv::getDefaultEnv();
```
This call returns a pointer to the default environment.
Once an environment has been retrieved, the individual data members
are accessed using various get functions:
```c++
MPI_Comm Comm = DefEnv->getComm();
int MyTask = DefEnv->getMyTask();
int NumTasks = DefEnv->getNumTasks();
int MasterTask = DefEnv->getMasterTask();
```
There is also a logical function `isMasterTask` that can be used
for work that should only be done on the master task. By default,
the master task is defined as task 0 in the environment, but if
the master task is overloaded, a `setMasterTask` function can
designate any other task in the group as the master.
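
The master-task behavior described above can be illustrated with a minimal
sketch. The class `SketchEnv` is a hypothetical stand-in, not the OMEGA
class: it models only the default master of 0, the `isMasterTask` test and
the `setMasterTask` override, and it takes the task IDs as plain integers
rather than obtaining them from MPI.
```c++
#include <cassert>

// Hypothetical stand-in for MachEnv (an assumption for illustration only).
// It models just the master-task logic and makes no MPI calls.
class SketchEnv {
 public:
   SketchEnv(int MyTaskIn, int NumTasksIn)
       : MyTask(MyTaskIn), NumTasks(NumTasksIn), MasterTask(0) {
      assert(MyTaskIn >= 0 && MyTaskIn < NumTasksIn);
   }

   int getMyTask() const { return MyTask; }
   int getNumTasks() const { return NumTasks; }
   int getMasterTask() const { return MasterTask; }

   // True only on the task currently designated as the master.
   bool isMasterTask() const { return MyTask == MasterTask; }

   // Redefine the master; the new master must be a task in this group.
   void setMasterTask(int NewMaster) {
      assert(NewMaster >= 0 && NewMaster < NumTasks);
      MasterTask = NewMaster;
   }

 private:
   int MyTask;     // ID of the local task
   int NumTasks;   // total number of tasks in the environment
   int MasterTask; // ID of the master task, 0 by default
};
```
Guarding a block with `if (Env.isMasterTask())` then restricts that work
to a single task, whichever task is currently the master.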

A `getNumThreads` function returns the number of threads if OMEGA has
been built with OpenMP threading and returns 1 if threading is not on.
The MachEnv also has a public parameter `OMEGA::VecLength` that can
be used to tune the vector length for CPU architectures. For
GPU builds, this VecLength is set to 1.
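
As a rough sketch of how these two settings might be consumed, the code
below derives both from the build flags `-DOMEGA_THREADED` and
`-DOMEGA_VECTOR_LENGTH=xx`. The namespace `Sketch`, the function
`blockedSum` and the fallback length of 8 are assumptions made for this
example, as is the use of `omp_get_max_threads()` to obtain the thread
count; none of it is taken from the OMEGA source.
```c++
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef OMEGA_THREADED
#include <omp.h>
#endif

// The fallback value is an assumption for this sketch only; a real build
// would pass -DOMEGA_VECTOR_LENGTH=xx on the compile line.
#ifndef OMEGA_VECTOR_LENGTH
#define OMEGA_VECTOR_LENGTH 8
#endif

namespace Sketch {

constexpr int VecLength = OMEGA_VECTOR_LENGTH;

// Returns the OpenMP thread count in threaded builds and 1 otherwise.
inline int getNumThreads() {
#ifdef OMEGA_THREADED
   return omp_get_max_threads();
#else
   return 1;
#endif
}

// Sums an array in blocks of VecLength so that the inner loop has a fixed,
// compile-time trip count that a compiler can vectorize.
inline double blockedSum(const std::vector<double> &Data) {
   assert(VecLength > 0);
   const std::size_t Len = static_cast<std::size_t>(VecLength);
   const std::size_t N   = Data.size();
   double Total          = 0.0;
   std::size_t Start     = 0;
   for (; Start + Len <= N; Start += Len) { // full blocks
      for (std::size_t I = 0; I < Len; ++I)
         Total += Data[Start + I];
   }
   for (; Start < N; ++Start) // remainder shorter than one block
      Total += Data[Start];
   return Total;
}

} // namespace Sketch
```
With a vector length of 1, as in GPU builds, the blocked loop reduces to
an ordinary element-by-element loop.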

As noted previously, additional environments can be defined for
subsets of a parent environment. There are three constructor
interfaces for creating an environment:
```c++
OMEGA::MachEnv MyNewEnv(Name, ParentEnv, NewSize);
OMEGA::MachEnv MyNewEnv(Name, ParentEnv, NewSize, Begin, Stride);
OMEGA::MachEnv MyNewEnv(Name, ParentEnv, NewSize, Tasks);
```
An optional additional argument exists for all three that can supply an
alternative task to use as the Master Task for message passing. If not
provided, the master task defaults to 0. The first interface above
creates a new environment from the first `NewSize` contiguous
tasks in the parent machine environment. The second creates a new
environment from a strided set of tasks starting at the `Begin` Task
(e.g., all odd tasks would have Begin=1 and Stride=2). The final
interface creates a subset containing the selected tasks stored in
a vector `Tasks`. Each new environment is given a `Name` and can be
retrieved by name using `OMEGA::MachEnv::getEnv(Name)`. Once retrieved,
the other get functions listed above are used to get each data member.
Because the new environments are subsets of the default environment, one
additional function `OMEGA::MachEnv::isMember()` is provided so that
non-member tasks can be excluded from calculations. Retrieval functions
called from non-member tasks return invalid values. Finally, there is
a `removeEnv` function that can delete any defined environment.
As a class that is basically a container for the environment parameters,
the implementation is a simple class with several scalar data members and
the retrieval/creation functions noted above. To track all defined
environments, a C++ map container is used to pair a name with each
defined environment.
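
The task-selection rules and the name-based lookup described above can be
sketched without MPI. Everything in the `Sketch` namespace below is
hypothetical and only models the bookkeeping: which parent task IDs each
constructor form would select, a membership test in the spirit of
`isMember`, and a map that pairs a name with each environment. The actual
class also creates communicators, which this sketch omits.
```c++
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace Sketch {

// Parent task IDs for NewSize tasks starting at Begin and spaced by Stride.
inline std::vector<int> selectStrided(int NewSize, int Begin, int Stride) {
   assert(NewSize >= 0 && Begin >= 0 && Stride > 0);
   std::vector<int> Tasks;
   Tasks.reserve(NewSize);
   for (int I = 0; I < NewSize; ++I)
      Tasks.push_back(Begin + I * Stride);
   return Tasks;
}

// The first NewSize contiguous tasks are the strided case with Begin=0 and
// Stride=1.
inline std::vector<int> selectContiguous(int NewSize) {
   return selectStrided(NewSize, 0, 1);
}

// Hypothetical record of one subset environment.
struct SubsetEnv {
   std::vector<int> Tasks; // parent task IDs that belong to the subset
   int MasterTask;         // master task within the subset

   // Is the given parent task part of this subset?
   bool isMember(int ParentTask) const {
      return std::find(Tasks.begin(), Tasks.end(), ParentTask) != Tasks.end();
   }
};

// Registry that pairs a name with each defined environment.
inline std::map<std::string, SubsetEnv> &allEnvs() {
   static std::map<std::string, SubsetEnv> Envs;
   return Envs;
}

// Store a subset under Name; the master task defaults to 0.
inline void createEnv(const std::string &Name, std::vector<int> Tasks,
                      int MasterTask = 0) {
   allEnvs()[Name] = SubsetEnv{std::move(Tasks), MasterTask};
}

// Retrieve by name; returns nullptr if no such environment exists.
inline const SubsetEnv *getEnv(const std::string &Name) {
   auto It = allEnvs().find(Name);
   return It == allEnvs().end() ? nullptr : &It->second;
}

// Delete a defined environment; removing an unknown name is a no-op.
inline void removeEnv(const std::string &Name) { allEnvs().erase(Name); }

} // namespace Sketch
```
For example, `Sketch::createEnv("Odd", Sketch::selectStrided(4, 1, 2))`
records parent tasks 1, 3, 5 and 7 under the name `Odd`, and an explicit
`Tasks` vector can be passed to `createEnv` directly for the third form.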
2 changes: 2 additions & 0 deletions components/omega/doc/index.md
@@ -23,6 +23,7 @@ Development is taking place at https://github.com/E3SM-Project/Omega.
userGuide/QuickStart
userGuide/OmegaBuild
userGuide/DataTypes
userGuide/MachEnv
```

```{toctree}
@@ -31,6 +32,7 @@ userGuide/DataTypes
devGuide/QuickStart
devGuide/DataTypes
devGuide/MachEnv
devGuide/CondaEnv
devGuide/Docs
devGuide/BuildDocs
24 changes: 24 additions & 0 deletions components/omega/doc/userGuide/MachEnv.md
@@ -0,0 +1,24 @@
(omega-user-mach-env)=

## Machine Environment (MachEnv)

Within OMEGA, many aspects of the machine environment are stored
in a class called MachEnv. These include message-passing parameters
like MPI communicators and task information, the number of threads
if threaded, vector length for CPUs, and GPU and node information as
needed. Multiple environments are supported in case portions of the
code need to run on subsets of tasks or in different contexts. A
default environment is defined early in the initialization of the
model and can be retrieved as described in the
[Developer's Guide](#omega-dev-mach-env).

The user is not expected to set any parameters in MachEnv. All
quantities are derived either from the job launch command
(e.g., mpirun or srun), which defines the number of MPI tasks and the
task layout, or from machine parameters enforced during the build based
on supported machine XML configurations.
The latter include the preprocessor definitions
`-DOMEGA_VECTOR_LENGTH=xx` and `-DOMEGA_THREADED`, which define an
optimal vector length for CPU code and turn on OpenMP threading
if desired.
