Add a new MachEnv class to hold various machine and messaging parameters #26

Merged
merged 7 commits into from
Aug 2, 2023
69 changes: 69 additions & 0 deletions components/omega/doc/devGuide/MachEnv.md
@@ -0,0 +1,69 @@
(omega-dev-mach-env)=

## Machine Environment (MachEnv)

The MachEnv class maintains a number of parameters associated with
the parallel machine environment. These include message-passing parameters
like MPI communicators and task information, the number of threads
if threaded, vector length for CPUs, and GPU and node information as
needed. Multiple environments can be defined to support running portions
of the model on subsets of tasks.

During model initialization, the routine `OMEGA::MachEnv::init()`
is called to set up the default MachEnv, which can be retrieved at
any time using:
```c++
OMEGA::MachEnv *DefEnv = OMEGA::MachEnv::getDefaultEnv();
```
This call returns a pointer to the default environment.
Once an environment has been retrieved, its individual data members
are accessed through get functions:
```c++
MPI_Comm Comm = DefEnv->getComm();
int MyTask = DefEnv->getMyTask();
int NumTasks = DefEnv->getNumTasks();
int MasterTask = DefEnv->getMasterTask();
```
There is also a logical function `isMasterTask` that can be used
for work that should only be done on the master task. By default,
the master task is task 0 in the environment. If that task is
overloaded, `setMasterTask` can designate any other task in the
group as the master.
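
The master-task guard described above might be used as in the following
minimal sketch. `MockEnv` and `masterOnlyMessage` are hypothetical stand-ins
written for illustration, not the OMEGA class; only the function names come
from this guide, and the `setMasterTask` signature is an assumption.

```c++
#include <cassert>
#include <string>

// Hypothetical stand-in for a machine environment; NOT the OMEGA class.
// Only the member function names are taken from the guide.
class MockEnv {
 public:
   MockEnv(int InMyTask, int InNumTasks)
       : MyTask(InMyTask), NumTasks(InNumTasks), MasterTask(0) {
      assert(InMyTask >= 0 && InMyTask < InNumTasks);
   }
   int getMyTask() const { return MyTask; }
   int getNumTasks() const { return NumTasks; }
   int getMasterTask() const { return MasterTask; }
   bool isMasterTask() const { return MyTask == MasterTask; }
   // Assumed signature: takes the ID of the task that becomes the master.
   void setMasterTask(int NewMaster) {
      assert(NewMaster >= 0 && NewMaster < NumTasks);
      MasterTask = NewMaster;
   }

 private:
   int MyTask;     // ID of this task within the environment
   int NumTasks;   // total number of tasks in the environment
   int MasterTask; // ID of the master task (0 by default)
};

// Builds a status line on the master task only; all other tasks
// get an empty string and therefore have nothing to print.
std::string masterOnlyMessage(const MockEnv &Env) {
   if (!Env.isMasterTask())
      return "";
   return "running on " + std::to_string(Env.getNumTasks()) + " tasks";
}
```

With a real environment the same pattern would wrap I/O or diagnostics in
an `if (Env->isMasterTask())` block so that only one task performs them.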

The `getNumThreads` function returns the number of threads if OMEGA
has been built with OpenMP threading, and 1 if threading is not on.
The MachEnv also has a public parameter `OMEGA::VecLength` that can
be used to tune the vector length for CPU architectures. For
GPU builds, `VecLength` is set to 1.
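
As an illustration of how a compile-time vector length can shape a loop,
the sketch below sums an array in blocks of `VecLength` elements. The value
8 and the function `blockedSum` are assumptions made for this example; in
OMEGA the constant would be `OMEGA::VecLength`.

```c++
#include <cassert>

// Assumed value for illustration; OMEGA sets its own VecLength at build time.
constexpr int VecLength = 8;

// Sums N values by accumulating into VecLength independent lanes, a loop
// shape that compilers can typically vectorize, then handles the remainder.
double blockedSum(const double *Data, int N) {
   assert(N >= 0 && (Data != nullptr || N == 0));
   double Lanes[VecLength] = {0.0};
   const int NFull = (N / VecLength) * VecLength; // elements in full blocks
   for (int Start = 0; Start < NFull; Start += VecLength) {
      for (int Lane = 0; Lane < VecLength; ++Lane)
         Lanes[Lane] += Data[Start + Lane];
   }
   double Sum = 0.0;
   for (int Lane = 0; Lane < VecLength; ++Lane)
      Sum += Lanes[Lane];
   for (int I = NFull; I < N; ++I) // leftover elements past the last block
      Sum += Data[I];
   return Sum;
}
```

With `VecLength` equal to 1, as in GPU builds, the inner loop collapses to
a single iteration and the code behaves like an ordinary scalar loop.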

As noted previously, additional environments can be defined for
subsets of a parent environment. There are three interfaces for
creating an environment:
```c++
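// First NewSize contiguous tasks of the parent environment
OMEGA::MachEnv MyNewEnv =
    OMEGA::MachEnv::createNewEnv(Name, ParentEnv, NewSize);
// Strided subset of NewSize tasks starting at task Begin
OMEGA::MachEnv MyNewEnv =
    OMEGA::MachEnv::createNewEnv(Name, ParentEnv, NewSize, Begin, Stride);
// Explicit subset of NewSize tasks listed in the vector Tasks
OMEGA::MachEnv MyNewEnv =
    OMEGA::MachEnv::createNewEnv(Name, ParentEnv, NewSize, Tasks);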
```
The first creates a new environment from the first `NewSize` contiguous
tasks in the parent machine environment. The second creates a new
environment from a strided set of tasks starting at the `Begin` task
(e.g., all odd tasks would have `Begin=1` and `Stride=2`). The final
interface creates a subset containing the selected tasks stored in
a vector `Tasks`. Each new environment is given a `Name` and can be
retrieved by name using `OMEGA::MachEnv::getEnv(Name)`. Once retrieved,
the get functions listed above return each data member.
Because the new environments are subsets of the default environment, one
additional function `OMEGA::MachEnv::isMember()` is provided so that
non-member tasks can be excluded from calculations. Retrieval functions
called from non-member tasks return invalid values. Finally, a
`removeEnv` function can delete any defined environment.
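
The three ways of selecting tasks can be sketched without MPI as plain
index arithmetic. The helpers `stridedTasks`, `contiguousTasks` and
`containsTask` below are hypothetical and are not part of OMEGA; they only
show which parent-task IDs each interface would select under the
description above.

```c++
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: parent-task IDs of a strided subset of NewSize tasks.
std::vector<int> stridedTasks(int ParentSize, int NewSize, int Begin,
                              int Stride) {
   assert(NewSize > 0 && Begin >= 0 && Stride > 0);
   // The last selected task must still exist in the parent environment.
   assert(Begin + (NewSize - 1) * Stride < ParentSize);
   std::vector<int> Tasks(NewSize);
   for (int I = 0; I < NewSize; ++I)
      Tasks[I] = Begin + I * Stride;
   return Tasks;
}

// The contiguous case is the strided case with Begin=0 and Stride=1.
std::vector<int> contiguousTasks(int ParentSize, int NewSize) {
   return stridedTasks(ParentSize, NewSize, 0, 1);
}

// Membership test in the spirit of isMember(): is MyTask in the subset?
bool containsTask(const std::vector<int> &Tasks, int MyTask) {
   return std::find(Tasks.begin(), Tasks.end(), MyTask) != Tasks.end();
}
```

For example, `stridedTasks(8, 4, 1, 2)` selects the odd tasks 1, 3, 5 and 7
of an eight-task parent, matching the `Begin=1`, `Stride=2` case in the
text.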

Because MachEnv is essentially a container for the environment parameters,
the implementation is a simple class with several scalar data members and
the retrieval/creation functions noted above. To track all defined
environments, a C++ map container is used to pair a name with a pointer to
a defined environment.
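
A name-to-pointer registry of this kind might look like the sketch below.
`EnvRecord` and `EnvRegistry` are hypothetical names, and the use of
`std::map` with `std::unique_ptr` is an assumption made for the example;
the guide only states that a map pairs a name with a pointer.

```c++
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the data held by one environment.
struct EnvRecord {
   int NumTasks;
};

// Hypothetical registry that pairs each name with an owned environment.
class EnvRegistry {
 public:
   // Creates a named entry; returns nullptr if the name is already taken.
   EnvRecord *create(const std::string &Name, int NumTasks) {
      assert(NumTasks > 0);
      auto Result = AllEnvs.emplace(
          Name, std::make_unique<EnvRecord>(EnvRecord{NumTasks}));
      return Result.second ? Result.first->second.get() : nullptr;
   }
   // Looks up an entry by name; returns nullptr if it does not exist.
   EnvRecord *get(const std::string &Name) const {
      auto It = AllEnvs.find(Name);
      return It == AllEnvs.end() ? nullptr : It->second.get();
   }
   // Removes an entry; returns true if something was erased.
   bool remove(const std::string &Name) { return AllEnvs.erase(Name) > 0; }

 private:
   std::map<std::string, std::unique_ptr<EnvRecord>> AllEnvs;
};
```

Owning the entries through `std::unique_ptr` means that erasing a name also
frees the environment it referred to, which is one way a `removeEnv`-style
function could be kept simple.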

2 changes: 2 additions & 0 deletions components/omega/doc/index.md
@@ -23,6 +23,7 @@ Development is taking place at https://github.com/E3SM-Project/Omega.
userGuide/QuickStart
userGuide/OmegaBuild
userGuide/DataTypes
userGuide/MachEnv
```

```{toctree}
@@ -31,6 +32,7 @@ userGuide/DataTypes

devGuide/QuickStart
devGuide/DataTypes
devGuide/MachEnv
devGuide/CondaEnv
devGuide/Docs
devGuide/BuildDocs
24 changes: 24 additions & 0 deletions components/omega/doc/userGuide/MachEnv.md
@@ -0,0 +1,24 @@
(omega-user-mach-env)=

## Machine Environment (MachEnv)

Within OMEGA, many aspects of the machine environment are stored
in a class called MachEnv. These include message-passing parameters
like MPI communicators and task information, the number of threads
if threaded, vector length for CPUs, and GPU and node information as
needed. Multiple environments are supported in case portions of the
code need to run on subsets of tasks or in different contexts. A
default environment is defined early in the initialization of the
model and can be retrieved as described in the
[Developer's Guide](#omega-dev-mach-env).

The user is not expected to set any parameters in MachEnv. All
quantities are derived either from the job launch command
(e.g., `mpirun` or `srun`), which defines the number of MPI tasks and
the task layout, or from machine parameters enforced during the build
based on supported machine XML configurations.
The latter include the preprocessor parameters
`-DOMEGA_VECTOR_LENGTH=xx` and `-DOMEGA_THREADED`, which define an
optimal vector length for CPU code and turn on OpenMP threading
if desired.
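
For readers curious how such build flags typically reach the code, the
sketch below shows one common pattern for consuming them. The fallback
value of 1, the constants `VecLength` and `Threaded`, and the function
`defaultNumThreads` are assumptions made for illustration and may differ
from what OMEGA actually does.

```c++
#include <cassert>

// Assumed fallback when the build does not pass -DOMEGA_VECTOR_LENGTH=xx.
#ifndef OMEGA_VECTOR_LENGTH
#define OMEGA_VECTOR_LENGTH 1
#endif

// Compile-time vector length taken from the build flag.
constexpr int VecLength = OMEGA_VECTOR_LENGTH;
static_assert(VecLength >= 1, "vector length must be positive");

// Threading is on only when the build passes -DOMEGA_THREADED.
#ifdef OMEGA_THREADED
constexpr bool Threaded = true;
#else
constexpr bool Threaded = false;
#endif

// Hypothetical helper: a non-threaded build always reports one thread.
int defaultNumThreads(int RequestedThreads) {
   assert(RequestedThreads >= 1);
   return Threaded ? RequestedThreads : 1;
}
```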
