Add a new MachEnv class to hold various machine and messaging parameters (#26)

add MachEnv code for MPI and other machine environment variables
- includes source code and a unit test
- includes sections for User and Developer guides
1 parent 31af54c, commit 6004902

Showing 6 changed files with 1,228 additions and 0 deletions.
(omega-dev-mach-env)=

## Machine Environment (MachEnv)

The MachEnv class maintains a number of parameters associated with
the parallel machine environment. These include message-passing parameters
like MPI communicators and task information, the number of threads
if threaded, vector length for CPUs, and GPU and node information as
needed. Multiple environments can be defined to support running portions
of the model on subsets of tasks.

At model initialization, the routine `OMEGA::MachEnv::init()`
is called to set up the default MachEnv. The default environment can be
retrieved at any time using:
```c++
OMEGA::MachEnv *DefEnv = OMEGA::MachEnv::getDefaultEnv();
```
which returns a pointer to the default environment.
Once an environment has been retrieved, the individual data members
are accessed through various get functions:
```c++
MPI_Comm Comm  = DefEnv->getComm();
int MyTask     = DefEnv->getMyTask();
int NumTasks   = DefEnv->getNumTasks();
int MasterTask = DefEnv->getMasterTask();
```
There is also a logical function `isMasterTask` that can be used
for work that should only be done on the master task. By default,
the master task is defined as task 0 in the environment, but if
the master task is overloaded, a `setMasterTask` function can
designate any other task in the group as the master.
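
As an illustration of the master-task guard, the following is a minimal
sketch rather than OMEGA code. `MockEnv` and `masterOnlyReport` are
hypothetical names: `MockEnv` only mimics the accessors named above so that
the example compiles without MPI or the OMEGA headers, and its range checks
are assumptions rather than documented behavior.
```c++
#include <cassert>
#include <string>

// Hypothetical stand-in for OMEGA::MachEnv (not the real class). It mimics
// only the accessors described in the text.
class MockEnv {
 public:
   MockEnv(int Task, int Size, int Master = 0)
       : MyTask(Task), NumTasks(Size), MasterTask(Master) {
      // Assumed sanity checks; the real class may behave differently.
      assert(Size > 0 && Task >= 0 && Task < Size);
      assert(Master >= 0 && Master < Size);
   }
   int getMyTask() const { return MyTask; }
   int getNumTasks() const { return NumTasks; }
   int getMasterTask() const { return MasterTask; }
   bool isMasterTask() const { return MyTask == MasterTask; }
   void setMasterTask(int NewMaster) {
      assert(NewMaster >= 0 && NewMaster < NumTasks);
      MasterTask = NewMaster;
   }

 private:
   int MyTask;
   int NumTasks;
   int MasterTask;
};

// Build a status line on the master task only; all other tasks get an
// empty string and therefore have nothing to print.
std::string masterOnlyReport(const MockEnv &Env) {
   if (!Env.isMasterTask())
      return "";
   return "running on " + std::to_string(Env.getNumTasks()) + " tasks";
}
```
With the real class, the same pattern would presumably read
`if (DefEnv->isMasterTask()) { ... }` around any output or file access that
should happen once per environment.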

The `getNumThreads` function returns the number of threads if OMEGA
has been built with OpenMP threading and returns 1 if threading is not on.
The MachEnv also has a public parameter `OMEGA::VecLength` that can
be used to tune the vector length for CPU architectures. For
GPU builds, this VecLength is set to 1.
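
A sketch of how these two quantities might be derived and used is shown
below. It assumes the build flags `-DOMEGA_VECTOR_LENGTH=xx` and
`-DOMEGA_THREADED` described in the User's Guide; the mapping from those
flags to the constant and function here is an assumption, and
`VecLengthSketch`, `numThreadsSketch`, and `blockedAxpy` are hypothetical
names that do not appear in OMEGA.
```c++
#include <cassert>
#include <vector>
#ifdef OMEGA_THREADED
#include <omp.h>
#endif

// Assumption: fall back to a vector length of 1 (the value the text gives
// for GPU builds) when the build does not define OMEGA_VECTOR_LENGTH.
#ifndef OMEGA_VECTOR_LENGTH
#define OMEGA_VECTOR_LENGTH 1
#endif
constexpr int VecLengthSketch = OMEGA_VECTOR_LENGTH;
static_assert(VecLengthSketch > 0, "vector length must be positive");

// Hypothetical analogue of getNumThreads: the OpenMP maximum thread count
// when threading is compiled in, otherwise 1.
int numThreadsSketch() {
#ifdef OMEGA_THREADED
   return omp_get_max_threads();
#else
   return 1;
#endif
}

// Hypothetical kernel computing Y = A*X + Y in chunks of VecLengthSketch so
// that the inner loop has a short, compile-time-bounded trip count.
void blockedAxpy(double A, const std::vector<double> &X,
                 std::vector<double> &Y) {
   assert(X.size() == Y.size());
   const int N = static_cast<int>(X.size());
   for (int Start = 0; Start < N; Start += VecLengthSketch) {
      const int End =
          (Start + VecLengthSketch < N) ? Start + VecLengthSketch : N;
      for (int I = Start; I < End; ++I)
         Y[I] += A * X[I];
   }
}
```
The result of `blockedAxpy` does not depend on the chunk size; only the loop
structure changes, which is what makes the vector length a pure tuning
parameter.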

As noted previously, additional environments can be defined for
subsets of a parent environment. There are three constructor
interfaces for creating an environment:
```c++
OMEGA::MachEnv MyNewEnv(Name, ParentEnv, NewSize);
OMEGA::MachEnv MyNewEnv(Name, ParentEnv, NewSize, Begin, Stride);
OMEGA::MachEnv MyNewEnv(Name, ParentEnv, NewSize, Tasks);
```
All three interfaces accept an optional additional argument that supplies an
alternative task to use as the master task for message passing. If it is not
provided, the master task defaults to 0. The first interface above
creates a new environment from the first `NewSize` contiguous
tasks in the parent machine environment. The second creates a new
environment from a strided set of tasks starting at the `Begin` task
(e.g., all odd tasks would have Begin=1 and Stride=2). The final
interface creates a subset containing the selected tasks stored in
a vector `Tasks`. Each new environment is given a `Name` and can be
retrieved by name using `OMEGA::MachEnv::getEnv(Name)`. Once retrieved,
the get functions listed above are used to access each data member.
Because the new environments are subsets of the default environment, one
additional function `OMEGA::MachEnv::isMember()` is provided so that
non-member tasks can be excluded from calculations. Retrieval functions
called from non-member tasks return invalid values. Finally, a
`removeEnv` function can delete any defined environment.
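
To make the three selection rules concrete, the sketch below computes which
parent task IDs each rule would pick, using plain C++ with no MPI. The helper
names `contiguousTasks`, `stridedTasks`, and `isMemberTask` are hypothetical
and only illustrate the rules as described above; they are not part of
MachEnv.
```c++
#include <algorithm>
#include <cassert>
#include <vector>

// First interface: the first NewSize contiguous tasks of the parent.
std::vector<int> contiguousTasks(int NewSize) {
   assert(NewSize > 0);
   std::vector<int> Tasks(NewSize);
   for (int I = 0; I < NewSize; ++I)
      Tasks[I] = I;
   return Tasks;
}

// Second interface: NewSize tasks starting at Begin, separated by Stride.
std::vector<int> stridedTasks(int NewSize, int Begin, int Stride) {
   assert(NewSize > 0 && Begin >= 0 && Stride > 0);
   std::vector<int> Tasks(NewSize);
   for (int I = 0; I < NewSize; ++I)
      Tasks[I] = Begin + I * Stride;
   return Tasks;
}

// Membership test in the spirit of isMember(): true when the given parent
// task ID is one of the tasks selected for the new environment. The third
// interface corresponds to passing an explicit Tasks vector directly.
bool isMemberTask(const std::vector<int> &Tasks, int ParentTask) {
   return std::find(Tasks.begin(), Tasks.end(), ParentTask) != Tasks.end();
}
```
For the odd-task example in the text, `stridedTasks(4, 1, 2)` selects parent
tasks 1, 3, 5, and 7, so `isMemberTask` is true for task 3 and false for
task 2.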

Because the class is essentially a container for the environment parameters,
the implementation is simple: several scalar data members plus
the retrieval and creation functions noted above. To track all defined
environments, a C++ map container pairs a name with each
defined environment.
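
The name-to-environment bookkeeping could look roughly like the following
sketch. `EnvRecord` and `EnvRegistry` are hypothetical, trimmed-down
stand-ins; the real class may store and own its environments differently
(for example, in a static map inside MachEnv itself).
```c++
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical record holding a few of the scalar members described above.
struct EnvRecord {
   std::string Name;
   int NumTasks;
   int MasterTask;
};

// Hypothetical registry that pairs each name with one environment.
class EnvRegistry {
 public:
   // Add a new environment; returns false if the name is already in use.
   bool create(const std::string &Name, int NumTasks, int MasterTask = 0) {
      assert(NumTasks > 0);
      auto Result = AllEnvs.emplace(
          Name,
          std::make_unique<EnvRecord>(EnvRecord{Name, NumTasks, MasterTask}));
      return Result.second;
   }

   // Retrieve by name; returns nullptr when no such environment exists.
   EnvRecord *get(const std::string &Name) const {
      auto It = AllEnvs.find(Name);
      return It == AllEnvs.end() ? nullptr : It->second.get();
   }

   // Delete by name; returns true if an environment was removed.
   bool remove(const std::string &Name) { return AllEnvs.erase(Name) > 0; }

 private:
   std::map<std::string, std::unique_ptr<EnvRecord>> AllEnvs;
};
```
Storing the records behind pointers keeps any pointer handed out by `get`
valid while other environments are added or removed, which matches the
pointer-returning retrieval style shown earlier.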
(omega-user-mach-env)=

## Machine Environment (MachEnv)

Within OMEGA, many aspects of the machine environment are stored
in a class called MachEnv. These include message-passing parameters
like MPI communicators and task information, the number of threads
if threaded, vector length for CPUs, and GPU and node information as
needed. Multiple environments are supported in case portions of the
code need to run on subsets of tasks or in different contexts. A
default environment is defined early in the initialization of the
model and can be retrieved as described in the
[Developer's Guide](#omega-dev-mach-env).

The user is not expected to set any parameters in MachEnv. All
quantities are derived either from the job launch command
(e.g., mpirun or srun), which defines the number of MPI tasks and the task
layout, or from machine parameters enforced during the build based
on supported machine XML configurations.
The latter include the preprocessor parameters
`-DOMEGA_VECTOR_LENGTH=xx` and `-DOMEGA_THREADED`, which define an
optimal vector length for CPU code and turn on OpenMP threading
if desired.