diff --git a/components/omega/doc/design/IO.md b/components/omega/doc/design/IO.md new file mode 100644 index 000000000000..c8cd49bdced8 --- /dev/null +++ b/components/omega/doc/design/IO.md @@ -0,0 +1,367 @@ +(omega-design-IO)= + +# Input/Output (IO) + +## 1 Overview + +The OMEGA model must be able to read and write data to a filesystem. +For performance at high resolution, much of this IO must occur in parallel +and interact with a high-performance filesystem. We describe here an IO +layer that provides interfaces to the underlying SCORPIO parallel I/O +library used in E3SM. It primarily provides a translation layer to +read/write OMEGA metadata and YAKL arrays. It works together with the +OMEGA Metadata and IOStreams capabilities. Users will interact with +IO primarily through the IOStreams and should not need to access this +layer directly. + +## 2 Requirements + +### 2.1 Requirement: Read fields and metadata + +The model must be able to read desired data and metadata from +an input file into the internal model storage/decomposition. +Metadata requirements are described in the Metadata design document. + +### 2.2 Requirement: Write fields and metadata + +The model must be able to write desired data and metadata +to an output file for later use. Metadata requirements are +described in the Metadata design document. + +### 2.3 Requirement: Self-describing formats + +All files must (eventually) be in a self-describing +format like HDF5 or netCDF. Note that alternative formats +can be supported as long as scripts are provided to convert +to netCDF/HDF5 efficiently as part of model run scripts. +SCORPIO supports a number of underlying formats and the +user must be able to select the appropriate format (eg +multiple netcdf formats and ADIOS format). + +### 2.4 Requirement: multiple files/streams + +The model must support reading from and writing to an +arbitrary number of files with different properties +(eg precision, time frequencies, contents). 
There will +be a Streams capability described in a separate +design document that will manage many of these properties, +but the underlying I/O layer here must be able to +support multiple open files, each with unique +properties. + +### 2.5 Requirement: parallel I/O + +Performance at high resolution requires a parallel I/O +solution in which multiple processors can write and read +data to maximize bandwidth to the filesystem and minimize +time spent in I/O. + +### 2.6 Requirement: parallel I/O tuning + +Some properties of the parallel I/O must be configurable +to tune the parallel I/O for performance on a particular +architecture and filesystem. At a minimum, the user must +be able to specify the number of I/O tasks and the stride +of those tasks. + +### 2.7 Requirement: Data types and type conversion + +The I/O system must be able to read/write all supported +data types for metadata and all supported YAKL array types. +In some cases, output files at reduced precision are required +so an option to convert data to reduced precision is needed. + +### 2.8 Requirement: YAKL arrays and host/device support + +Distributed data in OMEGA is stored as YAKL array types. +We must be able to read/write YAKL arrays and be able +to move data between host and device as needed. + +### 2.9 Requirement: Modes on file existence + +If an output file exists, the user must be able to specify +whether the model should overwrite the existing file, exit +with an error message or append data to the existing file. +On input, if a file does not exist, the model must exit with +an appropriate error code. + +### 2.10 Desired: Asynchronous I/O + +For performance, it would be desirable to enable the model +to launch I/O tasks and resume computation while the actual +writing to file takes place. This will mask I/O costs by +overlapping I/O functions with model computations. 
+ +### 2.11 Desired: File compression + +It may be desirable to apply data compression while +performing I/O to minimize data file sizes. This compression +may be lossless or not depending on the use for each +file. + + +## 3 Algorithmic Formulation + +Parallel I/O systems use various algorithms for rearranging +data in parallel decompositions to the decomposition required +for the I/O processor layout. These algorithms are described +in the I/O library (SCORPIO) documentation and related +publications. + +## 4 Design + +The OMEGA model I/O will be built on top of the SCORPIO +parallel I/O library used across E3SM components. The +I/O interfaces here generally provide wrappers for +translating internal OMEGA metadata representations and +YAKL array types to the form required by SCORPIO. OMEGA +users and developers will generally interact with I/O +through the IO Streams layer that manages all files and +associated file contents. + +### 4.1 Data types and parameters + +#### 4.1.1 Parameters + +There will be a section in the input OMEGA configuration file +for managing overall parallel IO options. Currently, this +will include three variables: + +```yaml + IO: + ioTasks: 1 + ioStride: 1 + ioRearranger: box +``` + +The default for IO tasks and stride will be 1 for safety, but +should always be changed to appropriate values for a particular +simulation and machine architecture. For example, the user +should set the stride such that the number of cores performing +I/O is limited to the optimal number for a given multi-processor +node. SCORPIO supports a few rearranger methods for moving the +data to IO tasks (see SCORPIO documentation) but the box +rearranger is often preferred and is the default. The rearranger +is ignored for the ADIOS back-end since it writes separate files +from each task and rearranges the data in a post-processing step. +These input variables correspond to variables of the +same name. 
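As a rough illustration of how `ioTasks` and `ioStride` interact, the sketch below lists which MPI ranks would act as IO tasks. The helper `ioTaskRanks` is hypothetical and is not part of OMEGA or SCORPIO. It assumes the first IO task sits on rank 0 and that later IO tasks are spaced `ioStride` ranks apart; the design does not specify the base rank.

```c++
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an OMEGA or SCORPIO interface): list the MPI
// ranks that would act as IO tasks for a given task count and stride.
// Assumption: the first IO task is placed on rank 0.
std::vector<int> ioTaskRanks(int numRanks, int ioTasks, int ioStride) {
   if (numRanks < 1 || ioTasks < 1 || ioStride < 1)
      throw std::invalid_argument("numRanks, ioTasks, ioStride must be >= 1");

   // The last IO task lands on rank (ioTasks - 1) * ioStride, which must
   // be a valid rank in the communicator.
   const long long lastRank =
       static_cast<long long>(ioTasks - 1) * static_cast<long long>(ioStride);
   if (lastRank >= numRanks)
      throw std::invalid_argument("IO task layout does not fit in numRanks");

   std::vector<int> ranks;
   ranks.reserve(ioTasks);
   for (int i = 0; i < ioTasks; ++i)
      ranks.push_back(i * ioStride);
   return ranks;
}
```

For example, with 128 ranks, `ioTasks: 4` and `ioStride: 32`, this layout would place IO tasks on ranks 0, 32, 64 and 96, which corresponds to one IO task per node on a machine with 32 ranks per node.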
+ +We define a number of enums to support various options, including +the rearranger method, file format, operation (read/write) and +behavior when a file exists: + +```c++ + enum class IORearranger { + box, /// box rearranger (default) + subset, /// subset rearranger + unknown /// unknown or undefined + }; + + enum class IOFileFormat { + undefined, + default, + netcdf, + pnetcdf, + netcdf4, + pnetcdf5, + hdf5, + adios, + }; + + enum class IOMode { + read, + write, + }; + + enum class IOIfExists { + overwrite, + fail, + append, + }; + + enum class IOPrecision { + single, + double, + }; +``` + + +#### 4.1.2 Class/structs/data types + +The PIO routines require some information about the decomposition, +so there will be one class containing environment information including +array decompositions for each data type, mesh location and array size. + +```c++ + class IOEnv { + + private: + + /// track and store all defined environments internally + static std::vector<std::shared_ptr<IOEnv>> definedEnvs; + + /// name for this environment + std::string name; + + /// default file format and rearranger method + IOFileFormat format; + IORearranger rearranger; + + /// ptr to defined Scorpio IO system with MPI communicator info + int* iosystem; + + /// pointers to defined decompositions for every multi-dim + /// array, data type and cell location + int* decompCell1DI4; + int* decompCell1DR4; + int* decompCell1DR8; + int* decompCell2DI4; + int* decompCell2DR4; + int* decompCell2DR8; + // continue for dimensions up to 5 and edge, vertex + // mesh locations + // note that Scorpio does not support logical arrays + // or I8 types so these will be converted to + // an appropriate type during read/write + + // Methods below will be private and only accessed + // via the OMEGA IOStreams interfaces with IOStreams + // as a friend class + + public: + // Users will not access methods or vars directly + // but only through the IOStreams interfaces + + friend class IOStreams; + }; +``` + +There will also be an IOField class to 
combine the Metadata +and a pointer to the array holding data. This will be used +to specify the fields in an IOStream. + +```c++ + class IOField { + + private: + + /// track and store all defined fields internally + static std::vector<std::shared_ptr<IOField>> definedFields; + + std::shared_ptr metaPtr; + + /// only one of the pointers will be defined based on array type + std::shared_ptr data1DI4; + [replicated up to 5D arrays of I4,I8,R4,R8] + std::shared_ptr dataHost1DI4; + [replicated up to 5D arrays of I4,I8,R4,R8] + + + public: + // See methods below + + friend class IOStreams; + }; +``` + + +### 4.2 Methods + +As noted above, these methods are actually private and accessed +only through the IOStreams interfaces. + +#### 4.2.1 Environment constructor + +There will be a constructor that initializes the IO system +and decompositions. Note that the defaults for file format +and rearranger can be overridden on a per-file basis. + +```c++ + IOEnv(const std::string name, /// name for this env + const MachEnv omegaEnv, /// machine env with MPI info + const Decomp omegaDecomp, /// mesh decomposition + const int ioTasks, /// number of IO tasks + const int ioStride, /// stride in MPI ranks for io tasks + const IOFileFormat format, /// default IO format + const IORearranger rearranger /// default rearranger method + ); +``` + +#### 4.2.2 File open/close + +There will be interfaces for opening and closing files for reading and writing. +For the file open, a file id will be returned. + +```c++ + /// Opens a file for reading or writing, returns an error code. + /// If the IOFileFormat argument is `default`, the default + /// format from IOEnv will be used. `ifexists` will be ignored for + /// reads. 
+ int IOFileOpen(int& fileID, /// returned fileID for this file + const std::string filename, /// name of file to open + const IOEnv& myEnv, /// overall IO environment + const IOMode mode, /// mode (read or write) + const IOPrecision precision, /// precision of floats + const IOIfExists ifexists, /// behavior if file exists + const IOFileFormat format /// file format + ); + + /// closes an open file using the fileID, returns an error code + int IOFileClose(int& fileID /// ID of the file to be closed + ); + +``` + +#### 4.2.3 Write operations + +Because the IOStreams and metadata manager have aggregated all +metadata, dims and pointers to data arrays, a file can be written +with a single call using the aggregated data from the IOStream: + +```c++ + /// writes all data associated with the file + int IOWrite(const int fileID, /// id for the open file + const std::vector<std::shared_ptr<IOField>> contents); +``` + +#### 4.2.4 Read operations + +Unlike a write, a read may not need all of the data within a file, +so we read each field individually: + +```c++ + /// reads a field from the file + int IORead(const int fileID, /// id for the open file + std::shared_ptr<IOField> field); +``` + +#### 4.2.5 Defining IO fields + +Each field meant to be read or written should be defined, typically +at the same time a module defines the associated metadata. The IOField +is first constructed with a pointer to the Metadata: + +```c++ + /// IOField constructor + IOField(std::shared_ptr fieldMeta); +``` + +Then a pointer to the array is attached using an attach function +(aliased to the various array types): + +```c++ + /// IOField attach data + IOField::attachData(std::shared_ptr); + [replicate generic interface for all supported array types] +``` + + +## 5 Verification and Testing + +### 5.1 Test via IOStreams + +Because the functions here are only accessed via the IOStreams interfaces, +the testing of these routines will be part of the IOStreams unit test. 
+ - tests requirements 2.1-2.9 + + + diff --git a/components/omega/doc/design/IOStreams.md b/components/omega/doc/design/IOStreams.md new file mode 100644 index 000000000000..99aa90132ebf --- /dev/null +++ b/components/omega/doc/design/IOStreams.md @@ -0,0 +1,342 @@ + +(omega-design-IOStreams)= + +# IOStreams + +## 1 Overview + +OMEGA must be able to read and write data throughout a simulation. +Data can be read or written either once or periodically. Files +will contain different fields to read/write. We consider each file +or file sequence an IO stream with its own unique set of properties. +We describe here the requirements and design of these IO streams. +The design relies on companion designs for managing metadata and +lower-level IO functions for writing metadata and data stored in +YAKL arrays. + +## 2 Requirements + +### 2.1 Requirement: Multiple streams + +An arbitrary number of input and output streams, each with its own +properties and precision, must be supported. + +### 2.2 Requirement: Reduced precision + +For some output, it is not necessary to retain the full precision +that the model supports. Reduced precision for each stream must +be supported to reduce file size when full precision is not +required. + +### 2.3 Requirement: Contents for each stream + +Users must be able to supply a list of fields to be included in +each stream. This contents list should be checked against +available fields (from OMEGA defined metadata and metadata groups) +and the model should exit if a field is not available. + +### 2.4 Requirement: Exact restart + +There must be one stream for checkpointing and restarting that +must include all fields needed to exactly restart the model, +maintaining bit-for-bit agreement when compared to a non-restarted simulation +over the same time interval. This also generally requires full +precision for all fields. 
+ +### 2.5 Requirement: Restart pointer + +To avoid the need to modify configuration files with a new +input filename on every restart, a mechanism for determining +the last successful restart is needed. This is often implemented +by writing the name of the last successful restart file to a +standard location (pointer file) so that the model can read the +name of the last restart and continue the simulation. + +### 2.6 Requirement: Time intervals + +The user must be able to specify a time interval for any +repeating input and output streams (eg monthly output). +The time interval can be any interval supported by the OMEGA +time manager. + +### 2.7 Requirement: Optional start/stop times + +For some streams, we may wish to supply a start and stop +time to sample the simulation only over a particular time +period. For example, requesting high-frequency output over a +short interval. The user must be able to optionally request +a start and stop time for the stream. + +### 2.8 Requirement: Filenames + +The user must be able to supply a filename associated with the +stream. For repeating I/O, the filename will be a template +that will specify how the filename will be modified to reflect +the time associated with the output. The template should +be able to support any conventions (eg E3SM) required. Filenames +should include the full path. + +### 2.9 Requirement: Multiple time slices per file + +Streams must be able to support multiple time instances for +the requested fields in a single file. For example, monthly +forcing inputs may be provided in a single input file. Similarly, +for higher-frequency output, it may be desirable to include +several intervals in a single file. An appropriate filename +convention/template must also be provided for this situation. 
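A filename template of the kind required above could be expanded along the lines of the sketch below. It assumes the `$Y`, `$M`, `$D`, `$h`, `$m` and `$s` tokens used in the stream configuration examples of this design, a four-digit zero-padded year, and two-digit zero-padded values for the other tokens. `SimpleTime` and `expandTemplate` are hypothetical names for illustration only; in OMEGA the time values would come from the time manager.

```c++
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical date/time holder used only for this illustration.
struct SimpleTime {
   int year, month, day, hour, minute, second;
};

// Replace $Y, $M, $D, $h, $m, $s tokens with zero-padded values.
// Assumption: 4 digits for the year, 2 digits for everything else.
std::string expandTemplate(const std::string &tmpl, const SimpleTime &t) {
   std::string out;
   for (std::size_t i = 0; i < tmpl.size(); ++i) {
      // Copy ordinary characters (and a trailing '$') unchanged
      if (tmpl[i] != '$' || i + 1 == tmpl.size()) {
         out += tmpl[i];
         continue;
      }
      int value = 0;
      int width = 2;
      switch (tmpl[i + 1]) {
      case 'Y': value = t.year; width = 4; break;
      case 'M': value = t.month; break;
      case 'D': value = t.day; break;
      case 'h': value = t.hour; break;
      case 'm': value = t.minute; break;
      case 's': value = t.second; break;
      default:
         out += '$'; // unknown token: keep the '$' and move on
         continue;
      }
      char buf[16];
      std::snprintf(buf, sizeof(buf), "%0*d", width, value);
      out += buf;
      ++i; // skip the token letter that was just expanded
   }
   return out;
}
```

A template that omits the finer-grained tokens (for example only `$Y-$M`) maps several output times to the same filename, which is one way to collect multiple time slices in a single file.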
+ +### 2.10 Requirement: Behavior on file existence + +To support multiple time slices per file and to support cases +where the model is re-run over the same interval (e.g. during +testing or repeating a failed run), the user must specify the +desired behavior when an output file of the same name already +exists. A minimum set of options should include over-writing +(replacing) the file, appending to the existing file, or exiting +with an error message. + +### 2.11 Desired: Time averaging + +In some cases, it is desirable to accumulate the time average of +selected fields during the simulation in order to sample every +timestep (or other frequency higher than the output interval). While +the time-averaging capability itself will be implemented elsewhere +in the code, time-averaging over intervals longer than the typical +checkpoint/restart interval would require additional restart +capabilities to maintain these time averages and a filename +template should describe the averaging interval in some format. + +### 2.12 Desired: Data compression + +In the future, it may be possible to apply lossy or lossless +compression on data to save space. An option to compress +data would be desirable if it can be supported. Like +reduced precision, this option would probably be on a per-stream +basis. + +## 3 Algorithmic Formulation + +No algorithms are needed. Most functions are carried out by other +modules and libraries. + +## 4 Design + +IOStreams will be the primary mechanism used for all input and +output. The streams will be defined in a streams section of the input +Omega configuration file. Because YAML uses some common characters +to denote lists, etc., filenames should be in quotes. For example: + +```yaml + IOStreams: + + mesh: + mode: read + name: '/mypath/meshFileName' + freqUnits: initial + freq: 1 + precision: double + contents: + - GRPmesh + - [other fields?] 
+ + restart-read: + mode: read + name: '/mypath/omega-restart.$Y-$M-$D_$h.$m.$s.nc' + pointerFilename: '/mypath/pointerFileName' + freqUnits: initial + freq: 1 + precision: double + contents: + - GRPrestart + - [other fields or groups as needed] + + restart-write: + mode: write + name: '/mypath/omega-restart.$Y-$M-$D_$h.$m.$s.nc' + pointerFilename: '/mypath/pointerFileName' + freqUnits: years + freq: 1 + precision: double + ifExists: overwrite + contents: + - GRPrestart + - [other fields or groups as needed] + + history: + mode: write + name: '/mypath/omega-history.$Y-$M-$D_$h.$m.$s.nc' + freqUnits: months + freq: 1 + precision: single + ifExists: overwrite + contents: + - temperature + - salinity + - normalVelocity + - [other fields or groups as needed] + + # 10-day high-freq data stored in an annual file + highFreq: + mode: write + name: '/mypath/omega-highfreq.$Y-$M.nc' + freqUnits: days + freq: 10 + precision: single + ifExists: append + startDate: yyyy-mm-dd + endDate: yyyy-mm-dd + contents: + - temperature + - salinity + - normalVelocity + - [other fields or groups as needed] +``` + +Note that there is a stdlib iostreams so our use of IOStreams runs +the risk of inadvertent conflict. Use of both the Omega namespace and +the capitalized IOStreams will be used to distinguish the two. + +### 4.1 Data types and parameters + + +#### 4.1.1 Parameters + +There are no parameters specific to the IOStreams, though they will +utilize some parameters (like precision and other options) from a +lower-level IO module. + +#### 4.1.2 Class/structs/data types + +The main class is an IOStream that carries all the information +related to each input or output stream. Like other classes, all +instantiations of streams are managed within the class. 
+ +```c++ + class IOStream { + + private: + std::string name; /// name of stream + std::string filename; /// filename or template + + IOMode mode; /// mode (read or write) + IOPrecision precision; /// precision for floating point vars + Alarm sAlarm; /// time mgr alarm for read/write + + bool usePointer; /// flag for using a pointer file + std::string ptrFilename; /// name of pointer file + + bool useStartEnd; /// flag for using start, end times + TimeInstant startTime; /// start time for stream + TimeInstant endTime; /// end time for stream + + /// Contents of stream in the form of a vector of + /// pointers to defined IOFields + std::vector<std::shared_ptr<IOField>> contents; + + /// Store and maintain all defined streams in this vector + static int numStreams; + static std::vector<std::shared_ptr<IOStream>> allStreams; + + public: + [see methods described below] + }; +``` + +### 4.2 Methods + +### 4.2.1 Create/Destroy + +There will be a constructor that creates a stream and fills the +scalar variables, and a destructor that deletes the stream and removes +it from the list of defined streams. + +```c++ + IOStream(int& streamID, /// id of created stream + const std::string name, /// name of stream + const std::string filename, /// file name or template + const IOMode mode, /// read/write mode + const IOPrecision precision, /// precision for floats + const IOIfExists ifExists, /// action if file exists + const std::string freqUnits, /// time frequency for I/O + const int freq, /// freq in above units for I/O + const std::string pointerFile, /// pointer filename if used + const std::string startDate, /// optional start date string + const std::string endDate /// optional end date string + ); + + ~IOStream(); +``` + +### 4.2.2 Add fields + +Once an IOStream is created, the contents must be added using one of +two interfaces. If the streamID is still available, the first interface +can be used and is faster (the streamID is the index into the vector +of defined streams). 
Otherwise, an interface that takes the +stream name can be used. An error code is returned. + +```c++ + int IOStreams::AddField( + const int streamID, /// id of stream to be modified + const std::string fieldName /// name of field + ); + + int IOStreams::AddField( + const std::string streamName, /// name of stream + const std::string fieldName /// name of field + ); +``` + +### 4.2.3 Write streams + +Generally, we want to write all streams at the end of the timestep +if it's time to write, so there will be one interface to simply +check the alarms for all streams and write the data if it's time. +After writing, the alarm will be reset. + +```c++ + int IOStreams::WriteAll(); +``` + +However, if the user needs to write a stream from elsewhere in +the code, there will be an interface to write a single stream +by either streamID or by stream name after checking the alarm to +see if it's time to write. The alarm will be reset once the writing +is complete. + +```c++ + int IOStreams::Write(const int streamID /// id of stream to be written + ); + + int IOStreams::Write(const std::string streamName /// name of stream + ); +``` + +### 4.2.4 Read streams + +Unlike writing, each input stream is typically read from the +appropriate module, so only single-stream reads are provided and +with only the name interface since the id may no longer be +accessible by the calling routine. + +```c++ + int IOStreams::Read(const std::string streamName /// name of stream + ); + +``` + + +## 5 Verification and Testing + +### 5.1 Test All + +A test driver and configuration file will create a number of streams +with fields, frequencies and other options that attempt to span all +possible configurations (though it may not be able to test all file +formats). This driver will march in time to write streams at the +requested frequencies. At the end of the driver, input streams that +mirror the output streams will be read and the fields compared to +determine if they match. 
+ - tests requirements 2.1-2.9 + +
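The comparison step in the test driver has to account for streams written at reduced precision, since a field stored as single precision will not match the in-memory double-precision values bit for bit. The sketch below shows one possible comparison. `TestPrecision` and `fieldsMatch` are hypothetical names standing in for whatever the unit test would actually use, and plain `std::vector<double>` stands in for the YAKL host arrays.

```c++
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the IOPrecision option of a stream.
enum class TestPrecision { Single, Double };

// Compare a field as written with the same field read back from file.
// Double-precision streams must match exactly; single-precision streams
// must agree to within the rounding error of a float conversion.
bool fieldsMatch(const std::vector<double> &written,
                 const std::vector<double> &readBack,
                 TestPrecision precision) {
   if (written.size() != readBack.size())
      return false;

   for (std::size_t i = 0; i < written.size(); ++i) {
      if (precision == TestPrecision::Double) {
         if (written[i] != readBack[i])
            return false;
      } else {
         // Relative tolerance of one float epsilon, which bounds the
         // rounding error of a double-to-float conversion for values
         // in the normal float range.
         const double tol =
             std::numeric_limits<float>::epsilon() * std::fabs(written[i]);
         if (std::fabs(written[i] - readBack[i]) > tol)
            return false;
      }
   }
   return true;
}
```

The restart stream would use the exact comparison, because the exact restart requirement calls for bit-for-bit agreement.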