Merge pull request #2005 from jedwards4b/external_process_interface

External process interface

Improve interface to external scripts PRERUN_SCRIPT, POSTRUN_SCRIPT, DATA_ASSIMILATION_SCRIPT. If DATA_ASSIMILATION is true but no script is named throw an error, allow scripts in python to be called directly instead of through shell. Improve handling of arguments and log files.

Test suite: scripts_regression_tests.py, hand testing
Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes #1953

User interface changes?:
Update gh-pages html (Y/N)?: Y

Code review:
ESMCI · Nov 6, 2017 · 0027c84
2 parents a84b875 + 5b57163
commit 0027c84
Showing 5 changed files with 132 additions and 69 deletions.
diff --git a/doc/source/users_guide/running-a-case.rst b/doc/source/users_guide/running-a-case.rst
@@ -12,16 +12,16 @@ Calling **case.submit**
 
 Before you submit the case using **case.submit**, make sure
 the batch queue variables are set correctly for your run
-Those variables are contained in the file **$CASEROOT/env_batch.xml** 
-under the XML ``<group id="case.run">`` and ``<group id="case.st_archive">`` 
-elements. 
+Those variables are contained in the file **$CASEROOT/env_batch.xml**
+under the XML ``<group id="case.run">`` and ``<group id="case.st_archive">``
+elements.
 
 Make sure that you have appropriate account numbers (``PROJECT``), time limits
 (``JOB_WALLCLOCK_TIME``), and queue (``JOB_QUEUE``) for those groups.
 
 Also modify **$CASEROOT/env_run.xml** for your case using :ref:`xmlchange<modifying-an-xml-file>`.
 
-Once you have executed **case.setup** and **case.build**, run **case.submit** 
+Once you have executed **case.setup** and **case.build**, run **case.submit**
 to submit the run to your machine's batch queue system.
 ::
 
@@ -40,7 +40,7 @@ When called, the **case.submit** script will:
 
 - Run **preview_namelist**, which in turn will run each component's **buildnml**.
 
-- Run :ref:`check_input_data<input_data>` to verify that the required 
+- Run :ref:`check_input_data<input_data>` to verify that the required
   data are present.
 
 - Submit the job to the batch queue. which in turn will run the **case.run** script.
@@ -52,7 +52,7 @@ Upon successful completion of the run, **case.run** will:
 
 - Copy log files back to ``$LOGDIR``.
 
-- Submit the short-term archiver script **case.st_archive** 
+- Submit the short-term archiver script **case.st_archive**
   to the batch queue if ``$DOUT_S`` is TRUE.
 
 - Resubmit **case.run** if ``$RESUBMIT`` > 0.
@@ -95,21 +95,21 @@ messages:
 .. note::
   After a successful first run, set the **env_run.xml** variable
   ``$CONTINUE_RUN`` to ``TRUE`` before resubmitting or the job will not
-  progress. 
-  
+  progress.
+
   You may also need to modify the **env_run.xml** variables
   ``$STOP_OPTION``, ``$STOP_N`` and/or ``$STOP_DATE`` as well as
   ``$REST_OPTION``, ``$REST_N`` and/or ``$REST_DATE``, and ``$RESUBMIT``
   before resubmitting.
 
-See :ref:`the basic example<use-cases-basic-example>` for a complete example 
+See :ref:`the basic example<use-cases-basic-example>` for a complete example
 of how to run a case.
 
 ---------------------------------
 Troubleshooting a job that fails
 ---------------------------------
 
-There are several places to look for information if a job fails. 
+There are several places to look for information if a job fails.
 Start with the **STDOUT** and **STDERR** file(s) in **$CASEROOT**.
 If you don't find an obvious error message there, the
 **$RUNDIR/$model.log.$datestamp** files will probably give you a
@@ -126,14 +126,14 @@ problems<troubleshooting>` for more information.
 Input data
 ====================================================
 
-The **check_input_data** script determines if the required data files 
-for your case exist on local disk in the appropriate subdirectory of 
+The **check_input_data** script determines if the required data files
+for your case exist on local disk in the appropriate subdirectory of
 ``$DIN_LOC_ROOT``. It automatically downloads missing data.
 
 The required input data sets needed for each component are found in the
-**$CASEROOT/Buildconf** directory. These files are generated by a call 
-to **preview_namlists** and are in turn created by each component's 
-**buildnml** script. For example, for compsets consisting only of data 
+**$CASEROOT/Buildconf** directory. These files are generated by a call
+to **preview_namlists** and are in turn created by each component's
+**buildnml** script. For example, for compsets consisting only of data
 models (``A`` compsets), the following files are created:
 ::
 
@@ -163,12 +163,12 @@ Controlling starting, stopping and restarting a run
 ====================================================
 
 The file **env_run.xml** contains variables that may be modified at
-initialization or any time during the course of a model run. Among 
-other features, the variables comprise coupler namelist settings for 
-the model stop time, restart frequency, coupler history frequency, and 
+initialization or any time during the course of a model run. Among
+other features, the variables comprise coupler namelist settings for
+the model stop time, restart frequency, coupler history frequency, and
 a flag to determine if the run should be flagged as a continuation run.
 
-At a minimum, you will need to set the variables ``$STOP_OPTION`` and 
+At a minimum, you will need to set the variables ``$STOP_OPTION`` and
 ``$STOP_N``. Other driver namelist settings then will have consistent and
 reasonable default values. The default settings guarantee that
 restart files are produced at the end of the model run.
@@ -203,10 +203,10 @@ The case initialization type is set using the ``$RUN_TYPE`` variable in
 
 ``startup``
   In a startup run (the default), all components are initialized using
-  baseline states. These states are set independently by each component 
-  and can include the use of restart files, initial  files, external 
+  baseline states. These states are set independently by each component
+  and can include the use of restart files, initial  files, external
   observed data files, or internal initialization (that is, a "cold start").
-  In a startup run, the coupler sends the start date to the components 
+  In a startup run, the coupler sends the start date to the components
   at initialization. In addition, the coupler does not need an input data file.
   In a startup initialization, the ocean model does not start until the second
   ocean coupling step.
@@ -231,14 +231,14 @@ The case initialization type is set using the ``$RUN_TYPE`` variable in
   type of run. ``$RUN_REFCASE`` and ``$RUN_REFDATE`` are required for
   branch runs. To set up a branch run, locate the restart tar file or
   restart directory for ``$RUN_REFCASE`` and ``$RUN_REFDATE`` from a
-  previous run, then place those files in the ``$RUNDIR``  directory. 
+  previous run, then place those files in the ``$RUNDIR``  directory.
   See :ref:`setting up a branch
   run<setting-up-a-branch-run>`.
 
 ``hybrid``
   A hybrid run is initialized like a startup but it uses
   initialization data sets from a previous case. It is similar
-  to a branch run with relaxed restart  constraints. 
+  to a branch run with relaxed restart  constraints.
   A hybrid run allows users to bring together
   combinations of initial/restart files from a previous case
   (specified by ``$RUN_REFCASE``) at a given model output date
@@ -259,10 +259,10 @@ run, the ``$CONTINUE_RUN`` variable is set to TRUE, and the model
 restarts exactly using input files in a case, date, and bit-for-bit
 continuous fashion.
 
-The variable ``$RUN_STARTDATE`` is the start date (in yyyy-mm-dd format) 
-for either a startup run or a hybrid run. If the run is targeted to be 
+The variable ``$RUN_STARTDATE`` is the start date (in yyyy-mm-dd format)
+for either a startup run or a hybrid run. If the run is targeted to be
 a hybrid or branch run, you must specify values for ``$RUN_REFCASE`` and
-``$RUN_REFDATE``. 
+``$RUN_REFDATE``.
 
 .. _controlling-output-data:
 
@@ -303,13 +303,13 @@ Also:
 
 - Users generally should turn off short-term archiving when developing new code.
 
-Standard output generated from each component is saved in ``$RUNDIR`` 
-in a  *log file*. Each time the model is run, a single coordinated datestamp 
+Standard output generated from each component is saved in ``$RUNDIR``
+in a  *log file*. Each time the model is run, a single coordinated datestamp
 is incorporated into the filename of each output log file.
 The run script generates the datestamp in the form YYMMDD-hhmmss, indicating
 the year, month, day, hour, minute and second that the run began
-(ocn.log.040526-082714, for example). Log files are copied to a user-specified 
-directory using the variable ``$LOGDIR`` in **env_run.xml**. The default is a "logs" 
+(ocn.log.040526-082714, for example). Log files are copied to a user-specified
+directory using the variable ``$LOGDIR`` in **env_run.xml**. The default is a "logs"
 subdirectory in the **$CASEROOT** directory.
 
 By default, each component also periodically writes history files
@@ -339,23 +339,23 @@ for a description of output data filenames.
 Restarting a run
 ======================
 
-Active components (and some data components) write restart files 
+Active components (and some data components) write restart files
 at intervals that are dictated by the driver via the setting of the
 ``$REST_OPTION`` and ``$REST_N`` variables in **env_run.xml**. Restart
 files allow the model to stop and then start again with bit-for-bit
 exact capability; the model output is exactly the same as if the model
 had not stopped. The driver coordinates the writing of restart
 files as well as the time evolution of the model.
 
-Runs that are initialized as branch or hybrid runs require 
-restart/initial files from previous model runs (as specified by the 
+Runs that are initialized as branch or hybrid runs require
+restart/initial files from previous model runs (as specified by the
 variables ``$RUN_REFCASE`` and ``$RUN_REFDATE``). Pre-stage these
-iles to the case ``$RUNDIR`` (normally ``$EXEROOT/run``) 
-before the model run starts. Normally this is done by copying the contents 
+iles to the case ``$RUNDIR`` (normally ``$EXEROOT/run``)
+before the model run starts. Normally this is done by copying the contents
 of the relevant **$RUN_REFCASE/rest/$RUN_REFDATE.00000** directory.
 
 Whenever a component writes a restart file, it also writes a restart
-pointer file in the format **rpointer.$component**. Upon a restart, each 
+pointer file in the format **rpointer.$component**. Upon a restart, each
 component reads the pointer file to determine which file to read in
 order to continue the run. These are examples of pointer files created
 for a component set using full active model components.
@@ -382,27 +382,27 @@ Backing up to a previous restart
 ---------------------------------
 
 If a run encounters problems and crashes, you will normally have to
-back up to a previous restart. If short-term archiving is enabled, 
+back up to a previous restart. If short-term archiving is enabled,
 find the latest **$DOUT_S_ROOT/rest/yyyy-mm-dd-ssss/** directory
 and copy its contents into your run directory (``$RUNDIR``).
 
-Make sure that the new restart pointer files overwrite older files in 
-in ``$RUNDIR`` or the job may not restart in the correct place. You can 
+Make sure that the new restart pointer files overwrite older files in
+in ``$RUNDIR`` or the job may not restart in the correct place. You can
 then continue the run using the new restarts.
 
 Occasionally, when a run has problems restarting, it is because the
-pointer and restart files are out of sync. The pointer files 
-are text files that can be edited to match the correct dates 
+pointer and restart files are out of sync. The pointer files
+are text files that can be edited to match the correct dates
 of the restart and history files. All of the restart files should
 have the same date.
 
 ============================
 Archiving model output data
 ============================
 
-When a job has run successfully, the component log files are copied 
-to the directory specified by the **env_run.xml** variable ``$LOGDIR``, 
-which is set to **$CASEROOT/logs** by default. If the job aborts, log 
+When a job has run successfully, the component log files are copied
+to the directory specified by the **env_run.xml** variable ``$LOGDIR``,
+which is set to **$CASEROOT/logs** by default. If the job aborts, log
 files are NOT be copied out of the ``$RUNDIR`` directory.
 
 The output data flow from a successful run depends on whether or not
@@ -421,7 +421,7 @@ Short-term archiving
 
 If short-term archiving is enabled, component output files are moved
 to the short-term archiving area on local disk, as specified by
-``$DOUT_S_ROOT``. The directory normally is **$EXEROOT/../archive/$CASE.** 
+``$DOUT_S_ROOT``. The directory normally is **$EXEROOT/../archive/$CASE.**
 and has the following directory structure: ::
 
    rest/yyyy-mm-dd-sssss/
@@ -444,7 +444,7 @@ The **rest/** subdirectory contains a subset of directories that each contains
 a *consistent* set of restart files, initial files and rpointer
 files. Each subdirectory has a unique name corresponding to the model
 year, month, day and seconds into the day when the files were created.
-The contents of any restart directory can be used to create a branch run 
+The contents of any restart directory can be used to create a branch run
 or a hybrid run or to back up to a previous restart date.
 
 ---------------------
@@ -457,4 +457,56 @@ long-term archiver tool that supported mass tape storage and HPSS systems.
 However, with the industry migration away from tape archives, it is no longer
 feasible for CIME to support all the possible archival schemes available.
 
+===============================================
+Data Assimilation and other External Processing
+===============================================
+
+CIME provides a capability to run a task on the compute nodes either
+before or after the model run.  CIME also provides a data assimilation
+capability which will cycle the model and then a user-defined task for
+a user-determined number of cycles.
+
+
+------------------------
+Pre and Post run scripts
+------------------------
+
+Variables ``PRERUN_SCRIPT`` and ``POSTRUN_SCRIPT`` can each be used to name
+a script which should be executed immediately prior to starting or
+following completion of the CESM executable within the batch
+environment.  The script is expected to be found in the case directory
+and will receive one argument which is the full path to that
+directory.  If the script is written in python and contains a
+subroutine with the same name as the script, it will be called as a
+subroutine rather than as an external shell script.
+
+-------------------------
+Data Assimilation scripts
+-------------------------
+
+Variables ``DATA_ASSIMILATION``, ``DATA_ASSIMILATION_SCRIPT``, and
+``DATA_ASSIMILATION_CYCLES`` may also be used to externally control
+model evolution.  If ``DATA_ASSIMILATION`` is true, then after the model
+completes the ``DATA_ASSIMILATION_SCRIPT`` will be run and then the
+model will be started again ``DATA_ASSIMILATION_CYCLES`` times.  The
+script is expected to be found in the case directory and will receive
+two arguments, the full path to that directory and the cycle number.
+If the script is written in python and contains a subroutine with the
+same name as the script, it will be called as a subroutine rather than
+as an external shell script.
+
+A simple example pre run script:
+
+::
+
+   #!/usr/bin/env python
+   import sys
+   from CIME.case import Case
+
+   def myprerun(caseroot):
+       with Case(caseroot) as case:
+           print("rundir is ", case.get_value("RUNDIR"))
 
+   if __name__ == "__main__":
+       caseroot = sys.argv[1]
+       myprerun(caseroot)
diff --git a/scripts/lib/CIME/SystemTests/dae.py b/scripts/lib/CIME/SystemTests/dae.py
@@ -7,6 +7,7 @@
 import os.path
 import logging
 import glob
+import gzip
 
 import CIME.XML.standard_module_setup as sms
 from CIME.SystemTests.system_tests_compare_two import SystemTestsCompareTwo
@@ -93,7 +94,7 @@ def run_phase(self): # pylint: disable=arguments-differ
         for fname in da_files:
             found_caseroot = False
             found_cycle = False
-            with open(fname) as dfile:
+            with gzip.open(fname, "r") as dfile:
                 for line in dfile:
                     expect(line[0:5] != 'ERROR', "ERROR, error line found in {}".format(fname))
                     if line[0:8] == 'caseroot':