Clean up input-data directories; Fix MERRA2 input; Add tests on Jet; Update CICE cap & fix time manager (was PR#664) #639
Nice cleanup work! Good to see that all these tests that didn't work on Cheyenne or Jet in the past are now ok. We fixed a lot of bugs in the past few months, it seems.
Machine: jet
Machine: cheyenne
@@ -36,7 +36,7 @@ export JNPES=6
 export WARM_START=.T.
 export NGGPS_IC=.F.
 export EXTERNAL_IC=.F.
-# DH* The correct setting would be .F.? However the official
+# DH* The correct setting would be .F.? However the official
 # regression test baseline uses MAKE_NH=.T.
 #export MAKE_NH=.F.
 export MAKE_NH=.T.
So you didn't change MAKE_NH to .F.? Maybe we need to change it in the next PR, when the baseline is updated.
I tried setting it to .F., created a new baseline, and then compared against it using the tests as-is (changing only this value). It didn't reproduce, and I didn't understand how the test was being set up, so I left it.
The regional control test uses FMS, so its history files contain fields at the previous output time that do not exist in the restart run's history files. Did you compare the restart files from those two runs?
I think all I did was change MAKE_NH, create a new baseline, and then run the regional_control and regional_restart tests against that baseline.
Thanks for the clean up!
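The Jet baselines were all created except for control_csawmg, which is failing after 5 hours with
FATAL from PE 140: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability
This test was previously running on Jet, albeit with the incorrect Merra input.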
Maybe compare the control_csawmg test with the same test in the current
develop branch? I think there should be no change for that test.
…On Mon, Jul 19, 2021 at 5:14 PM Denise Worthen ***@***.***> wrote:
The jet baselines all created except for control_csawmg, which is failing
after 5 hours with
FATAL from PE 140: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability
This test was previously running on Jet.
In my earlier testing, all the csawmg tests changed because the MERRA2 input changed. I don't understand the test name, but it has USE_MERRA2=.T.
I think I found the issue. The two tests control_csawmg and control_csawmgt have different IAER values (1111 and 111). Both set USE_MERRA2=.T. and use the MERRA2 data in /scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/input-data-20210614/FV3_input_data_INCCN_aeroclim/MERRA2. Now only IAER=1011 (the control_merra2 related tests) is using MERRA2 data. I think we still need the USE_MERRA2=.T. logic to set up aerosol data under the MERRA2 directory for all those tests.
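The mismatch described above can be illustrated with a small sketch. The two function names below are hypothetical and are not taken from the regression-test scripts; only `USE_MERRA2`, `IAER`, and the test names come from the discussion. The sketch assumes the staging decision returns a simple `stage`/`skip` answer.

```shell
# Hypothetical: stage MERRA2 data only when IAER=1011,
# which is the behavior the comment above describes as the problem.
stage_by_iaer() {
  local iaer="$1"
  if [ "$iaer" = "1011" ]; then
    echo "stage"
  else
    echo "skip"
  fi
}

# Hypothetical: stage MERRA2 data whenever USE_MERRA2=.T.,
# regardless of the IAER value (the suggested behavior).
stage_by_flag() {
  local use_merra2="$1"
  if [ "$use_merra2" = ".T." ]; then
    echo "stage"
  else
    echo "skip"
  fi
}

# Both csawmg tests set USE_MERRA2=.T. but use IAER values other than 1011.
echo "control_csawmg  (IAER=1111): by IAER -> $(stage_by_iaer 1111), by flag -> $(stage_by_flag .T.)"
echo "control_csawmgt (IAER=111):  by IAER -> $(stage_by_iaer 111), by flag -> $(stage_by_flag .T.)"
```

Keying the decision on the flag rather than on one IAER value means tests with other IAER settings still get the aerosol data they request.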
For the cheyenne.intel wave_p7b test failure, I tried multiple times but got the same I then made a non-wave p7b test in order to put it in debug mode. That test also fails slightly earlier (in |
I didn't change anything in the csawmg tests. I only added them to cheyenne.
Sorry, I thought the USE_MERRA2 was removed from control_run.IN. It is OK to use if IAER=1011 in cpld_bmark_tiled_run.IN for now, but in general other IAER options are using MERRA2 data too.
* remove csawmg test on jet in rt.conf; this test fails with FATAL from PE 140: NaN in input field of mpp_reproducing_sum(_2d), this indicates numerical instability
* repeat of control_2threads test which timed out on first verification run
…/ufs-weather-model into feature/updateBMIC
* job fails at startup with message MPT: shepherd terminated: r5i4n4.ib0.cheyenne.ucar.edu - job aborting
All platforms are now complete and I think we can start updating the submodules.
## DESCRIPTION OF CHANGES:
1. Add a new experiment configuration variable named `DEBUG` to enable more in-depth debugging output from workflow scripts. Set default value of `DEBUG` in `config_defaults.sh` to `"FALSE"`.
2. In experiment generation scripts, change circumstances under which different messages are printed to screen (e.g. when `VERBOSE` is `"TRUE"`, when `DEBUG` is `"TRUE"`, or always).
3. In experiment generation scripts, for clarity add new informational messages and modify some existing ones.
4. In various scripts, change "set -x" to "set +x" to reduce output clutter. This can be changed back as necessary (e.g. for debugging).

Note that if `DEBUG` is set to `"TRUE"`, `VERBOSE` will get reset to `"TRUE"` if necessary in order to also print out all the `VERBOSE` messages.

## TESTS CONDUCTED:
Ran the WE2E test `grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2` as-is as well as with modifications to the default values of `VERBOSE` and `DEBUG`, as follows:
1. `grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2` as-is, i.e. using default values `VERBOSE="TRUE"` and `DEBUG="FALSE"`.
2. `grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2` modified with `VERBOSE="FALSE"` (and with default of `DEBUG="FALSE"`).
3. `grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2` modified with `DEBUG="TRUE"` (and with default of `VERBOSE="TRUE"`).
4. `grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2` modified with `DEBUG="TRUE"` and `VERBOSE="FALSE"` (which should get reset to `"TRUE"`).

All tests were successful. The experiment generation log files (`log.generate_FV3LAM_wflow.sh`) were compared and differed in the expected ways.

## DOCUMENTATION:
Necessary documentation of `DEBUG` is in `config_defaults.sh`. Created Issue [#640](https://github.com/NOAA-EMC/regional_workflow/issues/640) to also update rst documentation.
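The rule that `DEBUG="TRUE"` forces `VERBOSE` to `"TRUE"` might look roughly like the sketch below. The function name `reconcile_verbose` is hypothetical and is not taken from the workflow scripts; only the variable names `DEBUG` and `VERBOSE` and their stated defaults come from the description above.

```shell
# Hypothetical helper: print the effective VERBOSE value given DEBUG and VERBOSE.
# Defaults follow the description: DEBUG="FALSE", VERBOSE="TRUE".
reconcile_verbose() {
  local debug="${1:-FALSE}"
  local verbose="${2:-TRUE}"
  # When DEBUG is on, VERBOSE must also be on so that all VERBOSE messages print.
  if [ "$debug" = "TRUE" ] && [ "$verbose" != "TRUE" ]; then
    verbose="TRUE"
  fi
  printf '%s\n' "$verbose"
}

# Test case 4 from the list above: DEBUG="TRUE" with VERBOSE="FALSE".
DEBUG="TRUE"
VERBOSE="FALSE"
VERBOSE="$(reconcile_verbose "$DEBUG" "$VERBOSE")"
echo "DEBUG=${DEBUG} VERBOSE=${VERBOSE}"
```

Resolving the two flags once, early in experiment generation, lets later scripts check `VERBOSE` alone without also testing `DEBUG`.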
PR Checklist
This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.
This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR
An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR are specified below.
If new or updated input data is required by this PR, it is clearly stated in the text of the PR.
Instructions: All subsequent sections of text should be filled in as appropriate.
The information provided below allows the code managers to understand the changes relevant to this PR, whether those changes are in the ufs-weather-model repository or in a subcomponent repository. Ufs-weather-model code managers will use the information provided to add any applicable labels, assign reviewers and place it in the Commit Queue. Once the PR is in the Commit Queue, it is the PR owner's responsibility to keep the PR up-to-date with the develop branch of ufs-weather-model.
Description
Creates new input-data and BM_IC directories by removing un-used coupled inputs from FV3_input_frac and BM_IC-20210212. Moves BM_IC and BM7_IC from FV3_input_frac to new BM_IC-YYYYMMDD directory.
Two new input directories are currently staged in `/scratch1/NCEPDEV/stmp4/Denise.Worthen/input-data-20210630` and `/scratch1/NCEPDEV/stmp4/Denise.Worthen/BM_IC-20210630`. These were initially created by rsync-ing the input-data-20210614 and BM_IC-20210212 directories from the baseline area on 20210614 and then making changes as required to remove un-used inputs or reorganize current BM inputs. The date of 0630 is arbitrary.

* Copies of `nems.configure` and `model_configure` in all input directories were removed, and the copy-in of configure files in the fv3_conf/*_run.IN scripts for the standalone tests was removed. All standalone tests for both intel and gnu passed.
* Copies of `data_table` in all input directories were removed.
* New WW3 inputs are required for the BM_IC-YYYYMMDD directory for use in the 35d tests; these have been added from `/scratch2/NCEPDEV/climate/Jessica.Meixner/WW3ICGEFS/RestartFiles`.
* The P7 surface ICs have been corrected and verified to be the same as those in `prototype7-input-data-20210608/FV3_input_frac/BM7_IC`.
* The correct Merra2 ICs have been added to `FV3_input_data_INCCN_aeroclim/MERRA2`.
* The file `mom6_increment.nc` has been added to `MOM6_IC/100/2011100100`. No new input data should be needed for PR MOM6 IAU and atmos stochy restart test #668.
* The c384 ugwd fix files have been copied from `/scratch1/NCEPDEV/nems/emc.nemspara/RT/NEMSfv3gfs/prototype7-input-data-20210608/FV3_input_data384/INPUT_L127` to `FV3_input_data384/INPUT_L127`.

The current input-data-20210614 is 311G; the current BM_IC-20210212 is 195G.
The new input-data-20210630 is 121G; the new BM_IC-20210630 is 199G.
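The net effect of the sizes quoted above can be worked out with a few lines of shell arithmetic. The variable names are illustrative only; the four sizes are the ones stated in the description, in gigabytes.

```shell
# Sizes in GB, as quoted in the PR description.
old_input=311   # input-data-20210614
old_bmic=195    # BM_IC-20210212
new_input=121   # input-data-20210630
new_bmic=199    # BM_IC-20210630

old_total=$((old_input + old_bmic))
new_total=$((new_input + new_bmic))
saved=$((old_total - new_total))
# Integer percentage of the old total that is saved (truncated, not rounded).
saved_pct=$((saved * 100 / old_total))

echo "old total: ${old_total}G, new total: ${new_total}G"
echo "saved: ${saved}G (about ${saved_pct}% of the old total)"
```

The reduction comes entirely from the input-data directory; BM_IC grows slightly, from 195G to 199G.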
Issue(s) addressed
Fixes #638
Fixes #675
Fixes #680
Fixes #681
Fixes CICE #30
Fixes #647
Testing
Testing at commit f16dcb4 against develop-20210712 shows the following results:
The following fail because of the correction to the MERRA2 input data and the addition of AOD variables to the forecast files:
The following fail because of the correction to the MERRA2 input data:
The following fails because of the fix to the global_ca variable:
The following fails because of the fix to the surface ICs:
Tests were repeated after merging the CICE update PR and the same results were obtained.
NOTE: commit 330bb0e changed `dt_atmos` from 225s to 300s for all bmark_v16 tests. This will now change all bmark_v16 baselines.

How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)
Dependencies
PR #654
CICE PR #32
Icepack PR #5
NEMS PR #106