Add the mosart runoff model #526
Add the mosart runoff model and two new land->runoff coupling fields.

This version of mosart is identical to mosart1_0_00 in the NCAR CESM repository and the modifications match the changes in the CESM exp_tag mosart01_cesm1_3_beta04.

Added the land to runoff coupling fields Flrl_rofgwl and Flrl_rofsub. These are glacier/wetland/lake runoff and subsurface runoff. Both are liquid water fluxes. In CLM, these are currently set to zero. The two fields are passed through the coupler with Flrl_rofl and Flrl_rofi. Both rtm and mosart receive these fields and add them to the rofl flux term at present.

This commit provides the ability to run mosart and to couple the two new fields. Some science development is still required in CLM and mosart to fully leverage the new capabilities.

Three new compsets were added to support testing of mosart: IMCN, BM1850C5, and BM1850CN. These are identical to ICN, B1850C5, and B1850CN except mosart is used instead of rtm.

Development and testing were done on titan only. The following tests were run to verify technical correctness of the mosart implementation.

* CME.T31_g37.IMCN.titan_pgi
* ERS_D.T31_g37.BM1850C5.titan_pgi
* ERS.T31_g37.BM1850C5.titan_pgi
* NCK.T31_g37.BM1850C5.titan_pgi
* NCK.T31_g37.IMCN.titan_pgi
* SMS_D_E.T31_g37.IMCN.titan_pgi
* SMS_E.T31_g37.IMCN.titan_pgi
* SMS.T31_g37.IMCN.titan_pgi
* CME.f19_g16.X.titan_pgi

The acme-development test suite was also run and shown to be BFB with the starting ACME version.

There is also a bug fix in component_mod.F90 to fix an error in the esmf coupling interface that is unrelated to mosart.

[BFB]
This change syncs up the ACME+mosart version with the CESM mosart03_cesm1_4_beta03 version. In this version, mosart exact restart is fixed, Flrl_rofl is renamed to Flrl_rofsur, and Flrl_volrmch is added. There are a few other changes to mosart science as well.

These changes produce identical bit-for-bit results with respect to climate. Some of the runoff coupling field names have been changed and a few new coupling fields have been added. This may impact the comparison results, but the results should be identical when comparing.

LG-19
* Add WRM source code
  * new file: models/rof/mosart/src/wrm/WRM_modules.F90
  * new file: models/rof/mosart/src/wrm/WRM_read_print.F90
  * new file: models/rof/mosart/src/wrm/WRM_returnflow.F90
  * new file: models/rof/mosart/src/wrm/WRM_start_op_year.F90
  * new file: models/rof/mosart/src/wrm/WRM_subw_IO_mod.F90
  * new file: models/rof/mosart/src/wrm/WRM_type_mod.F90
* Add NLDAS resolution, datm mode, and IM_2000_CN_NLDAS compset
* Add constance machine port

WRM is not yet fully validated and is off by default. This is just an initial commit to provide NLDAS capability and to allow some code review. This is BFB with prior versions with WRM off.
fix parallelization bug in nUp sum for non-basin decompositions

refactor mosart to improve performance
- loop restructuring
- reduce number of operations
- improve performance of expensive math ops (sqrt, **, sin)

[Non-BFB]
- code cleanup of rtmini and rtmrun
- works with mosart input files with scrambled IDs
- moved dto term into rtmrun
- added direct-to-outlet transfer capability
- removed a bunch of old rtm code
- fixed esmf interfaces and tested in DEBUG mode
- added budget calculation (still being validated)
- has a known exact restart error that introduces a roundoff difference at the first timestep at a handful of gridcells. This is probably not going to impact science and will be fixed next.

[Non-BFB] LG-20
Code cleanup to remove dead code and make the code clearer. More could be done. Update the direct output computation. Add budget diagnostics to verify conservation. In short tests, the model conserves except in the frozen term in the euler solver. There is still a small issue there that needs further diagnosing. Reviewed history and restart files and checked many fields. Updated the MOSART input file to adjust the rotated Antarctica.

LG-20 [Non-BFB]
- new input file
- new direct terms
- new areatot
- update history files

[Non-BFB] LG-20
[BFB] LG-20
to MOSART_Global_half_20151205a.nc [Non-BFB] LG-20
Is this PR BFB?
It should be BFB for configurations that don't use mosart. mosart is not currently part of the standard test suite. There may be some differences because we modified some of the coupling fields' names, but the results should ultimately be identical when running rtm. When you use mosart, you will have a new configuration with no baselines, and mosart will not be bit-for-bit with rtm.
Added 'BFB'.
@jgfouca, @rljacob, @douglasjacobsen: The merge of this PR produces a relatively small number of conflicts, but I have a few issues that I need help with.
@apcraig : I'm trying to determine if the changes made to
- The files are reverted to ba21605, which was the hash of the master when this feature branch was created.
So, first you'll need to

But for the other conflicts... I can't tell completely, but it looks like the lnd_import_export.F90 file is more like the CLM4.0 version than the CLM4.5 version.

I think the changes in config_definition.xml should be pretty easy to "re-do" in the merge, in the right place. It looks like this is the diff: https://gist.github.com/douglasjacobsen/3dda495c3234140507a5

Here is the diff for the machines changes (i.e. config_compilers.xml, etc): https://gist.github.com/douglasjacobsen/5399fa13e64bb32d3bdd

Those also look either relatively small, or unneeded (I don't know what constance and cab are). But it seems like it also deletes things like hera, which I also am not sure if we need or not.

A bit about why git bailed on merging those files though. The way git does inexact rename matching is by trying to compare the new file against all old files in the repo, and see which file it's most similar to. If it can't find a file that it thinks it's similar enough to, it won't try to "rename" it. Additionally, if it finds multiple files that it's equally similar to, it won't rename it. In this case, I'm not exactly sure which the reason is, but it's probably one of those two.
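The rename matching described above can be sketched roughly as follows. This is only an illustration of the two failure modes in the comment (nothing similar enough, or a tie between candidates), not git's actual implementation: the similarity score here is Python's `difflib` ratio standing in for git's own metric, the 50% threshold is git's documented default for rename detection, and the function names and file paths are hypothetical.

```python
from difflib import SequenceMatcher

# git's default rename-detection threshold is 50% similarity.
SIMILARITY_THRESHOLD = 0.5


def similarity(a, b):
    """Stand-in similarity score in [0, 1]; git uses its own metric."""
    return SequenceMatcher(None, a, b).ratio()


def detect_rename(new_content, old_files, threshold=SIMILARITY_THRESHOLD):
    """Return the old path the new file best matches, or None.

    old_files maps a (hypothetical) old path to that file's content.
    """
    scores = {path: similarity(new_content, old) for path, old in old_files.items()}
    if not scores:
        return None
    best = max(scores.values())
    if best < threshold:
        return None  # nothing is similar enough, so no rename is recorded
    winners = [path for path, score in scores.items() if score == best]
    if len(winners) > 1:
        return None  # tie between candidates, so no rename is recorded
    return winners[0]


# Hypothetical file contents, for illustration only.
old_files = {
    "old/rtmrun.F90": "subroutine rtmrun\n  call route\nend subroutine\n",
    "old/config.xml": "<config>\n  <entry id=\"ROF\"/>\n</config>\n",
}
new_file = "subroutine rtmrun\n  call route\n  call budget\nend subroutine\n"

matched = detect_rename(new_file, old_files)
unmatched = detect_rename("0000000000", old_files)
tied = detect_rename("same text", {"a.F90": "same text", "b.F90": "same text"})
```

Here `matched` names the Fortran file, while `unmatched` and `tied` are both `None`, which corresponds to the two cases where the merge would leave the file alone instead of following a rename.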
@douglasjacobsen : Thanks for the detailed explanation.
@bishtgautam
@apcraig : Thanks. I will add clm mods in both clm40 and clm45.
@apcraig : It appears that this is a non-BFB PR. There are differences in
Below are test logs of four tests done on Titan. https://gist.github.com/bishtgautam/6629b19c5dfa205f9778
We will need to look at the results carefully. There are some new coupling fields, which might be the reason the compare is flagging those fields. If the results are not bit-for-bit, then we should carefully review the mods. There were some changes to the way the clm fields are passed out of clm: fields were split into different terms. That could introduce a roundoff difference, but I think in my testing, that did not occur. Is there a way for me to help review the code changes?
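As a small illustration of why splitting one field into several terms can introduce a roundoff difference: floating-point addition is not associative, so the same terms added in a different order can differ in the last bit. The variable names below only echo the coupling fields mentioned in this PR, and the values are made up to expose the effect; this is not code from CLM, rtm, or mosart.

```python
# Hypothetical liquid runoff terms at one gridcell (made-up values).
rofsur = 0.1
rofgwl = 0.2
rofsub = 0.3

# Terms combined in one order, e.g. summed before being exported as one field.
combined_first = (rofsur + rofgwl) + rofsub

# Same terms passed separately and added in a different order by the receiver.
combined_later = rofsur + (rofgwl + rofsub)

# Nonzero, but only at the level of the last bit of a 64-bit float.
difference = combined_first - combined_later
```

Whether a given case is bit-for-bit therefore depends on the actual values and the order of operations, which is consistent with the same change being BFB in some tests and not in others.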
Hi @apcraig, A fully coupled case (1850_CAM5_CLM45%CN_MPASCICE%SSMI_MPASO_RTM_SGLC_SWAV) failed the BFB test. The BFB failure wasn't unusual, as we had already determined that this PR had non-BFB code modifications. But, surprisingly, the test still failed BFB even after I made the above-mentioned modifications to lnd_import_export.F90. The tar files for cases are available on NERSC at /project/projectdirs/acme/gbisht/pr-526-mosart/cases:
Any idea why SMS.ne30_m120.A_B1850CN.corip1_intel.C.20151223_155947 had a BFB failure? Note: Each of the three cases has two log files. The second log file in each case corresponds to when the case was run with info_debug=2.
i'll try to have a look in the next few days, but i may not get to it tony........
gautam, i want to be able to duplicate your tests. can you tell me exactly what
thanks, tony..........
Hi Tony, I generated the baseline with
Gautam, Can you check permissions on /project/projectdirs/acme/gbisht/pr-526-mosart/cases? looks like pr-526-mosart is not readable. tony........
Hi Tony, I have changed permissions to those directories. Please try again.
looking at the log files, the first real divergence in "G" vs "C-155" is in the ocean model after the first ocean coupling period (hour 2). that difference seems to be very small in just a couple fields and is almost certainly roundoff. Based on the global sums, all the forcing TO the ocean model before that coupling period is identical in the two runs. That likely means that there are a few gridpoints that are roundoff different in the ocean forcing that the global sums are not picking up, but that are producing a tiny difference in the ocean solution.

The other interesting thing about "G" vs "C-142" is that the ocean model diverges only on the third hour. The difference between C-155 and C-142 is that in one case, the sum of the land fields is done on the land side and in the other it's done on the runoff side. This seems to be enough to change answers by roundoff, with C-142 actually being closer (with respect to ocean forcing) to the G case.

I would argue that the lnd-runoff-ocean coupling in these cases is fine. The differences are tiny and consistent with roundoff. The changes to the sum that we hoped would be bit-for-bit are not in this case even though they were in other cases. That makes sense. That mod is not guaranteed to be bit-for-bit; the fact that it was before was helpful. What we're seeing now though is roundoff differences between the three cases in the lnd-rof-ocean coupling, probably just at a few gridcells, and that's

Having said all that, the most worrisome issue is the fact that the atm diverges in all three simulations on the second hour in a way that's MUCH larger than the land-rof-ocean coupling. That seems to have nothing to do with the runoff coupling and I do not understand it. If you look at the 2 atm.log files in any run, you actually see that the two runs do NOT produce the same diagnostics bit-for-bit in the atm.log

So, in summary, I would say the land-runoff-ocean coupling is fine as implemented. We can see that coupling separately from the atm divergence in the log files for at least the first few coupling periods. The land-runoff-ocean solutions are nearly identical in the three cases, are initially roundoff different at a few gridcells, and that divergence grows. That seems fairly clear. But separate from that, there seems to be a huge problem in the atm model with reproducibility unrelated to the land-runoff-ocean

Let me know if you have any questions. thanks, tony........
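The point above that global sums can agree while a few gridpoints differ by roundoff can be shown with a toy case. This is only a sketch with contrived values and a naive left-to-right sum; the function and variable names are hypothetical and it does not reproduce how the coupler actually computes its diagnostics.

```python
def global_sum(field):
    """Naive left-to-right sum, standing in for a global diagnostic."""
    total = 0.0
    for value in field:
        total += value
    return total


# Two hypothetical forcing fields that differ at one gridcell by a tiny amount.
run_a = [1.0e16, 0.25, 0.25]
run_b = [1.0e16, 0.25 + 1.0e-12, 0.25]

sum_a = global_sum(run_a)
sum_b = global_sum(run_b)

# Indices where the two runs are not bit-for-bit identical.
differing_cells = [i for i, (a, b) in enumerate(zip(run_a, run_b)) if a != b]
```

In this toy case `sum_a` and `sum_b` are bit-for-bit equal because the small values are lost against the large one, yet `differing_cells` still reports one gridcell. A point-by-point comparison of the fields is therefore a stronger check than matching global sums.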
The non-BFB differences in the atmosphere must be related to the merge of this branch, right? Could the use of MPAS-O be an issue? Could the small changes in the ocean be causing large changes in the atmosphere?
@bishtgautam : your comparison of 055722c with 865bc59 could be comparing models with different atmospheres? What about comparing 7d3a782 with its branch point from master? (git has a command that will tell you the common parent of 7d3a782 and master). Then you could isolate the BFB changes introduced just by this code. That comparison may remove the atmosphere differences Tony mentioned, and verify that the branch only has the expected BFB differences. (just a suggestion based on reading Tony's message - please ignore if I'm missing something).
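The git command alluded to above is presumably `git merge-base`, which reports the best common ancestor of two commits. A minimal sketch of what that means on a toy history is below; the commit labels and the helper names are hypothetical and are not the hashes from this PR.

```python
def ancestors(commit, parents):
    """All commits reachable from `commit` (inclusive) via parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def merge_bases(a, b, parents):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other, parents) for other in common)
    }


# Toy history: the feature branch forked from master at commit "m1".
toy_parents = {
    "base": [],
    "m1": ["base"],
    "m2": ["m1"],
    "master": ["m2"],
    "f1": ["m1"],
    "feature": ["f1"],
}

branch_point = merge_bases("feature", "master", toy_parents)
```

Comparing the feature branch against `branch_point` rather than against the current tip of master isolates the changes made on the branch from unrelated changes that landed on master in the meantime.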
055722c is one of the parents of 865bc59.
Since no change to atmosphere code was made in this PR, I expect the atmosphere model to be exactly the same in 055722c and 865bc59. Thus, the non-BFB differences are baffling to me.
My main concern is that each case was run twice, once with
tony.......
Hi Tony,
Ok, I now understand the point you are making.
To test this out, how about using only 055722c to do the following:
Baseline_1: Generate the baseline with
Comparison_1: Compare with Baseline_1 with
What do you think?
info_debug should not change answers. if it does, there is
baseline_1 and baseline_2 should be identical and in some ways,
thanks, tony......
One related issue that may be relevant is that coupled cases that still used POP and CICE did not have any BFB differences with baselines after adding the mosart pieces. That's what I recall from the cdash results, although it's hard to trace. Might be worth checking BFB with baselines using the above 2 commits, an 1850_CAM5_CLM45%CN_CICE_POP2_RTM_SGLC_SWAV compset and ne30_g16 resolution. The atmosphere consistently passes ERS tests, which should catch any problems with reproducibility.
Hi Tony, My jobs on Cori have been sitting in the queue. Today I'm moving my testing to Titan and will keep you updated.
These changes add a new model, mosart, as a runoff model.
The changes to the scripts and coupling support those changes.
In addition to the new source code under mosart, there were also
small changes to rtm for consistency and a few changes to the coupling fields.
A handful of new compsets were created to run with mosart, but
additional compsets will be required as needed for science.

LG-20
[non-BFB]

Conflicts:
  cime/components/data_comps/datm/bld/namelist_files/namelist_definition_datm.xml
  cime/scripts/Tools/config_compsets.xml
  components/clm/bld/namelist_files/namelist_defaults_clm4_0.xml
  models/lnd/clm/src/cpl/lnd_import_export.F90
  scripts/ccsm_utils/Case.template/config_definition.xml
@jgfouca : With this PR merged into master, I expect the following tests to fail on master during our nightly tests:
With a DIFF?
Yes, with DIFF
Changes to F90 files are suspect. I just went with the CSEG side for all conflicts. Config archive was changed to support MOSART on both sides, but in very different ways. Again, went with CSEG.

* acme_master: (25 commits)
  bless_test_results: Finally add a full regression test for this tool
  Fix reference to unassigned variable.
  Better use of return codes in key ACME tools.
  Fix failing regression test.
  Add gnu modules to env for redsky and skybridge
  Add total test time to CDash field 'OS Version'
  Merge branch 'apcraig/mosart/add-mosart' (PR #526)
  Fixing odd number of tasks to handle
  Further fixes to the PEA test
  Add dynamic ozone treatment to ATMMOD compset
  Update r2o mapping file for ne30_ec60 grid
  Fixing link issues on Cetus+Mira after Albany build changes
  Turn on baseline comparison for redsky
  Implementation of linoz_mam4_resus_mom and hooks for v1.
  Merge branch 'singhbalwinder/atm/gold2718-new-infrastructure'(PR #343)
  Fixing PEA_P1_M.f45_g37_rx1.A.edison_intel
  Updates to BUILD_THREADED
  Updating default walltimes to use Edison's values
  Slurm time and partition directives
  Fixing cray-mpich module version to 7.3.0
  ...

Conflicts:
  driver_cpl/driver/prep_rof_mod.F90
  driver_cpl/driver/seq_diag_mct.F90
  driver_cpl/shr/seq_flds_mod.F90
  machines/config_pes.xml
  scripts-python/jenkins_generic_job
  scripts/Testing/Testcases/config_tests.xml
  scripts/Tools/case.build
  scripts/Tools/cesm_setup
  scripts/Tools/config_archive.xml
  scripts/Tools/config_definition.xml
  scripts/Tools/config_grid.xml
  scripts/Tools/st_archive
  scripts/create_newcase
  utils/perl5lib/Batch/BatchMaker.pm