Small updates to output variables, addition of FATES_FRACTION variable #854

Merged
glemieux merged 10 commits into NGEET:master from history_interface_ilamb
Jul 15, 2022

adrifoster
Contributor

This updates a few output variable names and metadata to fix some typos that were found, and adds a new variable, FATES_FRACTION, which is the fraction of the HLM gridcell occupied by FATES.

Description:

We create an hio_fates_fraction_si variable, which is set to 1.0 on FATES columns and will be zero (see below) on non-FATES columns. The gridcell average is then the total gridcell FATES fraction. (see here)

Because of our previous history interface update which flushes all FATES variables to the hlm_hio_ignore_val, we needed to flush this specific variable to zero for this method to work. I added an optional flush_to_zero argument to the set_history_var subroutine (here) which will prompt the subroutine to flush that variable to zero.
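To illustrate why the flush value matters, here is a minimal sketch of the averaging argument. It is not FATES or host-model code: `IGNORE_VAL`, `gridcell_average`, and the column weights are hypothetical, and the sketch assumes the host skips ignore-valued entries when averaging.

```python
# Hypothetical stand-ins; none of these names come from FATES or the host model.
IGNORE_VAL = 1.0e36  # assumed placeholder for an "ignore" fill value


def gridcell_average(values, weights, ignore_val=IGNORE_VAL):
    """Weighted mean over columns, skipping entries equal to ignore_val."""
    pairs = [(v, w) for v, w in zip(values, weights) if v != ignore_val]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0.0:
        return ignore_val
    return sum(v * w for v, w in pairs) / total_weight


# Three columns in one gridcell: two run FATES (60% of the area), one does not.
weights = [0.25, 0.35, 0.40]
is_fates = [True, True, False]

# Non-FATES columns flushed to zero versus flushed to the ignore value.
flushed_to_zero = [1.0 if f else 0.0 for f in is_fates]
flushed_to_ignore = [1.0 if f else IGNORE_VAL for f in is_fates]

fraction_zero = gridcell_average(flushed_to_zero, weights)
fraction_ignore = gridcell_average(flushed_to_ignore, weights)
```

Under these assumptions, flushing to zero gives an average equal to the FATES area fraction (about 0.6), while flushing to the ignore value drops the non-FATES column from the average and always gives 1.0.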

Collaborators:

@ckoven

Expectation of Answer Changes:

None; the only change should be an additional variable and some small changes to variable names and metadata.

Checklist:

  • My change requires a change to the documentation.
  • I have updated the in-code documentation .AND. (the technical note .OR. the wiki) accordingly.
  • I have read the CONTRIBUTING document.
  • FATES PASS/FAIL regression tests were run
  • If answers were expected to change, evaluation was performed and provided

Test Results:

CTSM (or) E3SM (specify which) test hash-tag:

CTSM (or) E3SM (specify which) baseline hash-tag:

FATES baseline hash-tag:

Test Output:

@glemieux
Contributor

glemieux commented Jun 1, 2022

I'm planning on coordinating this PR with ESCOMP/CTSM#1515 to update the history variable names in the ctsm test mods.

Contributor Author

@adrifoster adrifoster left a comment

Good catch!!! Thank you!!

@glemieux
Contributor

glemieux commented Jun 1, 2022

Aside from the expected NLCOMP and FIELDLIST DIFFs, all expected fates tests pass b4b.

  • cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr854_ctsm1515-fates
  • izumi: /scratch/cluster/glemieux/ctsm-tests/tests_pr854_ctsm1515-fates

UPDATE: ERS_D_Ld5.1x1_brazil.I2000Clm50FatesCruRsGs.izumi_nag.clm-FatesColdDefHydro was incorrectly marked as expected failure. See ESCOMP/CTSM#1525 for further details. I should make certain that this PR wasn't the cause of the "new" COMPARE_base_rest failure mode, however unlikely.

Test results for aux_clm will be reported on ESCOMP/CTSM#1515

@glemieux
Contributor

glemieux commented Jun 2, 2022

The failure of the izumi version of ERS_D_Ld5.1x1_brazil.I2000Clm50FatesCruRsGs.izumi_nag.clm-FatesColdDefHydro is indeed due to this PR. Reviewing the DIFF shows that it is failing with an error similar to #701 (note that the SCPF suffix was recently changed to SZPF):

 FATES_ERRH2O_SZPF   (lndgrid,fates_levscpf,time)  t_index =      6     6
          2      156  (     0,    41,     1) (     0,    54,     1) (     0,    41,     1) (     0,    28,     1)
                 156   3.258187497579002E-09  -3.006245396560824E-16 3.3E-09  3.258187497579002E-09 1.3E-02  5.553617565823288E-10
                 156   2.259984064920486E-15  -3.006245396560824E-16          2.259984064920486E-15          3.389469836376067E-17
                 156  (     0,    41,     1) (     0,    54,     1)
          avg abs field values:    2.444583598049110E-11    rms diff: 2.6E-10   avg rel diff(npos):  1.3E-02
                                   2.229240666610201E-17                        avg decimal digits(ndif):  0.0 worst:  0.0
 RMS FATES_ERRH2O_SZPF                2.6463E-10            NORMALIZED  2.1650E+01
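As a reading aid for the listing above, the NORMALIZED value appears consistent with the RMS difference divided by the mean of the two "avg abs field values" entries. This is an observation about the printed numbers, not a statement of how the comparison tool defines the quantity; the variable names below are made up for the sketch.

```python
# Values copied from the diff listing; names are illustrative only.
rms_diff = 2.6463e-10
avg_abs_field_1 = 2.444583598049110e-11
avg_abs_field_2 = 2.229240666610201e-17

# Assumed normalization: RMS difference over the mean of the two average magnitudes.
mean_avg_abs = 0.5 * (avg_abs_field_1 + avg_abs_field_2)
normalized = rms_diff / mean_avg_abs
print(f"{normalized:.2f}")
```

The result is close to the reported 2.1650E+01, which means the RMS difference is roughly 20 times the typical field magnitude: the two fields differ far beyond roundoff.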

Interestingly, this isn't an issue with the intel version on Cheyenne (which is where the original issue was discovered). I will run this test on the intel compiler on izumi to rule out machine differences.

@glemieux
Contributor

glemieux commented Jun 2, 2022

The intel version of this test was a bust. It's failing out very early due to some ESMF io errors. Output is here:
/scratch/cluster/glemieux/ctsm-tests/tests_pr854_ctsm1515-comparebasecheck-intel.
@ekluzek is this something specific with izumi and esmf?

This set of changes allows the harvesting module to pass products back to the host. API modifications are still required in CLM, but this feature should work fully with ELM.
@glemieux
Contributor

glemieux commented Jul 6, 2022

Aside from the expected NLCOMP and FIELDLIST differences, nearly all tests are passing b4b:

  • Izumi: /scratch/cluster/glemieux/ctsm-tests/tests_pr854-fates
  • Cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr854_fates

The one test not b4b is failing to run on Izumi:
ERS_D_Ld5.1x1_brazil.I2000Clm50FatesCruRsGs.izumi_nag.clm-FatesColdDefHydro.GC.pr854-fates_nag
The error report is:

[0] Runtime Error: *** Arithmetic exception: Floating overflow - aborting
[0] /home/glemieux/ctsm/src/main/ncdio_pio.F90.in, line 2039: Error occurred in NCDIO_PIO:NCD_IO_2D_DOUBLE
[0] /home/glemieux/ctsm/src/main/histFileMod.F90, line 3581: Called by HISTFILEMOD:HFIELDS_WRITE
[0] /home/glemieux/ctsm/src/main/histFileMod.F90, line 4099: Called by HISTFILEMOD:HIST_HTAPES_WRAPUP
[0] /home/glemieux/ctsm/src/main/clm_driver.F90, line 1440: Called by CLM_DRIVER:CLM_DRV
[0] /home/glemieux/ctsm/src/cpl/nuopc/lnd_comp_nuopc.F90, line 893: Called by LND_COMP_NUOPC:MODELADVANCE
[0] /home/glemieux/ctsm/components/cmeps/cime_config/../cesm/driver/esmApp.F90, line 141: Called by ESMAPP
[0] [i041.cgd.ucar.edu:mpi_rank_0][error_sighandler] Caught error: Aborted (signal 6)

UPDATE: After writing out the variable name during the routine calls above, it looks like the issue is again with FATES_ERRH2O_SZPF.

@glemieux
Contributor

glemieux commented Jul 7, 2022

@rgknox the issue appears to be with ccohort_hydr%errh2o. For certain iscpf indices this variable is in the E+180 range and higher, which I think is causing the overflow. When I write out the intermediate variables that go into calculating the error, nothing appears to approach those values. My guess is that, since errh2o isn't initialized to any particular value, some cohort never has this value calculated and ends up using a random garbage value. Thoughts?
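A minimal sketch of the kind of check and defensive default this hypothesis suggests. The values, the `limit` threshold, and the helper name are all hypothetical, and this is not necessarily the approach any later commit takes.

```python
import math


def suspicious_indices(values, limit=1.0e30):
    """Return indices whose value is non-finite or larger in magnitude than limit."""
    return [i for i, v in enumerate(values)
            if not math.isfinite(v) or abs(v) > limit]


# Hypothetical per-bin water-balance errors; index 2 mimics a slot that was
# never assigned and therefore holds a garbage value.
errh2o_by_bin = [3.3e-09, -3.0e-16, 4.7e+180, 0.0]
bad = suspicious_indices(errh2o_by_bin)

# One defensive pattern: start every slot from a known value, so a skipped
# calculation leaves 0.0 behind rather than garbage.
initialized = [0.0] * len(errh2o_by_bin)
still_bad = suspicious_indices(initialized)
```

Here `bad` flags only the slot holding the garbage value, and `still_bad` is empty once every slot starts from a known value.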

@glemieux glemieux linked an issue Jul 8, 2022 that may be closed by this pull request
@glemieux
Contributor

glemieux commented Jul 9, 2022

After retesting with 9d9c192 applied, the ERS hydro test now passes on Izumi with the nag compiler; the commit also fixes #701. The other Izumi hydro test is no longer b4b against the latest baseline due to this update. All Cheyenne tests have the same results as noted above in #854 (comment).

File locations:

  • Izumi: /scratch/cluster/glemieux/ctsm-tests/tests_pr854-fates2
  • Cheyenne: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr854-fates-errh20fix

@glemieux glemieux merged commit def6b3e into NGEET:master Jul 15, 2022
@adrifoster adrifoster deleted the history_interface_ilamb branch May 10, 2023 19:49
Development

Successfully merging this pull request may close these issues.

FATES_ERRH2O_SCPF fails COMPARE_base_rest
4 participants