Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add running mean functions #724

Merged
merged 25 commits into from
Dec 3, 2021
Merged

Add running mean functions #724

merged 25 commits into from
Dec 3, 2021

Conversation

rgknox
Copy link
Contributor

@rgknox rgknox commented Mar 17, 2021

Description:

Add capacity to track running means in FATES. This adds the code to perform running means, and allows these variables to be attached at any FATES scale, including site, patch and cohort. Only the 24-hour vegetation temperature was converted over as of yet.

Fixes #722

Should be synchronized with: ESCOMP/CTSM#1304

Integrate after: #703

Collaborators:

@ckoven @Qianyuxuan Claire Zarakas @billsacks @dlawrenncar @ekluzek

Expectation of Answer Changes:

This should change answers, but my expectation is that it should change them subtly. The 24hour vegetation temperature that was previously calculated in CLM/ELM should now have similar, but different values now that we are allowing patch-persistence to influence the calculation. This value is used in mortality, fire and phenology.

Checklist:

  • I have updated the in-code documentation .AND. (the technical note .OR. the wiki) accordingly.
  • I have read the CONTRIBUTING document.
  • FATES PASS/FAIL regression tests were run
  • If answers were expected to change, evaluation was performed and provided

Test Results:

CTSM (or) E3SM (specify which) test hash-tag:

CTSM (or) E3SM (specify which) baseline hash-tag:

FATES baseline hash-tag:

Test Output:

…mory (ie simple moving average) in a window in favor of the exponential moving window, 2) added fixed window capabilities.
@rgknox
Copy link
Contributor Author

rgknox commented Mar 19, 2021

After discussions with @billsacks @dlawrenncar @ekluzek @ckoven , I've dropped support for a simple moving averages. The SMA requires that all independent points of memory over a window of time are maintained, we all agreed this would be too memory intensive. Now, moving windows only use the exponential moving average (EMA), which maintains 1 point of memory. This essentially updates a single point of memory with a weighting function where mean = (1-alpha)mean + alphaupdate_value. The alpha is the fraction of the window in which an update occurs (ie the frequency).
I also added code in to enabled fixed window averages, where the mean is zero'd periodically, the update weights are uniform over the window, and the mean is only relevant when the window is complete. Although, I do not have a test case for using the fixed window yet.
Overall, maintaining a running average now requires 3 pieces of scalar data, the current state of the mean, where (temporally) the process is within its window (progress), and the latest mean value (which alllows us to remember yesterday's mean after we have to re-zero our current mean for fixed windows).

main/FatesRunningMeanMod.F90 Outdated Show resolved Hide resolved
@rgknox
Copy link
Contributor Author

rgknox commented Mar 30, 2021

Code has been updated with the following:

  1. added a copy function that allows patches or cohorts with running means to pass on their memory to a newly spawned entity
  2. a 24-hour veg temp history variable that is discretized by patch age
  3. restart machinery

Testing this now, will report back

@rgknox
Copy link
Contributor Author

rgknox commented Mar 31, 2021

Ran a 20 year test at BCI using the default pft set. Simulations were initialized with inventory.
Comparing to base, there are subtle differences in forest biomass response.

rmean-comp_plots.pdf

Here are some plots for just the new branch with the running mean functions, which have a new 24hr veg temperature history variable:
rmean-rest_plots.pdf

@ckoven
Copy link
Contributor

ckoven commented Mar 31, 2021

plots looks reasonable, we'd expect there to be some small differences if we're calculating the respiration temperature slightly differently. thanks @rgknox!

@rgknox
Copy link
Contributor Author

rgknox commented Mar 31, 2021

well, respiration uses the instantaneous temperatures right now, so in theory, it should not be affected by these changes. I did however notice that the number of chill days (phenology) had a non-zero (strange) starting value for the base run, and was set to zero (expected) in the new run. But it still seemed strange to me that this would potentially have the effect that we are seeing. Maybe there is a small amount of fire happening?

@rgknox
Copy link
Contributor Author

rgknox commented May 11, 2021

ran a special test where I held the temperature controlling frost mortality to a constant 20 degrees, then compared this branch to its base, the results were then indistinguishable.

https://github.com/NGEET/fates/blob/master/biogeochem/EDMortalityFunctionsMod.F90#L177

timers2_plots.pdf

I also printed out the temperature on the first call to calculate frost mortality (in the base) and the value was -273 C. A little frosty. On the new branch the temperature is 23 C on the first timestep (at BCI).

@ckoven
Copy link
Contributor

ckoven commented May 12, 2021

thanks @rgknox. so then it sounds like there was a bug in the old code then, which you've now fixed. great!

Copy link
Contributor

@ckoven ckoven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rgknox, this looks great. A thing whose significance has only just occurred to me though, upon staring hard at the code just now, is whether we should be using the fixed window rather than the exponential window for the special case of the 24-hour smoothing?

My thinking on this is the following: because FATES's daily timestep is happening synchronized around the planet all at once, it will fall at a different time of day (locally) around the world. And because the exponential moving average still has some amplitude over any given cycle (~10% I think, based on playing around in jupyter), which we'd then be sampling at different phases of the diurnal cycle at different sites around the world, it means that the 24-hour mean temperature will have a longitudinally-dependent high or low bias relative to what we'd get with the fixed moving window. And maybe its not a big deal, but also a longitudinally-dependent bias strikes me as something we'd want to avoid?

So I guess I'm ok with pulling this in as-is for now, since we've had this bias all along and (I at least) just hadn't perceived it, but maybe we should put switching the tveg24 to the fixed window method on the to-do list as well? Unless it'd be a simple swap in this PR to change the method (which maybe it is, as the machinery that you've built to do so is really very nice)?

implicit none
private

integer, parameter :: maxlen_varname = 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see where this is used. is it?

main/FatesRunningMeanMod.F90 Outdated Show resolved Hide resolved
! over their construction period.


integer, public, parameter :: moving_ema_window = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you write in the comments here that 'EMA' refers to exponential moving average? It doesn't say so anywhere in the code right now (or at least I didn't see it).


! Define the time methods that we want to have available to us

class(rmean_def_type), public, pointer :: ema_24hr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define ema again here in comments?

main/FatesRunningMeanMod.F90 Outdated Show resolved Hide resolved
main/FatesRunningMeanMod.F90 Outdated Show resolved Hide resolved
@rgknox
Copy link
Contributor Author

rgknox commented May 12, 2021

@ckoven , your point about the longitudinal bias sounds real to me. We don't want some sites with more memory of yesterday's mid-day, and some sites with more memory of yesterday's night.

Changing the current 24 veg temp to a fixed window will be a good way to double check the coding on the fixed window. Will do.

rgknox and others added 4 commits May 12, 2021 10:14
Co-authored-by: Charlie Koven <ckoven@users.noreply.github.com>
Co-authored-by: Charlie Koven <ckoven@users.noreply.github.com>
Co-authored-by: Charlie Koven <ckoven@users.noreply.github.com>
@rgknox
Copy link
Contributor Author

rgknox commented May 13, 2021

I re-worked the code to test out using the fixed-window average. One thing I noticed, that kind of caught me off guard: When I set a model run-time that starts on the 0 hour, there are 2-half hour model steps in clm/elm before our first dynamics step. This was strange to me, as there should had been either 48 or 0 steps. This would effect how our windows are applied because by the time that fixed window average is completed, we will have to wait 1 hour until the new value is used, so it will be slightly out of date. I suppose 1 hour is not the worst lag, but still, did not expect this and this should be addressed.
So, tldr, I'm reviewing how the end-of-day flag is assessed when triggering a fates dynamics call.

@rgknox
Copy link
Contributor Author

rgknox commented May 13, 2021

We use the is_beg_curr_day() function to decide when a fates dynamics call occurs, here is the function:
https://github.com/ESCOMP/CTSM/blob/master/src/utils/clm_time_manager.F90#L1417-L1438

Note that it checks if the time of day tod is equal to the time-step. So this make sense with the timing I'm seeing. The first timestep, tod = 0. The second time-step, tod=dtime

@rgknox
Copy link
Contributor Author

rgknox commented May 14, 2021

@ckoven and I discussed the issue that 24-hour fixed window running means will represent a window that is lagged 1 hour from the time of the dynamics call and feel this is acceptable for the time being. Modifying the call sequence in the hlm driver could have significant unforseen consequences and make a mess, and would be better left to its own PR and investigation.

@glemieux glemieux added the ctsm An issue is related to ctsm host land model or a particular PR has a corresponding ctsm-side PR label May 26, 2021
This was referenced Jun 18, 2021
@glemieux
Copy link
Contributor

glemieux commented Oct 19, 2021

I merged up to dev061 on the ctsm side and started testing with the fates suites on izumi and cheyenne. The cheyenne tests are still running, but I'm seeing the ERS test types failing the RUN on both machines:

 3271 186:At line 1863 of file /glade/u/home/glemieux/ctsm/src/fates/main/FatesRestartInterfaceMod.F90
 3272 186:Fortran runtime error: Index '1' of dimension 1 of array 'rio_recrate_sift' above upper bound of 0
 3273 186:

I haven't had a chance to look into it much yet, but from the checking out the branch diffs against master in FatesRestartInterfaceMod, I don't see anything jumping out at me right away.

@rgknox
Copy link
Contributor Author

rgknox commented Oct 20, 2021

@glemieux , I can't see anything yet either... I don't think its anything specific to rio_recrate_sift, as that is just the first restart variable encountered. My guess is that the new variables have somehow messed with the indexing system used to assign the restart variables, and not their spatial coordinates, or the cohort ordering coordinates, because we really don't make any changes to that in the code... Still looking, still stumped.

@ckoven
Copy link
Contributor

ckoven commented Oct 20, 2021

@rgknox @glemieux Could the problem be that this moves the call to CLMFatesGlobals down from its original position here? https://github.com/ESCOMP/CTSM/pull/1304/files#diff-d4c6f8b5f27f4bc52e2f9f7fabb0b7e329cdc1291464b478c31b14aa9366b566L238

@rgknox
Copy link
Contributor Author

rgknox commented Oct 20, 2021

yes, that is the problem, thanks @ckoven

@rgknox
Copy link
Contributor Author

rgknox commented Oct 20, 2021

see: ESCOMP/CTSM#1304 (comment)

…ndant with patch-level given how the HLMs handle temperature physics
@rgknox
Copy link
Contributor Author

rgknox commented Oct 24, 2021

The following report compares the branch in this PR (as merged) against master (ctsm: dev61, fates:sci.1.48.0_api.17.0.0). It compares the annual mean biomass, annual maximum LAI, and annual mean biomass partitioned by PFT, for the 60th year of simulation (from a cold-start).

rmeans_v0-f10_f10_mapplots.pdf

The expectation is that the new averaging functions should change results.

@rgknox
Copy link
Contributor Author

rgknox commented Nov 30, 2021

I did a closer inspection of how this branch gets 24hour vegetation temperature, and compared it with base (master). I can see two differences: firstly, and this is a known bug, is that the master has a zero initalized 24-hour veg temperature in the first day. In the new branch, we initialize on a cold start at 15deg C. Secondly, the starts and stops of the 24 hour cycles are slightly out of phase.
Here are some plots of a short test I ran at BCI:

In this plot we see the initial conditions are way different:
tveg24_all

In this plot we zoom in on a few days and see the slight phase difference in where the 24 cycle starts and stops:
Figure_2

I'm now going to see if I can somehow force the 24-hour timer cycle to restart following dynamics (if its not already done). Either way, I think this branch is pretty close to being resolved and ready.

@rgknox
Copy link
Contributor Author

rgknox commented Dec 1, 2021

After reviewing the code and printing diagnostics, I verified that we are getting 24-hour means generated at the right time and they are in-phase with the dynamics step. I'm going to start regression testing.

@rgknox
Copy link
Contributor Author

rgknox commented Dec 1, 2021

Tests for the fates test suite are as expected, on cheyenne:

/glade/scratch/rgknox/tests_1201-124452ch

Will now run aux_clm

@rgknox rgknox merged commit b271492 into NGEET:master Dec 3, 2021
ekluzek added a commit to jedwards4b/ctsm that referenced this pull request Dec 9, 2021
This set of changes synchronizes CTSM with API 20 of FATES.  API 20 of FATES requires a modification to timing boundary conditions so that FATES can maintain its own internal running means.  FATES also no-longer asks CTSM for the 24hour vegetation temperature, since this is a patch(pft) level variable, and FATES has more suitable averaging mechanisms.  These changes are synchronized with FATES Pull Request: NGEET/fates#724, which subsequently created tag: https://github.com/NGEET/fates/releases/tag/sci.1.52.0_api.20.0.0
slevis-lmwg added a commit to slevis-lmwg/ctsm that referenced this pull request Dec 11, 2021
This set of changes synchronizes CTSM with API 20 of FATES.  API 20 of FATES requires a modification to timing boundary conditions so that FATES can maintain its own internal running means.  FATES also no-longer asks CTSM for the 24hour vegetation temperature, since this is a patch(pft) level variable, and FATES has more suitable averaging mechanisms.  These changes are synchronized with FATES Pull Request: NGEET/fates#724, which subsequently created tag: https://github.com/NGEET/fates/releases/tag/sci.1.52.0_api.20.0.0
olyson added a commit to fang-bowen/CTSM that referenced this pull request Dec 13, 2021
This set of changes synchronizes CTSM with API 20 of FATES.  API 20 of FATES requires a modification to timing boundary conditions so that FATES can maintain its own internal running means.  FATES also no-longer asks CTSM for the 24hour vegetation temperature, since this is a patch(pft) level variable, and FATES has more suitable averaging mechanisms.  These changes are synchronized with FATES Pull Request: NGEET/fates#724, which subsequently created tag: https://github.com/NGEET/fates/releases/tag/sci.1.52.0_api.20.0.0
@glemieux glemieux mentioned this pull request Dec 14, 2021
5 tasks
@rgknox rgknox deleted the fates_runmean_vars branch October 31, 2023 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ctsm An issue is related to ctsm host land model or a particular PR has a corresponding ctsm-side PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FATES internal running means
3 participants