Add ocean wetting-and-drying ramp feature #46

cbegeman
Collaborator

@cbegeman cbegeman commented Mar 31, 2023

This PR enhances the existing wetting-and-drying algorithm using an approach from O'Dea et al. 2020 (doi:10.1016/j.ocemod.2020.101708). Instead of zeroing out normalVelocity and its tendencies in cells where the projected layerThickness drops below the user-defined minimum, this method ramps normalVelocity and its tendencies down over a range of layerThickness values.
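
To illustrate the idea of a ramp, here is a minimal Python sketch. The linear shape, the function name `ramp_factor`, and the parameters `h_min` and `h_max` are assumptions made for illustration only; the description above does not state the functional form or the option names used in MPAS-Ocean.

```python
def ramp_factor(layer_thickness, h_min, h_max):
    """Return a multiplier in [0, 1] for the velocity and its tendencies.

    Hypothetical linear ramp: 0 at or below h_min, 1 at or above h_max.
    """
    if h_max <= h_min:
        raise ValueError("h_max must be greater than h_min")
    frac = (layer_thickness - h_min) / (h_max - h_min)
    # Clamp so that the factor never leaves the [0, 1] interval.
    return min(1.0, max(0.0, frac))


# Abrupt cutoff (the previous behavior) versus the ramp, for one thickness.
thickness = 1.5
cutoff = 0.0 if thickness < 1.0 else 1.0
ramped = ramp_factor(thickness, h_min=1.0, h_max=2.0)
```

With an abrupt cutoff the multiplier jumps between 0 and 1 at the minimum thickness, whereas the ramp changes it gradually across the chosen range.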

[BFB][stealth]

Jessica F Needham and others added 12 commits March 6, 2023 15:53
The reprosum algorithm converts each floating point summand into a
vector of integers representation. These vectors are summed locally
first, and then globally (using an mpi_allreduce). All of this is
exact and reproducible. The final step is to convert the final sum, in
the vector of integers representation, back into a floating point
value.

For this last step, each component integer in the vector
representation is converted back into a floating point value (call it
a_{i} for the ith element of the vector), and then these values are
summed locally, smallest to largest, to get the final value. The
integer representation is first massaged so that all a_{i} have the
same sign and the range of values is usually not overlapping (least
significant binary digit in abs(a_{i}) represents a value larger than
that represented by abs(a_{i+1}) in most cases, but sometimes is equal
to).
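
As a rough illustration of why an integer representation gives an order-independent sum, the following Python sketch scales each summand to an exact integer, adds the integers, and rounds once at the end. It is a simplification, not the reprosum code: Python's arbitrary-precision integers stand in for the vector of integers, and the function name is hypothetical.

```python
import math
import random


def reproducible_sum(values):
    """Sum doubles exactly through an integer representation, rounding once."""
    nonzero = [v for v in values if v != 0.0]
    if not nonzero:
        return 0.0
    # Smallest binary exponent among the summands (x = m * 2**e, 0.5 <= |m| < 1).
    min_exp = min(math.frexp(v)[1] for v in nonzero)
    # After scaling by 2**shift every summand is an exact integer.
    # (Assumes a moderate dynamic range so that the scaling cannot overflow.)
    shift = 53 - min_exp
    total = sum(int(math.ldexp(v, shift)) for v in nonzero)
    # Single rounding: exact integer sum back to a double.
    return math.ldexp(float(total), -shift)


random.seed(0)
vals = [random.uniform(-1.0e6, 1.0e6) for _ in range(1000)]
first = reproducible_sum(vals)
random.shuffle(vals)
second = reproducible_sum(vals)  # identical to `first` despite the new order
```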

When constructing the final floating point value, round-off error can be
introduced in the least significant digit of the final sum. This level
of inaccuracy has no practical impact on overall model accuracy, but
it can potentially lead to loss of reproducibility with respect to the
numbers of MPI tasks and OpenMP threads. The issue is that the set of
floating point values generated from the vector of integers will vary
with the number of components in the integer vector and the implicit
exponents associated with each component, which are functions of,
among other things, the number of MPI tasks and number of OpenMP threads.
Any round-off error will be sensitive to the specific floating
point values being summed, even though the sum would be invariant with
respect to exact arithmetic. This nonreproducibility has been observed
when changing an 8-byte integer vector representation to a 4-byte
integer vector representation (which also changes the vector length
and associated implicit exponents), but only in stand-alone contrived
examples and never in E3SM. However, we cannot rule out
nonreproducibility occurring in practice (without a more detailed
analysis of the range of sums likely to occur in E3SM simulations).
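
The sensitivity described above can be reproduced with a small contrived example. In the Python sketch below, two different sets of non-overlapping floating point pieces represent exactly the same value, yet summing each set smallest to largest gives results that differ in the last bit. The particular values are chosen for illustration and are not taken from E3SM.

```python
from fractions import Fraction


def sum_smallest_to_largest(pieces):
    """Plain left-to-right floating point sum, smallest magnitude first."""
    total = 0.0
    for p in sorted(pieces, key=abs):
        total += p
    return total


# Two decompositions of the same exact value 2**60 + 2**30 + 2**7 + 2**-45.
pieces_a = [2.0**7 + 2.0**-45, 2.0**60 + 2.0**30]
pieces_b = [2.0**-45, 2.0**30 + 2.0**7, 2.0**60]

# Fractions are exact, so this confirms that both sets encode the same number.
same_exact_value = sum(map(Fraction, pieces_a)) == sum(map(Fraction, pieces_b))

sum_a = sum_smallest_to_largest(pieces_a)  # one rounding of the exact value
sum_b = sum_smallest_to_largest(pieces_b)  # rounds twice (double rounding)
difference = sum_a - sum_b  # 256.0, one unit in the last place near 2**60
```

In the second set, the smallest piece is lost when it is added to the middle piece, which turns the final addition into an exact tie that rounds the other way.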

The solution implemented here is to make the set of floating point values
generated from the integer vector independent of the size of the
vector and definition of implicit exponents. Effectively, a new
integer vector is generated from the original in which each component
integer has a fixed number of significant digits (e.g.,
digits(1.0_r8)), and the floating point values are generated using
this vector. As the number and value of the significant digits are
unchanged between the two representations, the implied value is
unchanged, but any rounding error in the summation from using the new
representation will be reproducible.
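
A minimal Python sketch of this idea follows, under stated assumptions: the value is non-negative, Python integers stand in for the integer vector, 53 is used for digits(1.0_r8), and all function names are hypothetical. It shows that two integer vectors with different level widths yield the same list of floating point summands once the digits are regrouped into fixed-width chunks.

```python
import math
from fractions import Fraction

DIGITS = 53  # significand bits of an IEEE double (assumed value of digits(1.0_r8))


def int_to_vector(value, bits_per_level):
    """Split a non-negative integer into little-endian levels of fixed width."""
    mask = (1 << bits_per_level) - 1
    vec = []
    while value:
        vec.append(value & mask)
        value >>= bits_per_level
    return vec


def vector_to_int(vec, bits_per_level):
    """Recombine the levels into one exact integer."""
    return sum(v << (i * bits_per_level) for i, v in enumerate(vec))


def canonical_summands(value, digits=DIGITS):
    """Regroup the binary digits into chunks of `digits` bits, largest first."""
    summands = []
    nbits = value.bit_length()
    while nbits > 0:
        shift = max(nbits - digits, 0)
        chunk = value >> shift  # fewer than 2**digits, so exact as a double
        summands.append(math.ldexp(float(chunk), shift))
        value -= chunk << shift
        nbits = shift
    return summands


value = (1 << 149) + (12345678901234567890 << 40) + 987654321
vec20 = int_to_vector(value, 20)  # stands in for one vector layout
vec40 = int_to_vector(value, 40)  # stands in for a different layout
from_20 = canonical_summands(vector_to_int(vec20, 20))
from_40 = canonical_summands(vector_to_int(vec40, 40))
```

Because `from_20` and `from_40` are identical lists, any rounding in their subsequent summation is identical as well.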

Since digits(1.0_r8) would be too many digits for a 4-byte integer
representation of the vector, and since we want to continue to support
this option, each new floating point value is actually constructed from
some number of floating point values with possibly fewer significant
digits, but in such a way that the summation of these is exact (no
rounding error possible), and is identical to the floating point
value that would be generated if the integer vector had been modified
as indicated in the previous paragraph.

Note that another way to address this issue is to further
decrease the likelihood of rounding errors by calculating this final
sum of floating point values using the DDPDD algorithm locally. (DDPDD
is an alternative algorithm already available in the reprosum code.)
DDPDD uses a two-floating-point-value representation of the
intermediate values in the summation to approximately double the
precision, i.e., for double precision values, summing in quad
precision. In this case, round-off error is isolated to the least
significant digit in the quad precision value, and it is even less
likely that this will impact the least significant digit in the
resulting double precision value (though not impossible).
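
The two-value representation can be sketched in Python as follows. This is a generic double-double accumulation built on Knuth's TwoSum, offered as an assumption-laden illustration of the technique, not the DDPDD routine in the reprosum code; the function names are hypothetical.

```python
def two_sum(a, b):
    """Knuth's TwoSum: return (s, err) with s = fl(a + b) and s + err == a + b."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err


def dd_sum(values):
    """Accumulate in a (hi, lo) pair, roughly doubling the working precision."""
    hi, lo = 0.0, 0.0
    for x in values:
        s, err = two_sum(hi, x)
        err += lo
        # Renormalize so that hi carries the leading digits and lo the remainder.
        hi = s + err
        lo = err - (hi - s)
    return hi + lo


def naive_sum(values):
    """Ordinary left-to-right double precision sum, for comparison."""
    total = 0.0
    for x in values:
        total += x
    return total


vals = [2.0**-45, 2.0**30 + 2.0**7, 2.0**60]
compensated = dd_sum(vals)  # 2**60 + 2**30 + 2**8, the correctly rounded sum
plain = naive_sum(vals)     # 2**60 + 2**30, off by one unit in the last place
```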

This 'local' DDPDD approach was implemented but is not included in
this commit, because the proposed algorithm is always reproducible
rather than almost always. However, in E3SM experiments the two
approaches are BFB with each other, implying that the proposed
algorithm improves accuracy as well. In similar experiments, the
proposed algorithm is also BFB with the existing alternative
'distributed' DDPDD algorithm.

As implied, both the proposed algorithm and the two DDPDD approaches
are not BFB with the current default integer vector algorithm in E3SM
simulations. The differences arise from differing round-off in the
least significant digit, but this does accumulate over time. For
example, for a 5 day run on Chrysalis using

--compset WCYCL1850 --res ne30pg2_EC30to60E2r2 --pecount XS

(675 ATM processes) there are 3669 reprosum calls, all in ATM, and
using the proposed algorithm (or 'local' DDPDD or 'distributed' DDPDD)
for the final sum changes the result 200 times (about 5.5% of the calls).

Looking at the actual difference (printing out the mantissas in the
final sums in octal), it is always +/- 1 in the least significant
binary digit. But this is still enough to perturb the model output,
e.g. (from the atm.log)

nstep, te 241   0.26199582846629281E+10   0.26199596541672482E+10   0.75730462729090879E-04   0.98528834650276956E+05

vs.

nstep, te 241   0.26199694926722941E+10   0.26199707204570022E+10   0.67893797509715502E-04   0.98528681087863835E+05
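
A comparison of mantissas like the one described above could be done along the following lines in Python. This is only a sketch with hypothetical helper names; the original diagnostic was presumably written in Fortran inside the model, and the sample values here are not taken from the logs.

```python
import struct


def float_bits(x):
    """Return the 64-bit IEEE pattern of a double as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def mantissa_octal(x):
    """Octal string of the 52 stored mantissa bits."""
    return oct(float_bits(x) & ((1 << 52) - 1))


def ulp_distance(a, b):
    """Number of representable doubles between a and b.

    Valid for finite values of the same sign.
    """
    return abs(float_bits(a) - float_bits(b))


a = 1.0
b = 1.0 + 2.0**-52  # next representable double above 1.0
distance = ulp_distance(a, b)  # 1: the values differ in the last binary digit
```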

Note that the proposed algorithm for the
conversion of the integer vector into a floating point value is
a little more expensive than the original algorithm, but the
performance difference is in the noise as most of the cost of
shr_reprosum_calc is in the common aspects of the current and proposed
algorithms, i.e., determination of global max/min and number of
summands, conversion of each floating point summand into the
vector of integers representation, and the local and global sums
of the integer vectors. For example, performance comparisons between
the new and old approaches for calls to shr_reprosum_calc are
inconclusive as to which is faster. Also note that the 'local' DDPDD
algorithm is slightly more expensive than the proposed algorithm, but
the difference is again in the noise.

[Non-BFB]
The r8 values that are generated from the integer vector representation
and summed locally to produce the global sum are held in the
vector summand_vector. The integer*8 integer vector, i8_gsum_level, is
dimensioned as (-(extra_levels-1):max_level), so it has length
(max_level+extra_levels). Exactly digits(1.0_r8) digits are extracted
from the integer vector representation for each r8 value, so at most

((max_level+extra_levels)*(digits(1_i8)))/digits(1.0_r8) + 1

r8 values are generated, which is bounded from above by

(max_level+extra_levels)*(1 + (digits(1_i8)/digits(1.0_r8)))

and this is how summand_vector should be dimensioned.
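
The bound can be checked numerically with a short Python sketch. It assumes the typical values digits(1_i8) = 63 and digits(1.0_r8) = 53 and uses integer (truncating) division as in the Fortran expressions; `n_levels` stands for max_level plus the number of extra levels, and the function names are hypothetical.

```python
I8_DIGITS = 63  # assumed digits(1_i8) for an 8-byte integer
R8_DIGITS = 53  # assumed digits(1.0_r8) for an IEEE double


def summand_count(n_levels):
    """Maximum number of r8 values extracted from n_levels integer levels."""
    return (n_levels * I8_DIGITS) // R8_DIGITS + 1


def summand_bound(n_levels):
    """Upper bound used to dimension the summand vector."""
    return n_levels * (1 + I8_DIGITS // R8_DIGITS)


# The bound holds for every vector length tried here.
bound_holds = all(summand_count(n) <= summand_bound(n) for n in range(1, 1000))
```

With these digit counts the integer quotient 63/53 is 1, so the bound reduces to twice the length of the integer vector.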

[BFB]
@cbegeman
Collaborator Author

This PR will be opened in E3SM after the clean-up W&D PR E3SM-Project#5418 is merged.

@cbegeman
Collaborator Author

@sbrus89 demonstrated that the ramp feature improves the convergence of the W&D solution for the parabolic bowl test case. The relevant results for this PR are the "standard" lines:
[Figure: W&D convergence results for the parabolic bowl test case]

I have also found that this feature is necessary to allow W&D in ice shelf cavities (specifically the wetting phase).

@cbegeman cbegeman requested a review from sbrus89 March 31, 2023 22:43
cbegeman and others added 5 commits April 3, 2023 10:16
…Project#5508)

This PR changes the condition for a higher initial value of soil moisture so that all FATES runs, not just FATES-Hydro, will be initialized with higher soil moisture.
This is to improve the establishment of forest in bare ground simulations.
See FATES issue here NGEET/fates#994 and discussion here NGEET/fates#985.

[non-BFB] FATES only
…ct#5464)

New ocean variables for floating land ice

This PR adds new variables to represent floating land ice. This is a
necessary step toward coupling with an ice sheet model component (MALI).
* landIceMask and landIceFraction (existing variables) now represent
  both grounded and floating land ice.
* landIceFloatingMask and landIceFloatingFraction (new variables)
  represent only floating land ice.
The landIceFloating* variables are used to modulate land ice fluxes,
whereas the landIce* variables are used to mask out certain fields, as
before. landIce* variables also modulate top drag, as we want to include
the effect of top drag on thin film regions to slow flow.

[NML]
[non-BFB]
@cbegeman
Collaborator Author

cbegeman commented Apr 5, 2023

@sbrus89 Would you like to offer any review suggestions now or should I migrate this to E3SM?

Collaborator

@sbrus89 sbrus89 left a comment

@cbegeman, this looks ready to move over to E3SM. Thanks for adding this feature!

amametjanov and others added 8 commits April 6, 2023 11:43
…oject#5560)

Eliminate potential nonreproducibility in final local sum

The default reprosum algorithm converts each floating point summand
into a vector of integers representation. These vectors are summed
locally first, and then globally (using an mpi_allreduce). All of this
is exact and reproducible. The final step is to convert the final sum,
in the vector of integers representation, back into a floating point
value.

This final sum begins by converting each element of the integer vector
into a floating point value, and then adding these, smallest to
largest. Round off can occur in the least significant digit of the
final result, and whether round off occurs is sensitive to the
particular floating point summands. The numbers of MPI tasks and
OpenMP threads can affect the length of the integer vector, and thus
the floating point summands.
In consequence, reproducibility with respect to degree of parallelism
is not guaranteed. (The situation in which reproducibility is lost
has never been observed in E3SM simulations, but it cannot be
ruled out.)

This PR modifies the final summation so that the same set of floating
point summands is generated independent of the number of MPI tasks
and number of OpenMP threads, establishing reproducibility.

Fixes E3SM-Project#5276

[non-BFB]
Add smaller PE-layouts for tests on Anvil to get faster queue
throughput.

[NML] - in cime_pes namelists
[non-BFB] - in mpaso.hist.am.globalStats outputs
This PR adds ifdef blocks to expose climate tuning parameters for SAM++. It is preferable to expose these as namelist variables, but in this case we don't expect SAM++ to be used for many coupled climate experiments due to the unsophisticated microphysics and prescribed aerosols. Also, exposing namelist values will require a lot more changes, so ifdefs will simplify getting the MMF coupled experiments running again.

[BFB]
…3SM-Project#5418)

Clean-up ocean wetting and drying routine

Clean-up the wetting and drying routine in MPAS-Ocean and make a few
bugfixes in that routine. This routine is not used in E3SM and only
impacts the standalone model.

[BFB]
Fix domain file name for oQU240wLI

The domain for ocn/ice grid oQU240wLI was pointing to a non-existent
file. This replaces it with the correct file.

Fixes E3SM-Project#5579

[BFB] for all currently tested configurations
@cbegeman cbegeman force-pushed the ocn/add-wetting-drying-ramp-feature branch from b33e90f to 38968b3 Compare April 6, 2023 20:41
@cbegeman
Collaborator Author

cbegeman commented Apr 6, 2023

Migrated to E3SM-Project#5590

@cbegeman cbegeman closed this Apr 6, 2023