Add ocean wetting-and-drying ramp feature #46
The reprosum algorithm converts each floating point summand into a vector of integers representation. These vectors are summed locally first, and then globally (using an mpi_allreduce). All of this is exact and reproducible. The final step is to convert the final sum, in the vector of integers representation, back into a floating point value. For this last step, each component integer in the vector representation is converted back into a floating point value (call it a_{i} for the ith element of the vector), and then these values are summed locally, smallest to largest, to get the final value. The integer representation is first massaged so that all a_{i} have the same sign and the ranges of values usually do not overlap (the least significant binary digit in abs(a_{i}) represents a value larger than that represented by abs(a_{i+1}) in most cases, but is sometimes equal to it).

When constructing the final floating point value, round-off error can be introduced in the least significant digit of the final sum. This level of inaccuracy has no practical impact on overall model accuracy, but it can potentially lead to loss of reproducibility with respect to the numbers of MPI tasks and OpenMP threads. The issue is that the set of floating point values generated from the vector of integers will vary with the number of components in the integer vector and the implicit exponents associated with each component, which are functions of, among other things, the number of MPI tasks and number of OpenMP threads. Any round-off error will be sensitive to the specific floating point values being summed, even though the sum would be invariant in exact arithmetic. This nonreproducibility has been observed when changing an 8-byte integer vector representation to a 4-byte integer vector representation (which also changes the vector length and associated implicit exponents), but only in stand-alone contrived examples and never in E3SM. However, we cannot rule out nonreproducibility occurring in practice (without a more detailed analysis of the range of sums likely to occur in E3SM simulations).

The solution implemented here is to make the set of floating point values generated from the integer vector independent of the size of the vector and the definition of the implicit exponents. Effectively, a new integer vector is generated from the original in which each component integer has a fixed number of significant digits (e.g., digits(1.0_r8)), and the floating point values are generated using this vector. As the number and value of the significant digits are unchanged between the two representations, the implied value is unchanged, but any rounding error in the summation from using the new representation will be reproducible.

Since digits(1.0_r8) would be too many digits for a 4-byte integer representation of the vector, and since we want to continue to support this option, each new floating point value is actually constructed from some number of floating point values with possibly fewer significant digits, but in such a way that the summation of these is exact (no rounding error possible), and is identical to the floating point value that would be generated if the integer vector had been modified as indicated in the previous paragraph.

Note that another possibility to address this issue is to further decrease the likelihood of rounding errors by calculating this final sum of floating point values using the DDPDD algorithm locally. (DDPDD is an alternative algorithm already available in the reprosum code.) DDPDD uses a two-floating-point-value representation of the intermediate values in the summation to approximately double the precision, i.e., for double precision values, summing in quad precision.
In this case, round-off error is isolated to the least significant digit in the quad precision value, and it is even less likely that this will impact the least significant digit in the resulting double precision value (though not impossible). This 'local' DDPDD approach was implemented, but it is not included in this commit because the proposed algorithm is preferred: it is always reproducible, not just almost always. However, comparing the two approaches in E3SM experiments, they are BFB, implying that the proposed algorithm improves accuracy as well. In similar experiments, the proposed algorithm is also BFB with the existing alternative 'distributed' DDPDD algorithm.

As implied, neither the proposed algorithm nor the two DDPDD approaches are BFB with the current default integer vector algorithm in E3SM simulations. The differences arise from differing round-off in the least significant digit, but this does accumulate over time. For example, for a 5 day run on Chrysalis using --compset WCYCL1850 --res ne30pg2_EC30to60E2r2 --pecount XS (675 ATM processes) there are 3669 reprosum calls, all in ATM, and using the proposed algorithm (or 'local' DDPDD or 'distributed' DDPDD) for the final sum changes the result 200 times (so 5.5% of the time). Looking at the actual difference (printing out the mantissas in the final sums in octal), it is always +/- 1 in the least significant binary digit. But this is still enough to perturb the model output, e.g. (from the atm.log)

nstep, te 241 0.26199582846629281E+10 0.26199596541672482E+10 0.75730462729090879E-04 0.98528834650276956E+05

vs.

nstep, te 241 0.26199694926722941E+10 0.26199707204570022E+10 0.67893797509715502E-04 0.98528681087863835E+05

Note that the proposed algorithm for converting the integer vector into a floating point value is a little more expensive than the original algorithm, but the performance difference is in the noise, as most of the cost of shr_reprosum_calc is in the aspects common to the current and proposed algorithms, i.e., determination of the global max/min and number of summands, conversion of each floating point summand into the vector of integers representation, and the local and global sums of the integer vectors. For example, performance comparisons between the new and old approaches for calls to shr_reprosum_calc are inconclusive as to which is faster. Also note that the 'local' DDPDD algorithm is slightly more expensive than the proposed algorithm, but the difference is again in the noise.

[Non-BFB]
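As a rough illustration of the commit message above, the following Python sketch shows how the smallest-to-largest float sum can depend on how an exact integer sum is split into components, and how re-chunking into fixed 53-digit pieces removes that dependence. It is not the shr_reprosum code: `to_limbs`, `naive_sum`, and `fixed_digit_sum` are hypothetical names, the 24- and 48-bit component widths are chosen only for the demonstration, and the re-chunking leans on Python's arbitrary-precision integers rather than the i8/i4 manipulations of the real implementation.

```python
import math

R8_DIGITS = 53  # significant binary digits of an IEEE double, i.e. digits(1.0_r8)


def to_limbs(value, limb_bits):
    """Split a non-negative integer into components, least significant first."""
    mask = (1 << limb_bits) - 1
    limbs = []
    while value:
        limbs.append(value & mask)
        value >>= limb_bits
    return limbs or [0]


def naive_sum(limbs, limb_bits):
    """Convert each component to a float and add them smallest to largest.

    Each conversion is exact here (limb_bits <= 53), but the running sum can
    round at intermediate steps, and where it rounds depends on limb_bits.
    """
    total = 0.0
    for i, limb in enumerate(limbs):
        total += math.ldexp(float(limb), i * limb_bits)
    return total


def fixed_digit_sum(limbs, limb_bits, digits=R8_DIGITS):
    """Re-chunk into `digits`-bit pieces counted from the leading bit, then sum.

    The pieces depend only on the represented value, not on limb_bits, so any
    rounding in the final sum is the same for every component layout.
    """
    value = sum(limb << (i * limb_bits) for i, limb in enumerate(limbs))
    pieces = []
    hi = value.bit_length()
    while hi > 0:
        lo = max(hi - digits, 0)
        chunk = (value >> lo) & ((1 << (hi - lo)) - 1)
        pieces.append(math.ldexp(float(chunk), lo))  # exact: chunk has <= 53 bits
        hi = lo
    total = 0.0
    for piece in reversed(pieces):  # smallest to largest
        total += piece
    return total


# A value constructed so that the 24-bit layout rounds twice (double rounding)
# while the 48-bit layout rounds once.
value = (1 << 95) + (1 << 71) + (1 << 43) + ((1 << 42) - (1 << 19)) + (1 << 18) + 1

naive_24 = naive_sum(to_limbs(value, 24), 24)
naive_48 = naive_sum(to_limbs(value, 48), 48)
fixed_24 = fixed_digit_sum(to_limbs(value, 24), 24)
fixed_48 = fixed_digit_sum(to_limbs(value, 48), 48)

print(naive_24 == naive_48)  # False: differs in the last binary digit
print(fixed_24 == fixed_48)  # True: same pieces, same rounding
```

The two naive results differ by one unit in the last place, which matches the +/- 1 difference in the least significant binary digit reported above. The fixed-digit version agrees across layouts by construction, because the float summands are derived from the represented value alone.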
The r8 values generated from the integer vector representation, which are summed locally to generate the global sum, are held in the vector summand_vector. The integer*8 vector i8_gsum_level is dimensioned as (-(extra_levels-1):max_level), so it has length (max_level+extra_levels). Exactly digits(1.0_r8) digits are extracted from the integer vector representation for each r8 value, so at most ((max_level+extra_levels)*(digits(1_i8)))/digits(1.0_r8) + 1 r8 values are generated, which is bounded from above by (max_level+extra_levels)*(1 + (digits(1_i8)/digits(1.0_r8))), and this is how summand_vector should be dimensioned. [BFB]
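The dimensioning argument can be checked with a few lines of arithmetic. The sketch below assumes the usual values digits(1_i8) = 63 and digits(1.0_r8) = 53 and uses integer (truncating) division as Fortran would; the function names are hypothetical and `n_levels` stands for max_level+extra_levels.

```python
I8_DIGITS = 63  # assumed value of digits(1_i8) for a 64-bit integer
R8_DIGITS = 53  # assumed value of digits(1.0_r8) for an IEEE double


def max_summands(n_levels):
    """Largest number of r8 values extracted from n_levels i8 components."""
    return (n_levels * I8_DIGITS) // R8_DIGITS + 1


def summand_vector_size(n_levels):
    """Upper bound used to dimension summand_vector."""
    return n_levels * (1 + I8_DIGITS // R8_DIGITS)


for n_levels in (1, 2, 4, 8):
    print(n_levels, max_summands(n_levels), summand_vector_size(n_levels))
```

With these digit counts the bound reduces to twice the number of levels, which is never smaller than the number of r8 values actually generated.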
This PR will be opened in E3SM after the clean-up W&D PR E3SM-Project#5418 is merged.
@sbrus89 demonstrated that the ramp feature improves the convergence of the W&D solution for the parabolic bowl test case. The relevant results for this PR are the "standard" lines.

I have also found that this feature is necessary to allow W&D in ice shelf cavities (specifically the wetting phase).
The new initial conditions have:
* `landIceFloatingMask=landIceMask`
* `landIceFloatingFraction=landIceFraction`
We want `config_land_ice_flux_mode` to be `off` for most meshes but to be one of:
* `pressure_only` (default)
* `standalone` if `$OCN_ISMF = internal`
* `coupled` if `$OCN_ISMF = coupled`
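One way to read the selection rule above is sketched below in Python. This is only an interpretation: the comment does not say which meshes are the exception to `off`, so `mesh_uses_land_ice_fluxes` is a placeholder flag, and the function name is hypothetical rather than part of the E3SM build scripts.

```python
def land_ice_flux_mode(mesh_uses_land_ice_fluxes, ocn_ismf=None):
    """Pick a value for config_land_ice_flux_mode.

    mesh_uses_land_ice_fluxes: placeholder for whatever distinguishes the
        meshes that should not be `off` (not specified in the comment).
    ocn_ismf: value of $OCN_ISMF, if set.
    """
    if not mesh_uses_land_ice_fluxes:
        return "off"
    if ocn_ismf == "internal":
        return "standalone"
    if ocn_ismf == "coupled":
        return "coupled"
    return "pressure_only"  # default for the remaining meshes


print(land_ice_flux_mode(False))
print(land_ice_flux_mode(True))
print(land_ice_flux_mode(True, "internal"))
print(land_ice_flux_mode(True, "coupled"))
```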
…Project#5508) This PR changes the condition for a higher initial value of soil moisture so that all FATES runs, not just FATES-Hydro, will be initialized with higher soil moisture. This is to improve the establishment of forest in bare ground simulations. See FATES issue here NGEET/fates#994 and discussion here NGEET/fates#985. [non-BFB] FATES only
…ct#5464)

New ocean variables for floating land ice

This PR adds new variables to represent floating land ice. This is a necessary step toward coupling with an ice sheet model component (MALI).
* landIceMask and landIceFraction (existing variables) now represent both grounded and floating land ice.
* landIceFloatingMask and landIceFloatingFraction (new variables) represent only floating land ice.

The landIceFloating* variables are used to modulate land ice fluxes, whereas the landIce* variables are used to mask out certain fields, as before. landIce* variables also modulate top drag, as we want to include the effect of top drag on thin film regions to slow flow.

[NML]
[non-BFB]
@sbrus89 Would you like to offer any review suggestions now or should I migrate this to E3SM?
@cbegeman, this looks ready to move over to E3SM. Thanks for adding this feature!
…oject#5560)

Eliminate potential nonreproducibility in final local sum

The default reprosum algorithm converts each floating point summand into a vector of integers representation. These vectors are summed locally first, and then globally (using an mpi_allreduce). All of this is exact and reproducible.

The final step is to convert the final sum, in the vector of integers representation, back into a floating point value. This final sum begins by converting each element of the integer vector into a floating point value, and then adding these, smallest to largest. Round off can occur in the least significant digit of the final result, and whether round off occurs is sensitive to the particular floating point summands. The numbers of MPI tasks and OpenMP threads can affect the length of the integer vector, and thus the floating point summands. In consequence, reproducibility with respect to degree of parallelism is not guaranteed. (The situation in which reproducibility is lost has never been observed in E3SM simulations, but it cannot be ruled out.)

This PR modifies the final summation so that the same set of floating point summands is generated independent of the number of MPI tasks and number of OpenMP threads, establishing reproducibility.

Fixes E3SM-Project#5276

[non-BFB]
Add smaller PE-layouts for tests on Anvil to get faster queue throughput. [NML] - in cime_pes namelists [non-BFB] - in mpaso.hist.am.globalStats outputs
This PR adds ifdef blocks to expose climate tuning parameters for SAM++. It is preferable to expose these as namelist variables, but in this case we don't expect SAM++ to be used for many coupled climate experiments due to the unsophisticated microphysics and prescribed aerosols. Also, exposing namelist values will require a lot more changes, so ifdefs will simplify getting the MMF coupled experiments running again. [BFB]
…3SM-Project#5418)

Clean-up ocean wetting and drying routine

Clean up the wetting and drying routine in MPAS-Ocean and make a few bugfixes in that routine. This routine is not used in E3SM and only impacts the standalone model.

[BFB]
Fix domain file name for oQU240wLI

The domain for ocn/ice grid oQU240wLI was pointing to a non-existent file. This replaces it with the correct file.

Fixes E3SM-Project#5579

[BFB] for all currently tested configurations
b33e90f to 38968b3
Migrated to E3SM-Project#5590
This PR enhances the existing wetting-and-drying algorithm using an approach from O'Dea et al. 2020 (doi:10.1016/j.ocemod.2020.101708). Instead of zeroing out `normalVelocity` and `normalVelocity` tendencies in cells where the projected `layerThickness` drops below the user-defined minimum, this method ramps down the `normalVelocity` and its tendencies between a range of `layerThickness`es.

[BFB] [stealth]
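To make the difference between zeroing and ramping concrete, here is a minimal Python sketch. It assumes a linear ramp between two thickness bounds; the actual ramp shape used in MPAS-Ocean is not stated in this description, and `ramp_factor`, `apply_ramp`, `h_ramp_min`, and `h_ramp_max` are hypothetical names chosen for illustration.

```python
def ramp_factor(layer_thickness, h_ramp_min, h_ramp_max):
    """Scaling factor in [0, 1] for velocity and its tendency.

    Assumed linear ramp: 0 at or below h_ramp_min, 1 at or above h_ramp_max.
    """
    if h_ramp_max <= h_ramp_min:
        raise ValueError("h_ramp_max must exceed h_ramp_min")
    frac = (layer_thickness - h_ramp_min) / (h_ramp_max - h_ramp_min)
    return min(1.0, max(0.0, frac))


def apply_ramp(normal_velocity, tendency, projected_thickness,
               h_ramp_min, h_ramp_max):
    """Scale normalVelocity and its tendency by the same ramp factor."""
    factor = ramp_factor(projected_thickness, h_ramp_min, h_ramp_max)
    return factor * normal_velocity, factor * tendency


# Thin layer: fully damped; mid-range: partially damped; thick: untouched.
for h in (0.5, 1.5, 3.0):
    print(h, apply_ramp(2.0, 4.0, h, h_ramp_min=1.0, h_ramp_max=2.0))
```

Setting the two bounds very close together approaches the original on/off behaviour, while a wider range gives a smoother transition as a layer dries or re-wets.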