Quantile methods #588

hkershaw-brown · 2023-12-05T18:57:39Z

Description:

Adds a Quantile-Conserving Ensemble Filtering Framework to DART
Publications: QCEFF part1 QCEFF part 2.

The default QCEFF options are EAKF, normal distribution (no bounds).

This is a breaking change and will be v11 of DART
Namelist item removed from filter_nml

filter_kind - now a per qty option

Two new required namelists (added to input.nml files):

&probit_transform_nml
&algorithm_info_nml

assim_tools_mod

sort_obs_inc namelist option applied to ENKF only, so default is now .true.
spread_restoration is not supported in this version

algorithm_info_mod

runtime QCEFF options read from .csv or .txt file

New probability distribution modules:

beta_distribution_mod
bnrh_distribution_mod (bounded normal rank histogram)
gamma_distribution_mod
normal_distribution_mod
probit_transform_mod
distribution_params_mod

Update to lorenz_96_tracer_advection:

positive_tracer
default mean velocity = 0)
perturbation
real(r8) not real(i8) see NAG compile findings #615

1d forward operator updates:

For non-integer powers, fix up values for negative bases

Docs:

index.rst: Nonlinear and Non-Gaussian Data Assimilation Capabilities in DART
QCEFF instructions: Quantile-Conserving Ensemble Filter Framework
Example to work through: QCEFF: Examples with the Lorenz 96 Tracer Model

Fixes issue

review notes #458
Note this pull request includes the fix for #416

Todo:

continuing existing experiments Feature Request: Using the QCF filter with existing assimilation experiments #477
Remove models/lorenz_96/work/README (readme file for papers)
remove qcf warning before merging into main: qceff_probit.rst, qceff-examples.rst, index.rst
change log add contributors (Chris Riedel)
~~After merge, create 'quantile_methods' tag (with a warning) so old readthedocs links point to docs.~~ No, edit release notes not to point at quantile_methods.
change create_obs_sequence input file for lorenz_96 tracer?
developer_test from reading table algorithm_info_nml

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update

Documentation changes needed?

My change requires a change to the documentation.
- I have updated the documentation accordingly.

Tests

Please describe any tests you ran to verify your changes.
bitwise lorenz_96 with main for default qcf options
Jeff has run many algorithm tests.
Speed tests main vs. default qcf options Derecho

Timing:

main vs. qceff default Derecho
bhrf vs main Cheyenne

Checklist for merging

Updated changelog entry
⚠️ ⚠️ ⚠️ ⚠️ 🧇
Documentation updated (remove WARNING)
- qceff_probit.rst remove 'alpha release stage' ⚠️
- qceff-examples.rst remove 'alpha release stage' ⚠️
- index.rst remove WARNING ⚠️
Update conf.py

Checklist for release

Merge into main
Create release from the main branch with appropriate tag
Delete feature-branch

Testing Datasets

Dataset needed for testing available upon request
Dataset download instructions included
No dataset needed

… values of the info routine.

…ensemble members are identical (sd is 0). The inflation, regression, and obs_increments are already bullet-proofed for the 0 sd case. In the long-term, need to worry about what happens if sd is very small as opposed to formally zero.

… and source.

Merge branch 'main' of https://github.com/NCAR/DART

i8 to i4 conversion for gfortran in model_mod. See issue #352 for more details on various argument mismatches we will need to fix for newer compilers

cice-scm2 quantile methods

…form of current observation. This code now reproduces across processor count whereas it crashed with more than on process before. Also changed the default parameters for the tracer flow part of the model and the equation for finding the velocity.

… Fixed bug in setting the filter type in algorithm_info.

…a numerical error if the input state and the two smallest ensemble members were identical but not all ensemble members were identical. Pretty rare situation in most applications.

Moved the normal distribution routines to their own module.

Newton search in gamma_distribution_mod to match the revised version in the beta mod, and added a log_normal capability in quantile_distribution_mod

…d a beta_distribution module that was developed by Chris Riedel. Added in testing code to the gamma, beta and normal distributions modules and made the algorithms more efficient. Added a higher accuracy inverse cdf for the normal module but it is switched off for now. It is much more expensive in return for improved accuracy.

…lemented yet.

Bitwise duplicates previous version.

… by 0. This also addresses the cases where ensemble members in the RH dupliate the bounds or each other in the interior. This bitwise reproduces previous version for cases that do not have duplicates but has not been tested for duplicates.

…model to include a fixed sink so that the concentration goes to 0 remote from the source. This mimics the kind of issues that are found with sea ice freezing entirely. Tests of this have run successfully with a full suite of RH methods.

…oth upper and lower bounds but were more serious for upper bound cases. This version works for long tests with a variety of filter settings for both positive and negative tracers (bounded below and above) with l96_tracer, but still needs much more comprehensive review and error checking.

a namelist parameter to turn the use of algorithm_info_mod on or off. If use_algorithm_info_mod is false, default namelist parameters control the filter as in previous versions.

…ormal in quantile_distribution_mod to try to catch errors Molly is reporint with CICE. Added use_algorithm_info namelist into perfect_model_obs and filter_mod to control whether default namelist is used or algorithm_info_mod. Have now set the default to .true. since this branch is supposed to be testing these algorithm_info_mod capabilities.

…e that this revealed the fact that gamma gamma products are not closed. If the sum of the shape parameters of the prior and the likelihood is <= 1, the posterior is not a gamma. An error message catches this, but it makes gammas a poor choice for quantities that can get close to (or exactly to) the bounds.

…rameter to flip the tracer from positive to negative to check assimilation code for bounded above quantities.

release demo.

with l96 tracer model. Removed sampling_error_correction from the input.nml to avoid the need for copying over the sampling error table file to the work directory. Sampling error correction is useful here but it's not essential. Added the readme.rst even though it's not yet rst to enable friendly testers to get started.

only typo changes, no other changes to text

johnsonbk

I fixed the typos mentioned in the comments of this review in PR six twelve. Approving the six twelve merge will reconcile the typos and will add substantive edits to index.rst.

I added the substantive edits to index.rst because the quantile methods merge adds a lot of new information to the documentation that DART users may be unfamiliar with. It may be easier to read by following a standard practice in technical writing: using topic and stress positions in sentences. The topic position in a sentence contains information that a user already knows, or is presumed to know, while the stress position introduces new information.

After running through the four examples in guide/qceff-examples.rst and plotting their output with diagnostics/matlab/plot_ens_time_series.m, my conclusion is that

Further exposition to the QCEFF examples should be added, or
The examples should be modified to make the benefits of the new framework more evident

johnsonbk · 2024-01-04T20:08:50Z

guide/qceff_probit.rst

+
+  * NORMAL_DISTRIBUTION (default)
+  * BOUNDED_NORMAL_RH_DISTRIBUTION
+  * GAMMA_DISTRIBUTION 


Does this still work when the sum of alpha and beta < 1?

guide/qceff_probit.rst

guide/qceff-examples.rst

guide/qceff_probit.rst

mjs2369 · 2024-01-04T23:52:51Z

These comments are now incorrect and should be removed since we are once again dividing by sqrt(2 PI)

https://github.com/NCAR/DART/blob/d5c21646ebef17865cfe779fd68894cbe6a51112/assimilation_code/modules/assimilation/assim_tools_mod.f90#L1701C1-L1702C48

https://github.com/NCAR/DART/blob/d5c21646ebef17865cfe779fd68894cbe6a51112/assimilation_code/modules/assimilation/assim_tools_mod.f90#L1714C1-L1715C48

mjs2369

The &probit_transform_nml and &algorithm_info_nml are missing from models/MITgcm_ocean/work/input.nml

guide/qceff_probit.rst

assimilation_code/modules/assimilation/assim_tools_mod.rst

mjs2369 · 2024-01-05T00:09:10Z

guide/qceff-examples.rst

+.. code-block:: text
+
+   git checkout quantile_methods
+


These lines above (7-15) shoud be removed or edited before release. The tools will no longer be in the alpha release stage and the user will not have to checkout the quantile_methods branch

The Quantile-Conserving Ensemble Filter Framework (QCEFF) tools are in the alpha release stage. The DART development team (dart@ucar.edu) would be happy to hear about your experiences and is anxious to build scientific collaborations using these new capabilities. To get started, make sure that you are on the quantile_methods branch of DART: .. code-block:: text git checkout quantile_methods

There is another comment like this at the very beginning of guide/qceff_probit.rst

Since this branch is the live documentation, I'll remove the warnings just before merging.
I plopped them in the checklist:

Thanks for the reminder (and future reminders so I/you/we don't miss this)

mjs2369 · 2024-01-05T00:22:08Z

assimilation_code/modules/assimilation/probit_transform_mod.f90

+
+!!!elseif(p%distribution_type == PARTICLE_FILTER_DISTRIBUTION) then
+   !!!call to_probit_particle(ens_size, state_ens, p, probit_ens, use_input_p, &
+       !!!bounded_below, bounded_above, lower_bound, upper_bound)


Are we still planning on supporting distribution_type = PARTICLE_FILTER_DISTRIBUTION in the future?

remove PARTICLE_FILTER_DISTRIBUTION from list of available distributions

DART/guide/qceff_probit.rst

Lines 123 to 132 in d5c2164

Available distributions

------------------------

* NORMAL_DISTRIBUTION (default)

* BOUNDED_NORMAL_RH_DISTRIBUTION

* GAMMA_DISTRIBUTION

* BETA_DISTRIBUTION

* LOG_NORMAL_DISTRIBUTION

* UNIFORM_DISTRIBUTION

* PARTICLE_FILTER_DISTRIBUTION

since it is not available in this code.

@mjs2369 I'm not sure when PARTICLE_FILTER_DISTRIBUTION will be supported,
I believe @jlaucar has experimented/tested with this.
For now I fixed the docs to match the code. Maybe PARTICLE_FILTER_DISTRIBUTION goes in v11.1?

Yep I agree that the PARTICLE_FILTER_DISTRIBUTION can be put into a later release if we want. Good catch with removing it from this release of the documentation.

Should we also remove it from this comment in algorithm_info_mod?

DART/assimilation_code/modules/assimilation/algorithm_info_mod.f90

Line 386 in 9e6b206

! UNIFORM_DISTRIBUTION, and PARTICLE_FILTER_DISTRIBUTION.

@mjs2369 it is not just the comment right? It is the code in algortihm_info_mod also:

DART/assimilation_code/modules/assimilation/algorithm_info_mod.f90

Lines 249 to 250 in 9e6b206

case ('PARTICLE_FILTER_DISTRIBUTION')

qceff_table_data(row)%probit_state%dist_type = PARTICLE_FILTER_DISTRIBUTION

@jlaucar do you want particle_filter_distritbution in this release?

Yes we should remove that as well. I feel like we the code would potentially take a while to be ready for this release and we should probably put it in with a later one

If we do as such, should we also remove the commented out code pertaining to it in probit_transform_mod? Lines 187-9 and the subroutine to_probit_particle starting on line 378

@mjs2369 chatted with Jeff about this.
Conclusion: don't let people select PARTICLE_FILTER_DISTRIBUTION. Leave in the commented subroutine to_probit_particle (if we were being strict about code we would remove this, but this is science). I'll go ahead and remove case ('PARTICLE_FILTER_DISTRIBUTION') algorithm_info_mod.f90. Once again thanks for the detailed review on this.

Cheers,
Helen

BKJ quantile methods edit

Co-authored-by: Marlena Smith <44214771+mjs2369@users.noreply.github.com>

hkershaw-brown · 2024-01-05T14:59:31Z

I fixed the typos mentioned in the comments of this review in PR six twelve. Approving the six twelve merge will reconcile the typos and will add substantive edits to index.rst.

I added the substantive edits to index.rst because the quantile methods merge adds a lot of new information to the documentation that DART users may be unfamiliar with. It may be easier to read by following a standard practice in technical writing: using topic and stress positions in sentences. The topic position in a sentence contains information that a user already knows, or is presumed to know, while the stress position introduces new information.

doc changes merge in from #612 pull request.

#588 (review)

hkershaw-brown · 2024-01-05T15:37:53Z

The &probit_transform_nml and &algorithm_info_nml are missing from models/MITgcm_ocean/work/input.nml

Add these in 8ed6c2e

Add for any model/input.nml that was missing these.

#!/bin/bash
files=$(find . -name input.nml)

for f in $files
do
   if ! grep -q probit_transform_nml  $f; then
     echo probit not found $f
   fi

done

for f in $files
do
  if ! grep -q algorithm_info_nml  $f; then
     echo algorithm not found $f
   fi  
done

…ions see #588 (comment) for notes

braczka

This read and worked well overall -- I agreed with Ben's main comment that the documentation could benefit from more explicit discussion of the improvement that QCEFF can bring to bounded/non-linear quantities. I made a suggestion to add a figure to the introduction, and then maybe more discussion at the end that differentiates between the performance of example A and C vs. example B.

index.rst

guide/qceff_probit.rst

guide/qceff-examples.rst

braczka · 2024-01-05T17:49:14Z

guide/qceff-examples.rst

+#. Examine the output with your favorite tools. Looking at the analysis ensemble 
+   for the tracer_concentration variables with indices near the source (location 1)
+   and far downstream from the source (location 35) is interesting. Note that the
+   source estimation capabilities of the model and filters are not being tested here.


Maybe I am too much in the weeds here, but why was the source quantity not influenced by the assimilation, and the updates restricted to the state(wind) and tracer concentration only? It seems that the source is an unobserved variable -- but maybe its not part of the Lorenz model, but a boundary condition?

guide/qceff_probit.rst

…by sqrt(2PI) in obs_increment_rank_histogram removed 'under development' & revised 2008. #588 (comment)

Co-authored-by: Brett Raczka <bmraczka@ucar.edu>

see comment #588 (comment)

see comment #588 (comment) for disccusion

…pdated Google Sheets template; Adjusting the format when logging the QCEFF table

Following the convenion of !!! to comment out this code see #588 (comment) for discusion

jlaucar and others added 30 commits November 2, 2022 15:21

Added logic as a function of oned quantities to define all the return…

0d20642

… values of the info routine.

Making perturbations consistent with stanard bounds for concentration…

18ba0bd

… and source.

Removed conflict on namelist.

3bad435

Merge branch 'main' of https://github.com/NCAR/DART

Merge branch 'main' into quantile_methods

de4bbaf

Added Molly's version of cice model_mod.

93107d8

local algorithm_info_mod.f90 for cice-scm2

792515c

i8 to i4 conversion for gfortran in model_mod. See issue #352 for more details on various argument mismatches we will need to fix for newer compilers

change error message to match subroutine

cc4943f

default of no bounds for kind - no sure if this is scienfically ok

f0bf36d

Merge pull request #1 from hkershaw-brown/quantile_methods

544f124

cice-scm2 quantile methods

Better default model parameters that lead to strong tracer gradients.…

2405017

… Fixed bug in setting the filter type in algorithm_info.

Corrected subtle bug in bounded normal to probit. This could lead to …

05298f6

…a numerical error if the input state and the two smallest ensemble members were identical but not all ensemble members were identical. Pretty rare situation in most applications.

Added a gamma distribution module and put to and from probit calls.

fc36012

Moved the normal distribution routines to their own module.

Added the required gamma definitions to the base algorithm_info_mod.

0f00d8c

Added Chris Riedel's beta distribution code in new module, revised the

4abfbff

Newton search in gamma_distribution_mod to match the revised version in the beta mod, and added a log_normal capability in quantile_distribution_mod

Adding bounds for beta and gamma. Beta is tested, gamma not fully imp…

f934f09

…lemented yet.

First phase of RH method for dealing with identical state variables.

9cbab85

Bitwise duplicates previous version.

Added ability to deal with identity obs in algorithm_info_mod. Added

2eca71f

a namelist parameter to turn the use of algorithm_info_mod on or off. If use_algorithm_info_mod is false, default namelist parameters control the filter as in previous versions.

Added namelist control for tracer parts of model. Added a namelist pa…

3ccfbfe

…rameter to flip the tracer from positive to negative to check assimilation code for bounded above quantities.

Removed state power information from algorithm_info to avoid clutter for

7fb3b12

release demo.

Merge branch 'main' into quantile_methods

55a11d4

doc: rst formatting and added to table of contents

1107ec2

only typo changes, no other changes to text

Substantive edits to improve the readability of the QCEFF introduction

dd7acb7

johnsonbk reviewed Jan 4, 2024

View reviewed changes

mjs2369 reviewed Jan 5, 2024

View reviewed changes

jlaucar approved these changes Jan 5, 2024

View reviewed changes

hkershaw-brown mentioned this pull request Jan 5, 2024

BKJ quantile methods edit #612

Merged

15 tasks

hkershaw-brown and others added 2 commits January 5, 2024 09:17

Merge pull request #612 from NCAR/bkj_quantile_methods_edit

3773782

BKJ quantile methods edit

fix typos in .rst files

893f53b

Co-authored-by: Marlena Smith <44214771+mjs2369@users.noreply.github.com>

add &probit_transform_nml &algorithm_info_nml to model input.nmls

8ed6c2e

#588 (review)

doc: removed PARTICLE_FILTER_DISTRIBUTION from the available distibut…

114fe1b

…ions see #588 (comment) for notes

braczka reviewed Jan 5, 2024

View reviewed changes

hkershaw-brown and others added 4 commits January 5, 2024 15:41

comments: removed outdated comment, since we are once again dividing …

a62c049

…by sqrt(2PI) in obs_increment_rank_histogram removed 'under development' & revised 2008. #588 (comment)

fix: real(r8) for kind not real(i8) see #615

f21a5a7

doc: add default of -8888 for bounds

aba0acc

Co-authored-by: Brett Raczka <bmraczka@ucar.edu>

doc: example tool to examine output.

9e6b206

Co-authored-by: Brett Raczka <bmraczka@ucar.edu>

This was referenced Jan 8, 2024

NAG compile findings #615

Closed

fix: format statement in assert_mod #618

Merged

hkershaw-brown and others added 8 commits January 8, 2024 14:28

doc: remove other - all state variables are updated

57652b2

see comment #588 (comment)

doc: fix description of 'Observation error information'

76202f3

see comment #588 (comment) for disccusion

doc: qceff part III pdf

11457de

Updating the headers in lorenz 96 tracer example csv files to match u…

20322de

…pdated Google Sheets template; Adjusting the format when logging the QCEFF table

Merge branch 'main' into quantile_methods

94f4354

fix: prevent a user selecting PARTICLE_FILTER_DISTRIBUTION

72599f6

Following the convenion of !!! to comment out this code see #588 (comment) for discusion

bump version, changelog, remove warnings for release

d5d4248

conf.py remove pre-release

28beb9e

hkershaw-brown merged commit 52e6e45 into main Jan 11, 2024
4 checks passed

hkershaw-brown deleted the quantile_methods branch January 11, 2024 20:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Quantile methods #588

Quantile methods #588

hkershaw-brown commented Dec 5, 2023 •

edited

Loading

johnsonbk left a comment

johnsonbk Jan 4, 2024

mjs2369 commented Jan 4, 2024

mjs2369 left a comment

mjs2369 Jan 5, 2024

hkershaw-brown Jan 5, 2024

mjs2369 Jan 5, 2024

hkershaw-brown Jan 5, 2024

hkershaw-brown Jan 5, 2024

mjs2369 Jan 8, 2024

hkershaw-brown Jan 8, 2024

mjs2369 Jan 8, 2024

hkershaw-brown Jan 9, 2024

hkershaw-brown commented Jan 5, 2024

hkershaw-brown commented Jan 5, 2024

braczka left a comment

braczka Jan 5, 2024

	Available distributions
	------------------------

	* NORMAL_DISTRIBUTION (default)
	* BOUNDED_NORMAL_RH_DISTRIBUTION
	* GAMMA_DISTRIBUTION
	* BETA_DISTRIBUTION
	* LOG_NORMAL_DISTRIBUTION
	* UNIFORM_DISTRIBUTION
	* PARTICLE_FILTER_DISTRIBUTION

	case ('PARTICLE_FILTER_DISTRIBUTION')
	qceff_table_data(row)%probit_state%dist_type = PARTICLE_FILTER_DISTRIBUTION

Quantile methods #588

Quantile methods #588

Conversation

hkershaw-brown commented Dec 5, 2023 • edited Loading

Description:

Fixes issue

Todo:

Types of changes

Documentation changes needed?

Tests

Timing:

Checklist for merging

Checklist for release

Testing Datasets

johnsonbk left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mjs2369 commented Jan 4, 2024

mjs2369 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hkershaw-brown commented Jan 5, 2024

hkershaw-brown commented Jan 5, 2024

braczka left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hkershaw-brown commented Dec 5, 2023 •

edited

Loading