Uncorrelation #5

dnlbauer · 2021-07-31T09:09:20Z

This PR implements output for the uncorrelation function as well as an option (-i) to use correlated samples above a given threshold. I tried to make the output as closes as possible to what alchemical analysis does.

Example output for water without energy:

> ./flamel.py -p lambda_ -n dhdl
(...)
Read U_nk from lambda_36.xvg
Read U_nk from lambda_37.xvg
Uncorrelating dH/dl ...
Number of correlated and uncorrelated samples (Method=dHdl):

 State            N          N_k        N/N_k

     0          538          538         1.00
     1          538          538         1.00
     2          538          485         1.11
     3          538          431         1.25
     4          538          351         1.53
     5          538          538         1.00
     6          538          498         1.08
     7          538          492         1.09
     8          538          517         1.04
     9          538          538         1.00
    10          538          499         1.08
    11          538          538         1.00
    12          538          452         1.19
    13          538          538         1.00
    14          538          486         1.11
    15          538          459         1.17
    16          538          319         1.69
    17          538          538         1.00
    18          538          538         1.00
    19          538          440         1.22
    20          538          495         1.09
    21          538          464         1.16
    22          538          538         1.00
    23          538          538         1.00
    24          538          483         1.11
    25          538          520         1.03
    26          538          444         1.21
    27          538          375         1.43
    28          538          405         1.33
    29          538          513         1.05
    30          538          538         1.00
    31          538          461         1.17
    32          538          538         1.00
    33          538          417         1.29
    34          538          491         1.10
    35          538          538         1.00
    36          538          352         1.53
    37          538          471         1.14
Uncorrelating reduced potentials ...
Number of correlated and uncorrelated samples (Method=dHdl):

 State            N          N_k        N/N_k

     0          538          538         1.00
     1          538          538         1.00
     2          538          485         1.11
     3          538          431         1.25
     4          538          351         1.53
     5          538          538         1.00
     6          538          498         1.08
     7          538          492         1.09
     8          538          517         1.04
     9          538          538         1.00
    10          538          499         1.08
    11          538          538         1.00
    12          538          452         1.19
    13          538          538         1.00
    14          538          486         1.11
    15          538          459         1.17
    16          538          319         1.69
    17          538          538         1.00
    18          538          538         1.00
    19          538          440         1.22
    20          538          495         1.09
    21          538          464         1.16
    22          538          538         1.00
    23          538          538         1.00
    24          538          483         1.11
    25          538          520         1.03
    26          538          444         1.21
    27          538          375         1.43
    28          538          405         1.33
    29          538          513         1.05
    30          538          538         1.00
    31          538          461         1.17
    32          538          538         1.00
    33          538          417         1.29
    34          538          491         1.10
    35          538          538         1.00
    36          538          352         1.53
    37          538          471         1.14
TI: -29.154473 +- 0.240598
(...)

With threshold and single estimator:

> ./flamel.py -p lambda_ -n dhdl -i 500 -e TI
(...)
Read dH/dl from lambda_37.xvg
Uncorrelating dH/dl ...
Number of correlated and uncorrelated samples (Method=dHdl):

 State            N          N_k        N/N_k

     0          538          538         1.00
     1          538          538         1.00
     2          538          485         1.11
WARNING: Only 485 uncorrelated samples found at lambda number 2; proceeding with analysis using correlated samples...
     3          538          431         1.25
WARNING: Only 431 uncorrelated samples found at lambda number 3; proceeding with analysis using correlated samples...
     4          538          351         1.53
WARNING: Only 351 uncorrelated samples found at lambda number 4; proceeding with analysis using correlated samples...
     5          538          538         1.00
     6          538          498         1.08
WARNING: Only 498 uncorrelated samples found at lambda number 6; proceeding with analysis using correlated samples...
     7          538          492         1.09
WARNING: Only 492 uncorrelated samples found at lambda number 7; proceeding with analysis using correlated samples...
     8          538          517         1.04
     9          538          538         1.00
    10          538          499         1.08
WARNING: Only 499 uncorrelated samples found at lambda number 10; proceeding with analysis using correlated samples...
    11          538          538         1.00
    12          538          452         1.19
WARNING: Only 452 uncorrelated samples found at lambda number 12; proceeding with analysis using correlated samples...
    13          538          538         1.00
    14          538          486         1.11
WARNING: Only 486 uncorrelated samples found at lambda number 14; proceeding with analysis using correlated samples...
    15          538          459         1.17
WARNING: Only 459 uncorrelated samples found at lambda number 15; proceeding with analysis using correlated samples...
    16          538          319         1.69
WARNING: Only 319 uncorrelated samples found at lambda number 16; proceeding with analysis using correlated samples...
    17          538          538         1.00
    18          538          538         1.00
    19          538          440         1.22
WARNING: Only 440 uncorrelated samples found at lambda number 19; proceeding with analysis using correlated samples...
    20          538          495         1.09
WARNING: Only 495 uncorrelated samples found at lambda number 20; proceeding with analysis using correlated samples...
    21          538          464         1.16
WARNING: Only 464 uncorrelated samples found at lambda number 21; proceeding with analysis using correlated samples...
    22          538          538         1.00
    23          538          538         1.00
    24          538          483         1.11
WARNING: Only 483 uncorrelated samples found at lambda number 24; proceeding with analysis using correlated samples...
    25          538          520         1.03
    26          538          444         1.21
WARNING: Only 444 uncorrelated samples found at lambda number 26; proceeding with analysis using correlated samples...
    27          538          375         1.43
WARNING: Only 375 uncorrelated samples found at lambda number 27; proceeding with analysis using correlated samples...
    28          538          405         1.33
WARNING: Only 405 uncorrelated samples found at lambda number 28; proceeding with analysis using correlated samples...
    29          538          513         1.05
    30          538          538         1.00
    31          538          461         1.17
WARNING: Only 461 uncorrelated samples found at lambda number 31; proceeding with analysis using correlated samples...
    32          538          538         1.00
    33          538          417         1.29
WARNING: Only 417 uncorrelated samples found at lambda number 33; proceeding with analysis using correlated samples...
    34          538          491         1.10
WARNING: Only 491 uncorrelated samples found at lambda number 34; proceeding with analysis using correlated samples...
    35          538          538         1.00
    36          538          352         1.53
WARNING: Only 352 uncorrelated samples found at lambda number 36; proceeding with analysis using correlated samples...
    37          538          471         1.14
WARNING: Only 471 uncorrelated samples found at lambda number 37; proceeding with analysis using correlated samples...
TI: -29.167867 +- 0.229051

(...)

dnlbauer · 2021-07-31T09:10:23Z

flamel.py

@@ -98,14 +99,15 @@ def main():
    parser.add_argument('-q', '--suffix', dest='suffix', help='Suffix for datafile sets, i.e. \'xvg\' (default).', default='xvg')
    parser.add_argument('-e', dest='estimators', type=str, default=None, help="Comma separated Estimator methods")
    parser.add_argument('-n', '--uncorr', dest='uncorr', help='The observable to be used for the autocorrelation analysis; either \'dhdl_all\' (obtained as a sum over all energy components) or \'dhdl\' (obtained as a sum over those energy components that are changing; default) or \'dE\'. In the latter case the energy differences dE_{i,i+1} (dE_{i,i-1} for the last lambda) are used.', default='dhdl')
+    parser.add_argument('-i', '--uncorr_threshold', dest='uncorr_threshold', help='Proceed with correlated samples (N) if the number of uncorrelated samples (N_k) is found to be less than this number. If 0 is given, the time series analysis will not be performed at all. Default: 50.', default=50, type=int)


adapted from alchemical-analysis. -i sounds unintuitive to me, but thats what AA is using.

dnlbauer · 2021-07-31T09:11:54Z

flamel.py

+            print("Uncorrelating reduced potentials ...")
+            u_nks = uncorrelator.uncorrelate(u_nks, args.equiltime)
+
+    # concat data for estimators


I moved the concatenation out of the correlation analysis functions because a threshold of 0 needs to skip correlators but still concatenates for estimators.

dnlbauer · 2021-07-31T09:14:56Z

tagging @orbeckst and @xiki-tempula for attention :)

orbeckst · 2021-08-01T02:26:04Z

Thank you for the contribution. If you haven't gotten a review until, say, Wednesday, please ping us again. I'll try to make time but I also have a million other things to do. Thank you for your understanding.

xiki-tempula

Looking good. A few minor comments. We might need to have PEP8 checking and test eventually.

xiki-tempula · 2021-08-01T21:11:41Z

uncorrelate/statistical_inefficiency_dhdl_all.py

+            uncorrelated_df = alchemlyb.preprocessing.statistical_inefficiency(df, dhdl_sum, lower, conservative=False)
+            N, N_k = len(df), len(uncorrelated_df)
+            g = N/N_k
+            print("%6s %12s %12s %12.2f" % (idx, N, N_k, g))


Might be good to use the str.format() method since we are py3 only.

Good point. I tried to keep the style similar to the rest of the repo which still uses the % syntax but will happily change it.

xiki-tempula · 2021-08-01T21:18:40Z

uncorrelate/statistical_inefficiency_dhdl_all.py

+            g = N/N_k
+            print("%6s %12s %12s %12.2f" % (idx, N, N_k, g))
+            if N_k < self.uncorr_threshold:
+                print("WARNING: Only %d uncorrelated samples found at lambda number %d; proceeding with analysis using correlated samples..." % (N_k, idx))


Might be good to make a base class that does this check to avoid code duplication.

xiki-tempula · 2021-08-01T21:20:27Z

uncorrelate/statistical_inefficiency_dhdl.py

+            uncorrelated_df = alchemlyb.preprocessing.statistical_inefficiency(df, dhdl_sum, lower, conservative=False)
+            N, N_k = len(df), len(uncorrelated_df)
+            g = N/N_k
+            print("%6s %12s %12s %12.2f" % (idx, N, N_k, g))


Might be good to use the str.format() method since we are py3 only.

xiki-tempula · 2021-08-01T21:20:33Z

uncorrelate/statistical_inefficiency_dhdl.py

@@ -50,13 +54,22 @@ def uncorrelate(self, dfs, lower):
            dl.append(dli)

        uncorrelated_dfs = []
-        for dhdl_, l, df in zip(self.dhdls, dl, dfs):
+        print("Number of correlated and uncorrelated samples (Method=%s):\n\n%6s %12s %12s %12s\n" % ("dHdl", "State", "N", "N_k", "N/N_k"))


Might be good to use the str.format() method since we are py3 only.

xiki-tempula · 2021-08-01T21:21:02Z

uncorrelate/statistical_inefficiency_dhdl.py

+            g = N/N_k
+            print("%6s %12s %12s %12.2f" % (idx, N, N_k, g))
+            if N_k < self.uncorr_threshold:
+                print("WARNING: Only %d uncorrelated samples found at lambda number %d; proceeding with analysis using correlated samples..." % (N_k, idx))


It might be better to use the warning instead of print the warning.

import warning warnings.warn("Only %d uncorrelated samples found at lambda number %d")

dnlbauer · 2021-08-02T06:34:31Z

done !

orbeckst · 2021-08-27T17:00:14Z

@danijoo why did you close the PR?

dnlbauer · 2021-08-27T18:18:10Z

@orbeckst from the discussions in #8 and alchemistry/alchemlyb#159 I was assuming that uncorrelation will not be part of flamel but alchemlyb's workflows in the future. Also, I feel like flamel will require a nearly full rewrite to adopt to the workflows once they are finished and the code in this PR will likely require a rewrite then too.

I do not see how my PRs will be merged under these circumstances and therefore closed them. Do you disagree on this?

daniel added 4 commits July 31, 2021 10:45

output for correlation analysis

7a89b15

skip uncorrelation if threshold is 0

970bef2

uncorrelation thresholds for dhdl correlators

d7840cb

update readme

efe0884

dnlbauer commented Jul 31, 2021

View reviewed changes

xiki-tempula requested changes Aug 1, 2021

View reviewed changes

dnlbauer added 3 commits August 2, 2021 08:05

python 3 string formatting

3b4cbe9

warnings use warnings.warn

5c41cc2

baseclass for statistical inefficiency

639a5d5

dnlbauer closed this Aug 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uncorrelation #5

Uncorrelation #5

dnlbauer commented Jul 31, 2021 •

edited

Loading

dnlbauer Jul 31, 2021

dnlbauer Jul 31, 2021

dnlbauer commented Jul 31, 2021

orbeckst commented Aug 1, 2021

xiki-tempula left a comment

xiki-tempula Aug 1, 2021

dnlbauer Aug 2, 2021

xiki-tempula Aug 1, 2021

xiki-tempula Aug 1, 2021

xiki-tempula Aug 1, 2021

xiki-tempula Aug 1, 2021

dnlbauer commented Aug 2, 2021

orbeckst commented Aug 27, 2021

dnlbauer commented Aug 27, 2021 •

edited

Loading

Uncorrelation #5

Uncorrelation #5

Conversation

dnlbauer commented Jul 31, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dnlbauer commented Jul 31, 2021

orbeckst commented Aug 1, 2021

xiki-tempula left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dnlbauer commented Aug 2, 2021

orbeckst commented Aug 27, 2021

dnlbauer commented Aug 27, 2021 • edited Loading

dnlbauer commented Jul 31, 2021 •

edited

Loading

dnlbauer commented Aug 27, 2021 •

edited

Loading