Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uncorrelation #5

Closed
wants to merge 7 commits into from
Closed

Uncorrelation #5

wants to merge 7 commits into from

Conversation

dnlbauer
Copy link

@dnlbauer dnlbauer commented Jul 31, 2021

This PR implements output for the uncorrelation function as well as an option (-i) to use correlated samples above a given threshold. I tried to make the output as closes as possible to what alchemical analysis does.

Example output for water without energy:

> ./flamel.py -p lambda_ -n dhdl
(...)
Read U_nk from lambda_36.xvg
Read U_nk from lambda_37.xvg
Uncorrelating dH/dl ...
Number of correlated and uncorrelated samples (Method=dHdl):

 State            N          N_k        N/N_k

     0          538          538         1.00
     1          538          538         1.00
     2          538          485         1.11
     3          538          431         1.25
     4          538          351         1.53
     5          538          538         1.00
     6          538          498         1.08
     7          538          492         1.09
     8          538          517         1.04
     9          538          538         1.00
    10          538          499         1.08
    11          538          538         1.00
    12          538          452         1.19
    13          538          538         1.00
    14          538          486         1.11
    15          538          459         1.17
    16          538          319         1.69
    17          538          538         1.00
    18          538          538         1.00
    19          538          440         1.22
    20          538          495         1.09
    21          538          464         1.16
    22          538          538         1.00
    23          538          538         1.00
    24          538          483         1.11
    25          538          520         1.03
    26          538          444         1.21
    27          538          375         1.43
    28          538          405         1.33
    29          538          513         1.05
    30          538          538         1.00
    31          538          461         1.17
    32          538          538         1.00
    33          538          417         1.29
    34          538          491         1.10
    35          538          538         1.00
    36          538          352         1.53
    37          538          471         1.14
Uncorrelating reduced potentials ...
Number of correlated and uncorrelated samples (Method=dHdl):

 State            N          N_k        N/N_k

     0          538          538         1.00
     1          538          538         1.00
     2          538          485         1.11
     3          538          431         1.25
     4          538          351         1.53
     5          538          538         1.00
     6          538          498         1.08
     7          538          492         1.09
     8          538          517         1.04
     9          538          538         1.00
    10          538          499         1.08
    11          538          538         1.00
    12          538          452         1.19
    13          538          538         1.00
    14          538          486         1.11
    15          538          459         1.17
    16          538          319         1.69
    17          538          538         1.00
    18          538          538         1.00
    19          538          440         1.22
    20          538          495         1.09
    21          538          464         1.16
    22          538          538         1.00
    23          538          538         1.00
    24          538          483         1.11
    25          538          520         1.03
    26          538          444         1.21
    27          538          375         1.43
    28          538          405         1.33
    29          538          513         1.05
    30          538          538         1.00
    31          538          461         1.17
    32          538          538         1.00
    33          538          417         1.29
    34          538          491         1.10
    35          538          538         1.00
    36          538          352         1.53
    37          538          471         1.14
TI: -29.154473 +- 0.240598
(...)

With threshold and single estimator:

> ./flamel.py -p lambda_ -n dhdl -i 500 -e TI
(...)
Read dH/dl from lambda_37.xvg
Uncorrelating dH/dl ...
Number of correlated and uncorrelated samples (Method=dHdl):

 State            N          N_k        N/N_k

     0          538          538         1.00
     1          538          538         1.00
     2          538          485         1.11
WARNING: Only 485 uncorrelated samples found at lambda number 2; proceeding with analysis using correlated samples...
     3          538          431         1.25
WARNING: Only 431 uncorrelated samples found at lambda number 3; proceeding with analysis using correlated samples...
     4          538          351         1.53
WARNING: Only 351 uncorrelated samples found at lambda number 4; proceeding with analysis using correlated samples...
     5          538          538         1.00
     6          538          498         1.08
WARNING: Only 498 uncorrelated samples found at lambda number 6; proceeding with analysis using correlated samples...
     7          538          492         1.09
WARNING: Only 492 uncorrelated samples found at lambda number 7; proceeding with analysis using correlated samples...
     8          538          517         1.04
     9          538          538         1.00
    10          538          499         1.08
WARNING: Only 499 uncorrelated samples found at lambda number 10; proceeding with analysis using correlated samples...
    11          538          538         1.00
    12          538          452         1.19
WARNING: Only 452 uncorrelated samples found at lambda number 12; proceeding with analysis using correlated samples...
    13          538          538         1.00
    14          538          486         1.11
WARNING: Only 486 uncorrelated samples found at lambda number 14; proceeding with analysis using correlated samples...
    15          538          459         1.17
WARNING: Only 459 uncorrelated samples found at lambda number 15; proceeding with analysis using correlated samples...
    16          538          319         1.69
WARNING: Only 319 uncorrelated samples found at lambda number 16; proceeding with analysis using correlated samples...
    17          538          538         1.00
    18          538          538         1.00
    19          538          440         1.22
WARNING: Only 440 uncorrelated samples found at lambda number 19; proceeding with analysis using correlated samples...
    20          538          495         1.09
WARNING: Only 495 uncorrelated samples found at lambda number 20; proceeding with analysis using correlated samples...
    21          538          464         1.16
WARNING: Only 464 uncorrelated samples found at lambda number 21; proceeding with analysis using correlated samples...
    22          538          538         1.00
    23          538          538         1.00
    24          538          483         1.11
WARNING: Only 483 uncorrelated samples found at lambda number 24; proceeding with analysis using correlated samples...
    25          538          520         1.03
    26          538          444         1.21
WARNING: Only 444 uncorrelated samples found at lambda number 26; proceeding with analysis using correlated samples...
    27          538          375         1.43
WARNING: Only 375 uncorrelated samples found at lambda number 27; proceeding with analysis using correlated samples...
    28          538          405         1.33
WARNING: Only 405 uncorrelated samples found at lambda number 28; proceeding with analysis using correlated samples...
    29          538          513         1.05
    30          538          538         1.00
    31          538          461         1.17
WARNING: Only 461 uncorrelated samples found at lambda number 31; proceeding with analysis using correlated samples...
    32          538          538         1.00
    33          538          417         1.29
WARNING: Only 417 uncorrelated samples found at lambda number 33; proceeding with analysis using correlated samples...
    34          538          491         1.10
WARNING: Only 491 uncorrelated samples found at lambda number 34; proceeding with analysis using correlated samples...
    35          538          538         1.00
    36          538          352         1.53
WARNING: Only 352 uncorrelated samples found at lambda number 36; proceeding with analysis using correlated samples...
    37          538          471         1.14
WARNING: Only 471 uncorrelated samples found at lambda number 37; proceeding with analysis using correlated samples...
TI: -29.167867 +- 0.229051

(...)

@@ -98,14 +99,15 @@ def main():
parser.add_argument('-q', '--suffix', dest='suffix', help='Suffix for datafile sets, i.e. \'xvg\' (default).', default='xvg')
parser.add_argument('-e', dest='estimators', type=str, default=None, help="Comma separated Estimator methods")
parser.add_argument('-n', '--uncorr', dest='uncorr', help='The observable to be used for the autocorrelation analysis; either \'dhdl_all\' (obtained as a sum over all energy components) or \'dhdl\' (obtained as a sum over those energy components that are changing; default) or \'dE\'. In the latter case the energy differences dE_{i,i+1} (dE_{i,i-1} for the last lambda) are used.', default='dhdl')
parser.add_argument('-i', '--uncorr_threshold', dest='uncorr_threshold', help='Proceed with correlated samples (N) if the number of uncorrelated samples (N_k) is found to be less than this number. If 0 is given, the time series analysis will not be performed at all. Default: 50.', default=50, type=int)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adapted from alchemical-analysis. -i sounds unintuitive to me, but thats what AA is using.

print("Uncorrelating reduced potentials ...")
u_nks = uncorrelator.uncorrelate(u_nks, args.equiltime)

# concat data for estimators
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the concatenation out of the correlation analysis functions because a threshold of 0 needs to skip correlators but still concatenates for estimators.

@dnlbauer
Copy link
Author

tagging @orbeckst and @xiki-tempula for attention :)

@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2021

Thank you for the contribution. If you haven't gotten a review until, say, Wednesday, please ping us again. I'll try to make time but I also have a million other things to do. Thank you for your understanding.

Copy link
Contributor

@xiki-tempula xiki-tempula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. A few minor comments. We might need to have PEP8 checking and test eventually.

uncorrelated_df = alchemlyb.preprocessing.statistical_inefficiency(df, dhdl_sum, lower, conservative=False)
N, N_k = len(df), len(uncorrelated_df)
g = N/N_k
print("%6s %12s %12s %12.2f" % (idx, N, N_k, g))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to use the str.format() method since we are py3 only.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I tried to keep the style similar to the rest of the repo which still uses the % syntax but will happily change it.

g = N/N_k
print("%6s %12s %12s %12.2f" % (idx, N, N_k, g))
if N_k < self.uncorr_threshold:
print("WARNING: Only %d uncorrelated samples found at lambda number %d; proceeding with analysis using correlated samples..." % (N_k, idx))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to make a base class that does this check to avoid code duplication.

uncorrelated_df = alchemlyb.preprocessing.statistical_inefficiency(df, dhdl_sum, lower, conservative=False)
N, N_k = len(df), len(uncorrelated_df)
g = N/N_k
print("%6s %12s %12s %12.2f" % (idx, N, N_k, g))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to use the str.format() method since we are py3 only.

@@ -50,13 +54,22 @@ def uncorrelate(self, dfs, lower):
dl.append(dli)

uncorrelated_dfs = []
for dhdl_, l, df in zip(self.dhdls, dl, dfs):
print("Number of correlated and uncorrelated samples (Method=%s):\n\n%6s %12s %12s %12s\n" % ("dHdl", "State", "N", "N_k", "N/N_k"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to use the str.format() method since we are py3 only.

g = N/N_k
print("%6s %12s %12s %12.2f" % (idx, N, N_k, g))
if N_k < self.uncorr_threshold:
print("WARNING: Only %d uncorrelated samples found at lambda number %d; proceeding with analysis using correlated samples..." % (N_k, idx))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to use the warning instead of print the warning.

import warning
warnings.warn("Only %d uncorrelated samples found at lambda number %d")

@dnlbauer
Copy link
Author

dnlbauer commented Aug 2, 2021

done !

@dnlbauer dnlbauer closed this Aug 26, 2021
@orbeckst
Copy link
Member

@danijoo why did you close the PR?

@dnlbauer
Copy link
Author

dnlbauer commented Aug 27, 2021

@orbeckst from the discussions in #8 and alchemistry/alchemlyb#159 I was assuming that uncorrelation will not be part of flamel but alchemlyb's workflows in the future. Also, I feel like flamel will require a nearly full rewrite to adopt to the workflows once they are finished and the code in this PR will likely require a rewrite then too.

I do not see how my PRs will be merged under these circumstances and therefore closed them. Do you disagree on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants