-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uncorrelation #5
Conversation
@@ -98,14 +99,15 @@ def main(): | |||
parser.add_argument('-q', '--suffix', dest='suffix', help='Suffix for datafile sets, i.e. \'xvg\' (default).', default='xvg') | |||
parser.add_argument('-e', dest='estimators', type=str, default=None, help="Comma separated Estimator methods") | |||
parser.add_argument('-n', '--uncorr', dest='uncorr', help='The observable to be used for the autocorrelation analysis; either \'dhdl_all\' (obtained as a sum over all energy components) or \'dhdl\' (obtained as a sum over those energy components that are changing; default) or \'dE\'. In the latter case the energy differences dE_{i,i+1} (dE_{i,i-1} for the last lambda) are used.', default='dhdl') | |||
parser.add_argument('-i', '--uncorr_threshold', dest='uncorr_threshold', help='Proceed with correlated samples (N) if the number of uncorrelated samples (N_k) is found to be less than this number. If 0 is given, the time series analysis will not be performed at all. Default: 50.', default=50, type=int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adapted from alchemical-analysis. -i sounds unintuitive to me, but thats what AA is using.
print("Uncorrelating reduced potentials ...") | ||
u_nks = uncorrelator.uncorrelate(u_nks, args.equiltime) | ||
|
||
# concat data for estimators |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the concatenation out of the correlation analysis functions because a threshold of 0 needs to skip correlators but still concatenates for estimators.
tagging @orbeckst and @xiki-tempula for attention :) |
Thank you for the contribution. If you haven't gotten a review until, say, Wednesday, please ping us again. I'll try to make time but I also have a million other things to do. Thank you for your understanding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. A few minor comments. We might need to have PEP8 checking and test eventually.
uncorrelated_df = alchemlyb.preprocessing.statistical_inefficiency(df, dhdl_sum, lower, conservative=False) | ||
N, N_k = len(df), len(uncorrelated_df) | ||
g = N/N_k | ||
print("%6s %12s %12s %12.2f" % (idx, N, N_k, g)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to use the str.format() method since we are py3 only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I tried to keep the style similar to the rest of the repo which still uses the % syntax but will happily change it.
g = N/N_k | ||
print("%6s %12s %12s %12.2f" % (idx, N, N_k, g)) | ||
if N_k < self.uncorr_threshold: | ||
print("WARNING: Only %d uncorrelated samples found at lambda number %d; proceeding with analysis using correlated samples..." % (N_k, idx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to make a base class that does this check to avoid code duplication.
uncorrelated_df = alchemlyb.preprocessing.statistical_inefficiency(df, dhdl_sum, lower, conservative=False) | ||
N, N_k = len(df), len(uncorrelated_df) | ||
g = N/N_k | ||
print("%6s %12s %12s %12.2f" % (idx, N, N_k, g)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to use the str.format() method since we are py3 only.
@@ -50,13 +54,22 @@ def uncorrelate(self, dfs, lower): | |||
dl.append(dli) | |||
|
|||
uncorrelated_dfs = [] | |||
for dhdl_, l, df in zip(self.dhdls, dl, dfs): | |||
print("Number of correlated and uncorrelated samples (Method=%s):\n\n%6s %12s %12s %12s\n" % ("dHdl", "State", "N", "N_k", "N/N_k")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to use the str.format() method since we are py3 only.
g = N/N_k | ||
print("%6s %12s %12s %12.2f" % (idx, N, N_k, g)) | ||
if N_k < self.uncorr_threshold: | ||
print("WARNING: Only %d uncorrelated samples found at lambda number %d; proceeding with analysis using correlated samples..." % (N_k, idx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to use the warning instead of print the warning.
import warning
warnings.warn("Only %d uncorrelated samples found at lambda number %d")
done ! |
@danijoo why did you close the PR? |
@orbeckst from the discussions in #8 and alchemistry/alchemlyb#159 I was assuming that uncorrelation will not be part of flamel but alchemlyb's workflows in the future. Also, I feel like flamel will require a nearly full rewrite to adopt to the workflows once they are finished and the code in this PR will likely require a rewrite then too. I do not see how my PRs will be merged under these circumstances and therefore closed them. Do you disagree on this? |
This PR implements output for the uncorrelation function as well as an option (-i) to use correlated samples above a given threshold. I tried to make the output as closes as possible to what alchemical analysis does.
Example output for water without energy:
With threshold and single estimator: