Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept AtomGroup alongwith Universe in the DistanceMatrix #3541

Merged
merged 29 commits into from
Apr 11, 2022

Conversation

AnirG
Copy link
Contributor

@AnirG AnirG commented Feb 19, 2022

Fixes #3486

Changes made in this Pull Request:

  • Accepts AtomGroup instead of Universe in the DistanceMatrix.
  • Modified tests to account for the modified input parameter.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@AnirG AnirG changed the title Issue 3486 Accept AtomGroup instead of Universe in the DistanceMatrix Feb 19, 2022
@codecov
Copy link

codecov bot commented Feb 19, 2022

Codecov Report

Merging #3541 (b1dd5f4) into develop (8eda52f) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           develop    #3541   +/-   ##
========================================
  Coverage    94.23%   94.23%           
========================================
  Files          190      190           
  Lines        24754    24758    +4     
  Branches      3329     3330    +1     
========================================
+ Hits         23326    23330    +4     
  Misses        1380     1380           
  Partials        48       48           
Impacted Files Coverage Δ
package/MDAnalysis/analysis/diffusionmap.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8eda52f...b1dd5f4. Read the comment docs.

@@ -245,12 +246,13 @@ class DistanceMatrix(AnalysisBase):
:class:`MDAnalysis.analysis.base.Results` instance.

"""
def __init__(self, universe, select='all', metric=rmsd, cutoff=1E0-5,
def __init__(self, atomGroup, select='all', metric=rmsd, cutoff=1E0-5,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the point of the change. Currently a Universe and selection string are both passed to construct an AtomGroup. The desired change is for this selection to happen outside of the function, so these two arguments are replaced by just an AtomGroup. To not break things abruptly, this will therefore require for a while accepting either U+selection (and warning about the change) OR AG.

Copy link
Contributor Author

@AnirG AnirG Feb 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I thought we'd have AG along with selection anyways should the need arise. I'll make the necessary changes to accept AG along with Universe and selection.

@pep8speaks
Copy link

pep8speaks commented Feb 20, 2022

Hello @AnirG! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 175:80: E501 line too long (90 > 79 characters)
Line 257:80: E501 line too long (82 > 79 characters)
Line 332:80: E501 line too long (88 > 79 characters)
Line 349:80: E501 line too long (85 > 79 characters)

Line 86:1: W293 blank line contains whitespace
Line 118:1: E302 expected 2 blank lines, found 1

Comment last updated at 2022-04-11 07:51:20 UTC

@AnirG
Copy link
Contributor Author

AnirG commented Feb 22, 2022

@richardjgowers does it seem fine? please let me know if I should add any more tests or any modification is required.

def test_distvalues_ag_universe(u, ag):
dist_universe = diffusionmap.DistanceMatrix(u, select='backbone').run()
dist_ag = diffusionmap.DistanceMatrix(ag).run()
assert_array_almost_equal(dist_universe.results.dist_matrix,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For new tests we may as well use assert_allclose if the matrices contain floats per https://numpy.org/doc/stable/reference/generated/numpy.testing.assert_array_almost_equal.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, will update it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should consider replacing assert_array_almost_equal in other places as well as per the above link. I've updated it for only this test function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this moment in time our approach is to only change tests we touch and make sure new tests use assert_allclose. Making large blanket changes makes it very hard to review PRs and also understand why changes were made based on history. If we need to do a blanket change we will do this in a specific PR in the future.

@@ -93,7 +105,7 @@ def test_long_traj(u):


def test_not_universe_error(u):
trj_only = u.trajectory
trj_only = u.universe.trajectory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you briefly explain what this change is for? The u fixture returns a universe and now we need another .universe call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, since u can also be an AtomGroup. I should've also updated match as match='U is not a Universe or AtomGroup'. I will do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tylerjereddy is correct here, u.trajectory == u.universe.trajectory, there's no need to make this change, either way you are testing what happens if you pass a trajectory through (neither a universe or an atomgroup)

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @AnirG just a couple of small additional comments (on top of the doc changes Richard asked for).

@@ -93,7 +105,7 @@ def test_long_traj(u):


def test_not_universe_error(u):
trj_only = u.trajectory
trj_only = u.universe.trajectory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tylerjereddy is correct here, u.trajectory == u.universe.trajectory, there's no need to make this change, either way you are testing what happens if you pass a trajectory through (neither a universe or an atomgroup)

@AnirG AnirG force-pushed the issue-3486 branch 2 times, most recently from 8b79261 to ab98e21 Compare March 1, 2022 12:54
@AnirG AnirG changed the title Accept AtomGroup instead of Universe in the DistanceMatrix Accept AtomGroup alongwith Universe in the DistanceMatrix Mar 1, 2022
@AnirG AnirG requested review from richardjgowers and IAlibay March 2, 2022 09:46
@AnirG
Copy link
Contributor Author

AnirG commented Mar 2, 2022

Hi, I'm not sure as to why the one codecov test failed.

@AnirG
Copy link
Contributor Author

AnirG commented Mar 3, 2022

@tylerjereddy Hi looks like even #3527 failed similar tests on macOS like I'm facing. How should that be taken care of?

@IAlibay
Copy link
Member

IAlibay commented Mar 3, 2022

This is related to #3550 and some further issues upstream - we are currently dealing with this, there is nothing to do about this in this specific PR.

@AnirG
Copy link
Contributor Author

AnirG commented Mar 5, 2022

Hi @IAlibay! Can you please suggest any more changes? I'm looking forward to taking up more issues once I get done with this one, and possibly would love to hear some general practices I might've missed here.

@AnirG
Copy link
Contributor Author

AnirG commented Mar 13, 2022

@IAlibay Hi! Do the final changes seem fine? Would make the necessary changes otherwise.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I restarted the tests as they seem to have failed on an unrelated issue. Assuming the tests do pass now, I think all the comments have been addressed appropriately. Thank you.

A couple of things are missing:

  • you need to update the CHANGELOG
  • you need to add yourself to the AUTHORS file

We hadn't release the 2.1.0 version when you last modified your pull request, so you will need to merge the develop branch or rebase against it to edit the most up to date version of these files.

Also, could you introduce yourself on the mailing list? We are discussing changes that will require us to contact all the contributors so we try to make sure we know how to reach new contributors. Thank you. (If you are interested in the Google Summer of Code, please say so explicitly; many people are introducing themselves at the moment in that context and we need to know who's a candidate and who's not 😃)

@AnirG
Copy link
Contributor Author

AnirG commented Mar 15, 2022

@jbarnoud sure, will do that.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed some tiny details during my previous readthrough. Once they are fixed, it should be good to go, assuming the other reviewers are satisfied.

@@ -171,7 +172,7 @@ class DistanceMatrix(AnalysisBase):

Parameters
----------
universe : `~MDAnalysis.core.universe.Universe`
universe : `~MDAnalysis.core.universe.Universe or ~MDAnalysis.core.groups.AtomGroup`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
universe : `~MDAnalysis.core.universe.Universe or ~MDAnalysis.core.groups.AtomGroup`
universe : `~MDAnalysis.core.universe.Universe` or `~MDAnalysis.core.groups.AtomGroup`

@@ -328,12 +330,12 @@ def __init__(self, u, epsilon=1, **kwargs):
Parameters to be passed for the initialization of a
:class:`DistanceMatrix`.
"""
if isinstance(u, Universe):
if isinstance(u, AtomGroup) or isinstance(u, Universe):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change does not appear in the docstring just above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies! I have updated the docstrings.

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of extra things, you'll need to also resolve conflicts (which I think are related to the recent 2.1.0 -> 2.2.0 change)

@@ -28,7 +28,7 @@ Deprecations

03/06/22 IAlibay, melomcr, mdd31, ianmkenney, richardjgowers, hmacdope,
orbeckst, scal444, p-j-smith, edisj, Atharva7K, ojeda-e, jbarnoud,
yangtcai
yangtcai, AnirG
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version 2.1.0 has now been released, you'll need to update your branch and add your changelog entry to 2.2.0 instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, didn't realize 2.1.0 was out already. I have revoked the changes and have updated on 2.2.0.

@@ -152,6 +152,7 @@
"""
import logging
import warnings
from MDAnalysis import AtomGroup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports are usually grouped as "python built-ins", "external libraries", "internal library", see the PEP8 guide for more details: https://peps.python.org/pep-0008/#source-file-encoding

Can you move this to after line 159 please?

@@ -328,12 +332,12 @@ def __init__(self, u, epsilon=1, **kwargs):
Parameters to be passed for the initialization of a
:class:`DistanceMatrix`.
"""
if isinstance(u, Universe):
if isinstance(u, AtomGroup) or isinstance(u, Universe):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick question here - what happens if you pass an UpdatingAtomGroup, does the code still work?

Copy link
Contributor Author

@AnirG AnirG Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it works fine. PFA example code snippet I tried on.
example.zip

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add in a test that uses an UpdatingAtomGroup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, will add a separate test for updating atomgroups.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I'm not very sure about what kind of test do we want here. Do I make a test out of the example.zip that I attached before and compare the DistMatrix constructed by updating atom group with just universe and selection string of the atom group?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IAlibay I did the above for updating AG checks. Please let me know if it is fine.

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two documentation things and a test request.

package/CHANGELOG Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/diffusionmap.py Show resolved Hide resolved
package/MDAnalysis/analysis/diffusionmap.py Outdated Show resolved Hide resolved
@@ -328,12 +332,12 @@ def __init__(self, u, epsilon=1, **kwargs):
Parameters to be passed for the initialization of a
:class:`DistanceMatrix`.
"""
if isinstance(u, Universe):
if isinstance(u, AtomGroup) or isinstance(u, Universe):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add in a test that uses an UpdatingAtomGroup?

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of small extra things. Not sure why CI isn't triggering properly at the moment.

package/MDAnalysis/analysis/diffusionmap.py Outdated Show resolved Hide resolved
Constructs an anisotropic diffusion kernel and performs eigenvalue
decomposition on it.
transform(n_eigenvectors, time)
Perform an embedding of a frame into the eigenvectors representing
the collective coordinates.

.. versionchanged: 2.2.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See here: https://mdanalysis--3541.org.readthedocs.build/en/3541/documentation_pages/analysis/diffusionmap.html#MDAnalysis.analysis.diffusionmap.DiffusionMap.transform

You'll need two empty lines for this to work as intended.

Suggested change
.. versionchanged: 2.2.0
.. versionchanged: 2.2.0

package/MDAnalysis/analysis/diffusionmap.py Outdated Show resolved Hide resolved

def test_distvalues_updating_ag(u):
selection = 'resname ALA'
ag = u.select_atoms(selection, updating=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really understanding how this checks that an updating selection works, maybe I'm missing something obvious though.

Since universes are static, there shouldn't be fewer or more atoms that match "resname ALA" within the lifetime of a trajectory. Instead of you should aim to analyse something dynamic, like "ALA residues within a given distance of residue N".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes my bad, I didn't give much attention to the atom selection. I was more concerned about if it worked when AG is an instance of UpdatingAtomGroup. I will update it such that selection is dynamic over all frames and the final selection is parsed over to DistanceMatrix. In my opinion, this test case is kinda redundant as we are checking if it's working for the final AG after all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the point of passing and UpdatingAtomGroup through and checking that everything still works as expected is because most Analysis methods in MDAnalysis use static defined numpy arrays. So if you end up with more or fewer entries than you'd expect, things can go wrong. Having a test that we can check works as expected is very important.

Copy link
Contributor Author

@AnirG AnirG Mar 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IAlibay I have a doubt here. Say, you make a selection as selection = "around 5 resname ALA". For frame 1, let's assume this selection has 10 atoms, and for frame 50, it has 20 atoms. Do you expect RMSD(frame 1, frame 50) and RMSD(frame 50, frame 1) in the DistanceMatrix to be different since the selection is being updated dynamically due to the UpdatingAtomGroup? If this is what was implied here https://github.com/MDAnalysis/mdanalysis/pull/3541/files/2799ad9cf2b9f12bf673a217985d781535f802a9#r831565393 , I believe it doesn't work this way.
What can be done is package all these dynamic selections into one atomgroup and run our DistanceMatrix analysis on this AG (finally a static selection).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IAlibay Is my understanding correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AnirG this is essentially the crux of what I was leading to with my original question on this, I was trying to make you think about the cases where a user passed in an UpdatingAtomgroup expecting that the there would be more / fewer atoms depending on the frames - would things work as expected?

Because of the way metric works, there is an expectation that the i_ref and j_ref arrays are of equal lengths (see

metric : function, optional
Maps two numpy arrays to a float, is positive definite and
symmetric. The API for a metric requires that the arrays must have
equal length, and that the function should have weights as an
optional argument. Weights give each index value its own weight for
the metric calculation over the entire arrays. Default: metric is
set to rms.rmsd().
), so as you observed above, this is unlikely to work.

It's fine in this case for the solution to just be "UpdatingAtomGroups are not accepted". Creating a complex solution involving the eventual generation of a static selection is probably not worth it.

The next steps here are then - what code changes are necessary here to avoid users passing UpdatinAtomGroups?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @lilyminium just reminded me,

self.atoms = universe.select_atoms(select)
essentially means that all UpdatingAtomGroups will be squashed to AtomGroups on class initilisation. Probably all you need here is to have a check on the input to see if it's an UpdatingAtomGroup and warn users that it is being converted to a static AtomGroup based on the current frame selection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IAlibay thanks for clarifying, I'll make the necessary changes.

@AnirG AnirG requested a review from IAlibay April 7, 2022 05:48
Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the one last thing, otherwise everything looks good to me thanks!


if isinstance(universe, UpdatingAtomGroup):
wmsg = ("U must be a static AtomGroup. Parsing an updating AtomGroup "
"will result in a static AtomGroup with the first frame "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please double check this, but I'm pretty sure it'll freeze it at the current frame, so if someone did:

u.trajectory[10]
uag = u.select_atoms('resname HOH and within 4 resname LIG', updating=True)
dmatrix = DistanceMatrix(ag)

Then you're doing uag.select_atoms('all') on the current frame, rather than the first frame.

Suggested change
"will result in a static AtomGroup with the first frame "
"will result in a static AtomGroup with the current frame "

Copy link
Contributor Author

@AnirG AnirG Apr 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. you're right. It's the current frame.

@IAlibay
Copy link
Member

IAlibay commented Apr 8, 2022

Please do fix merge conflicts too.

@IAlibay
Copy link
Member

IAlibay commented Apr 8, 2022

@jbarnoud if you can also have a re-review that'd be great

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for all the work @AnirG.

Just waiting on @jbarnoud to re-review.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'll ask for one last small test, and it is all good.

Users may provide both an AtomGroup and the select argument. From what I read, it should work well with your implementation, but I would like to see a test for it. That test does not have to be complicated, just to use both an AtomGroup and the select argument. I would write it as the same as test_distvalues_ag_universe but with

ag = u.select_atoms('protein'')
dist_ag = diffusionmap.DistanceMatrix(ag, select='backbone').run()

@jbarnoud
Copy link
Contributor

Thank you for having taken care of the conflicts as well. I'll merge as soon as the CI is done and passing :)

@AnirG
Copy link
Contributor Author

AnirG commented Apr 11, 2022

@IAlibay @jbarnoud thanks a lot for your patience and guidance :) felt quite satisfying starting out on this. Will hopefully solve more issues as I use the package more and get a better understanding over time.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@jbarnoud jbarnoud merged commit 0eb42ca into MDAnalysis:develop Apr 11, 2022
@jbarnoud
Copy link
Contributor

Congratulations! Thank you for addressing the issue :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DistanceMatrix accept AtomGroup
6 participants