Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterative Average structures #2039

Closed
kain88-de opened this issue Aug 10, 2018 · 4 comments · Fixed by #4524
Closed

Iterative Average structures #2039

kain88-de opened this issue Aug 10, 2018 · 4 comments · Fixed by #4524

Comments

@kain88-de
Copy link
Member

kain88-de commented Aug 10, 2018

Is your feature request related to a problem? Please describe.

In a recent paper of mine, I describe an iterative algorithm to calculate an average structure from simulations. Below is the description.

As a reference structure, I used the average structure in the simulations, which for B-DNA was
obtained by RMSD aligning the inner five base pairs and averaging the simulation structures iteratively to convergence. The first frame was used for the initial alignment.

I would expect that this algorithm was already previously described. If anyone knows an earlier paper please add it to this issue so we can properly cite it. I currently do not have time to include this properly into MDAnalysis.

Note: I find this a good example to show how the changes of our analysis module in the last two years allow it to quickly write up a new analysis type that is easy to reuse and share.

Describe the solution you'd like

This is the complete implementation I used including tests. I'm posting it here should anyone be interested in including this into the align module of MDAnalysis. If anyone wants to just use this please consider to cite my paper when you do. The code can be used under the same license that MDAnalysis uses.

class AverageStructure(base.AnalysisBase):
    """Calculate an average structure after RMSD alignment to a reference."""

    def __init__(self, mobile, ref, weights=None, calc_msf=True, **kwargs):
        """init function

        Parameters
        ----------
        mobile : mda.AtomGroup
            atomgroup that is being fitted
        ref : mda.AtomGroup
            reference structure
        weights : str, array_like (optional)
            weights that can be used. If `None` use equal weights, if `'mass'`
            use masses of ref as weights or give an array of arbitrary weights.
        calc_msf : bool (optional)
            calc mean square fluctuation
        kwargs : dict
            options for AnalysisBase to adjust iteration behavior

        """
        super(AverageStructure, self).__init__(mobile.universe.trajectory,
                                               **kwargs)
        self.mobile = mobile
        self.ref_xyz = ref.positions.copy() - ref.centroid()
        self.weights = weights
        self.calc_msf = calc_msf

    def _prepare(self):
        self.structure = np.zeros(self.mobile.positions.shape)
        self.rmsd = 0
        if self.calc_msf:
            self.msf = np.zeros(self.mobile.n_atoms)

    def _single_frame(self):
        self.mobile.translate(-self.mobile.center(self.weights))
        self.rmsd += align._fit_to(
            self.mobile.positions,
            self.ref_xyz,
            self.mobile,
            weights=self.weights,
            mobile_com=np.zeros(3),
            ref_com=np.zeros(3))[1]
        self.structure += self.mobile.positions
        if self.calc_msf:
            self.msf += np.sum(
                (self.ref_xyz - self.mobile.positions)**2, axis=1)

    def _conclude(self):
        self.structure /= self.n_frames
        self.rmsd /= self.n_frames
        if self.calc_msf:
            self.msf /= self.n_frames


def iterative_average(mobile,
                      ref,
                      weights=None,
                      niter=100,
                      eps=1e-8,
                      verbose=False,
                      **kwargs):
    """Calculate an optimal reference that is also the average structure after an
    RMSD alignment. This is done with an iterative algorithm where the previous
    average structure is used as a new reference.

    Parameters
    ----------
    mobile : mda.AtomGroup
        mobile atomgroup to find the average for
    ref : mda.AtomGroup
        initial reference structure. Positions are changed by this function!
    weights : str, array_like (optional)
        weights that can be used. If `None` use equal weights, if `'mass'` use masses of ref as weights or give an array of arbitrary weights.
    niter : int (optional)
        maximum number of iterations
    eps : float (optional)
        RMSD distance at which reference and average are assumed to be equal
    verbose : bool (optional)
        verbosity
    **kwargs : dict (optional)
        AverageStructure kwargs

    Returns
    -------
    res : AverageStructure
        AverageStructure result from the last iteration

    """
    drmsd = np.inf
    for i in range(niter):
        # found a converged structure
        if drmsd < eps:
            break

        res = AverageStructure(mobile, ref, weights, **kwargs).run()
        drmsd = mda.analysis.rms.rmsd(ref.positions, res.structure, weights=weights)
        ref.positions = res.structure

        if verbose:
            print("i = {}, rmsd-change = {:.2f}, ave-rmsd = {:.2f}, ave-msf = {:.2f}".
                  format(i, drmsd, res.rmsd, res.msf.mean()))

    return res

Some reference tests I have. They only check for regressions. Correctness was only validated visually so don't put too much trust into these tests.

from __future__ import absolute_import

import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF, DCD

import pytest
from numpy.testing import assert_array_almost_equal

# module contains iterative average implementation
import metric


@pytest.fixture
def mobile():
    u = mda.Universe(PSF, DCD)
    u.transfer_to_memory()
    return u.select_atoms('bynum 1:10')


@pytest.fixture
def reference():
    u = mda.Universe(PSF, DCD)
    return u.select_atoms('bynum 1:10')


# Pure regression test
def test_AverageStructure(mobile, reference):
    res = metric.AverageStructure(mobile, reference, stop=10).run()
    assert_array_almost_equal(res.structure,
                              [[0.54083995, -0.42515443, -1.06207113],
                               [1.07435689, -1.16633342, -1.31354781],
                               [1.03657782, 0.31960404, -1.12080256],
                               [-0.15218723, -0.3288244, -1.698552],
                               [0.08886168, -0.68049069, 0.28910915],
                               [0.92952701, -0.77678878, 0.83458763],
                               [-0.7420783, 0.4927999, 0.85065659],
                               [-1.48584671, 0.17381404, 1.22256903],
                               [-1.06299627, 1.13438787, 0.23116275],
                               [-0.22705485, 1.25698588, 1.76688837]], decimal=5)


# Pure regression test
def test_iterative_average(mobile, reference):
    res = metric.iterative_average(mobile, reference, stop=10)
    assert_array_almost_equal(res.structure,
                              [[0.54418358, -0.42490685, -1.06077607],
                               [1.08109621, -1.18605215, -1.30346445],
                               [1.05210044, 0.32313582, -1.11259767],
                               [-0.16157775, -0.3265537, -1.70656559],
                               [0.08676427, -0.66799566, 0.28346055],
                               [0.91055919, -0.76166723, 0.8188435],
                               [-0.74196764, 0.49133729, 0.85177611],
                               [-1.48156064, 0.15684868, 1.23166217],
                               [-1.0767725, 1.13997174, 0.22508162],
                               [-0.21282517, 1.25588206, 1.77257982]], decimal=5)

edit kain88-de: Removed class unrelated to this feature from example code.

@orbeckst
Copy link
Member

The paper can be added to the class as well, using duecredit and its DOI 10.1021/acs.jpcb.7b11988.

orbeckst pushed a commit that referenced this issue Jan 20, 2020
* added align.AverageStructure
* added more tests and example
* related to #2039
@kain88-de
Copy link
Member Author

with the implementation of #2430 the iterative algorithm can be implemented using only

def iterative_average(mobile,
                      ref,
                      weights=None,
                      niter=100,
                      eps=1e-8,
                      verbose=False,
                      **kwargs):
    """Calculate an optimal reference that is also the average structure after an
    RMSD alignment. The optional reference is defined as average structure of a trajectory, with the optimal reference used as input. This function computes the optimal reference by using a starting reference for the average structure, which is used to calculate the average structure again. This is repeated until the reference structure has converged.

    Parameters
    ----------
    mobile : mda.AtomGroup
        mobile atomgroup to find the average for
    ref : mda.AtomGroup
        initial reference structure. Positions are changed by this function!
    weights : str, array_like (optional)
        weights that can be used. If `None` use equal weights, if `'mass'` use masses of ref as weights or give an array of arbitrary weights.
    niter : int (optional)
        maximum number of iterations
    eps : float (optional)
        RMSD distance at which reference and average are assumed to be equal
    verbose : bool (optional)
        verbosity
    **kwargs : dict (optional)
        AverageStructure kwargs

    Returns
    -------
    res : AverageStructure
        AverageStructure result from the last iteration

    """
    drmsd = np.inf
    for i in range(niter):
        # found a converged structure
        if drmsd < eps:
            break

        res = AverageStructure(mobile, ref, weights, **kwargs).run()
        drmsd = mda.analysis.rms.rmsd(ref.positions, res.structure, weights=weights)
        ref.positions = res.positions

        if verbose:
            print("i = {}, rmsd-change = {:.2f}, ave-rmsd = {:.2f}, ave-msf = {:.2f}".
                  format(i, drmsd, res.rmsd, res.msf.mean()))

    return res

@Neel-Shah-29
Copy link

Hello , I would like to take up this issue, Kindly guide me through what needs to be done in this issue.
As per the issue, Do we need to add the iterative approach stated above to be implemented in the codebase?

@orbeckst
Copy link
Member

@Neel-Shah-29 , you'd add the iterative_average() function to the MDAnalysis.analysis.rms module. You'd add tests for the code that you add to the analysis/test_rms.py file. You'd add documentation for the new functionality.

There hasn't been a lot of discussion on this issue so it's quite possible that reviewers of your code might say that we want this functionality to follow the standard schema for analysis tools, namely being a class derived from AnalysisBase (see https://docs.mdanalysis.org/stable/documentation_pages/analysis/base.html).

You can start with a PR where you add the code provided by @kain88-de to the existing code base and make the new tests work. We'll then use your PR for further discussions.

Please also introduce yourself on the developer list with your GitHub handle (if you haven't done so already).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants