PSFParser fails to read alternative resids #4189
Comments
In PDB files I believe the resid needs to be an integer. Is this possibly the same for PSF files?
Generally, .pdb and .psf resnums ( 52 SER 52 SER...) can have letters in their code, and the alternative conformations can be read from those. Specifically: #this
We already handle altlocs for PDBs, and it's a topology attribute in its own right. That is, I believe resnum has to be an int, so here we'd have to check the type and handle it appropriately. Implementation aside, this simply seems like an edge case we weren't aware of. Is there a PSF file standard defined somewhere? It would make it easier to ensure we don't miss any further edge cases.
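One way to picture the "check the type and handle it appropriately" idea is the minimal sketch below. It is not MDAnalysis code: `split_resid` is a hypothetical helper, and it assumes a resid field is an integer optionally followed by a single letter (as in `52A`). Whether that letter should map to an altloc, an insertion code, or something else is left open here.

```python
import re
from typing import Tuple

# Hypothetical helper, not part of MDAnalysis.
# Assumption: a resid field is an integer optionally followed by one letter.
_RESID_RE = re.compile(r"^(-?\d+)([A-Za-z]?)$")


def split_resid(field: str) -> Tuple[int, str]:
    """Split a resid field such as '52A' into (52, 'A')."""
    match = _RESID_RE.match(field.strip())
    if match is None:
        raise ValueError(f"unrecognised resid field: {field!r}")
    number, suffix = match.groups()
    return int(number), suffix


print(split_resid("52"))   # (52, '')
print(split_resid("52A"))  # (52, 'A')
```

Keeping the integer and the letter separate would let the integer part stay an int while the suffix is stored elsewhere, but how the suffix should be exposed is a design question for the maintainers.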
To the best of my knowledge, the only reference I could find for PSF is the CHARMM-GUI section. These PSFs with alternative conformations derive from VMD's psfgen autopsf when starting from a structure that has altlocs. Thank you!
CHARMM is now free for academic use https://academiccharmm.org/news/free-charmm so one could look at the source. The CHARMM PSF is related to the XPLOR PSF, but I could not find a file format description there either. The summary in CHARMM-GUI: Lesson 6: Protein Structure File Format might be the closest to a format description apart from the source code. It does not mention alternate conformations, though. If we were to parse psfgen alternate conformations then we should do it so that
@pipitoludovico does the trajectory contain coordinates for BOTH alternatives? (What software would produce such trajectories?)
If it's possible to get a copy of the PSF, it might be easier to work out what's going on here. A wild guess is that there's only one altloc but that the tag is retained in the PSF.
Expected behavior
Loading a PSF and a trajectory with alternate conformation inside.
Actual behavior
PSFParser expects only integers and fails when finding alternate residues:
Example: loading a PSF that includes
...
2273 P2 52 SER HG1 H 0.430000 1.0080 0
2274 P2 52 SER C C 0.510000 12.0110 0
2275 P2 52 SER O O -0.510000 15.9990 0
2276 P2 52A ALA N NH1 -0.470000 14.0070 0
2277 P2 52A ALA HN H 0.310000 1.0080 0
2278 P2 52A ALA CA CT1 0.070000 12.0110 0
...
raises a ValueError:
ValueError: Failed to construct topology from file ionized.psf with parser <class 'MDAnalysis.topology.PSFParser.PSFParser'>.
Error: invalid literal for int() with base 10: '52A'
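The failing conversion can be illustrated without MDAnalysis. The sketch below is not the PSFParser's actual code: it assumes the atom lines are whitespace-separated and that the third field is the resid (the real parser may read fixed-width columns instead). It reports which lines from the excerpt above would fail a plain `int()` conversion.

```python
# Atom lines copied from the PSF excerpt above.
psf_lines = [
    "2273 P2 52 SER HG1 H 0.430000 1.0080 0",
    "2274 P2 52 SER C C 0.510000 12.0110 0",
    "2275 P2 52 SER O O -0.510000 15.9990 0",
    "2276 P2 52A ALA N NH1 -0.470000 14.0070 0",
    "2277 P2 52A ALA HN H 0.310000 1.0080 0",
    "2278 P2 52A ALA CA CT1 0.070000 12.0110 0",
]


def resids_failing_int(lines):
    """Return (atom id, resid field) pairs whose resid is not a plain int."""
    failing = []
    for line in lines:
        fields = line.split()
        resid_field = fields[2]  # assumption: third whitespace-separated field
        try:
            int(resid_field)
        except ValueError:
            failing.append((int(fields[0]), resid_field))
    return failing


print(resids_failing_int(psf_lines))
# [(2276, '52A'), (2277, '52A'), (2278, '52A')]
```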
Code to reproduce the behavior
Current version of MDAnalysis
Which version of Python (python -V)? 3.10.11
Thanks!