ENH: add ProfileHMM[*] semantic types #328
@misialq, would you be able to review this one? Update: @colinvwood is going to try to take a pass through this and merge today, so it's in the
Hey @misialq, I didn't have time to look at this today. If you have time tomorrow to look at this then just let me know otherwise I'll plan to look at it tomorrow. Excuse all the pings 🥸
Hey @gregcaporaso, @colinvwood - sure thing, I already had a glance - there are some significant changes which I proposed to @Sann5 so please do not review yet - the contents will likely change. We'll ping you when ready, thanks! 🙏
Good to know, thanks @misialq. I converted this to a Draft pull request. Since we have the release next week, I'm going to bump this to the project board for the next release - let us know if it'll be an issue to not have this in 2024.5.
Hey @gregcaporaso, thanks! No, I don't think it's an issue if we don't have it in 2024.5. We will probably want to test it out a bit together with our new moshpit action for eggnog so we may need some more time anyway :)
Title changed: ReferenceDB[HMMER] semantic type → HMM[*] semantic types
Title changed: HMM[*] semantic types → ProfileHMM[*] semantic types
Hey @Sann5, first "superficial" review and some change suggestions below. Will look at the tests and give it a go once you update.
Hey @Sann5, what's up with the two failing tests?
@misialq I opened an issue in phammer complaining how the error thrown when loading a file with mixed profiles (DNA, RNA, Protein) was uninformative. They already fixed it and pushed the patch to conda. I will update the error handling accordingly here.
Hey @Sann5, LGTM, thanks! If it's not too much trouble, do you think you could attach here this nice table you presented once in our meeting - it may be helpful in understanding what all the formats do 🙏
@lizgehret do you think you could check this out? :)
Sure thing!
Profile HMMs
How are they used
The way they are usually used is:
One can also use profile HMMs to do sequence annotation or alignment.
How are they stored
Profile HMMs are different for different sequence types (e.g. DNA, RNA, and protein). Moreover, HMMER, the go-to software for biological sequence analysis with profile HMMs, saves profiles as text (or binary) files. One file can contain one or more profiles, each representing a group of sequences. However, no valid file can have profiles from more than one sequence type. Some HMMER programs run on files with multiple profiles, while others run on files with a single profile.
The proposal
To accommodate the different things that these profiles represent, as well as future use cases, this PR proposes the following semantic types.
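As a rough illustration of the storage rules described in this comment (one or more profiles per file, never more than one sequence type per file), here is a minimal standard-library sketch. It is not the code in this PR. It assumes the HMMER3 text format, where each profile carries an `ALPH` header line and ends with a `//` line; the function names `summarize_hmm_text` and `check_single_alphabet` are hypothetical, and binary profile files are not handled.
```python
def summarize_hmm_text(text):
    """Count profiles and collect alphabets in HMMER3-style text.

    Assumption: every profile has an 'ALPH <name>' header line and is
    terminated by a line containing only '//'.
    """
    n_profiles = 0
    alphabets = set()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "//":
            # End-of-record marker: one complete profile seen.
            n_profiles += 1
        elif stripped.startswith("ALPH"):
            parts = stripped.split()
            if len(parts) >= 2:
                alphabets.add(parts[1].lower())
    return n_profiles, alphabets


def check_single_alphabet(text):
    """Raise ValueError if the text mixes sequence types (hypothetical helper)."""
    n_profiles, alphabets = summarize_hmm_text(text)
    if len(alphabets) > 1:
        raise ValueError(
            "Profiles from more than one sequence type found: "
            + ", ".join(sorted(alphabets))
        )
    return n_profiles, alphabets
```
A caller could use the returned profile count to tell single-profile files apart from multi-profile files, which is the distinction that decides which HMMER programs a file can be used with.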
this all looks reasonable, thanks @Sann5!
Closes #327.
Adds new semantic types for profile hidden Markov models as implemented in HMMER, plus tests and test data.