Fix 3'-rule errors for certain sequence patterns #75
Codecov Report
@@ Coverage Diff @@
## master #75 +/- ##
==========================================
- Coverage 45.63% 45.60% -0.03%
==========================================
Files 16 16
Lines 2003 2004 +1
Branches 64 64
==========================================
Hits 914 914
- Misses 1025 1026 +1
Partials 64 64
Force-pushed: d8b3818 to b4be611, then b4be611 to c614b8e.
(if (= alt \*)
  (protein-substitution (+ ppos offset) (str ref) (str alt)) ; eventually fs-ter-substitution
📝 In certain cases, applying the 3'-rule to the protein sequence turns a frameshift into an fs-ter-substitution, so I have added the fs-ter-substitution check here.
Indeed, such cases can occur, as you said, though they are extremely rare.
Do you have an example? If you already have one, I would like you to add it as a test case.
It is no problem if you do not. I will approve this PR either way, because I think preparing such a test case is a bit difficult.
Thanks for pointing this out! I think my REPL history still contains the details. I will try to retrieve it.
Sorry, after the backward-shift fix, the variant query I tried no longer appears to reach this code path. 🤔
Instead, I have added test cases for varity.vcf-to-hgvs.protein/mutation in 1854562, which include inputs both before and after the 3'-rule is applied to the coding DNA. The 3'-rule for the protein sequence and the fs-ter-substitution check work correctly in both cases.
Okay, thank you very much. The test case using varity.vcf-to-hgvs.protein/mutation looks good to me.
Thank you for the fix! LGTM👍
Thank you for fixing the 3'-rule. The implementation looks good to me. I have left just one comment, about a test case for fs-ter-substitution after the 3'-rule.
Force-pushed: 6b2ef4c to 1854562.
Resolve #72 and #74.
This PR addresses the problem of 3' shifting reported in issue #74 and the out-of-bounds exception reported in issue #72.
I have made some refinements to varity.vcf-to-hgvs. Specifically, I have fixed the varity.vcf-to-hgvs.common/backward-shift function to work correctly for certain sequence patterns and added error handling to varity.vcf-to-hgvs.common/apply-3'-rule for rare cases.
Additionally, I have fixed the exon boundary handling in varity.vcf-to-hgvs.protein/read-sequence-info, which was implemented in #68.
The added test cases originate from #71 and #73 by @nokara26. Thanks! ❤️
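For readers unfamiliar with the 3'-rule mentioned above, here is a minimal sketch of the general idea for a deletion inside a repeated region: the deletion is moved toward the 3' end for as long as the result is an equivalent sequence. This is not varity's implementation. The function shift-deletion-right and its arguments are hypothetical, and the bounds check only illustrates the kind of guard such a shift needs; it is not claimed to be the fix for #72.

```clojure
;; Hypothetical helper, for illustration only (not part of varity).
(defn shift-deletion-right
  "Returns the 0-based start of the 3'-most deletion equivalent to deleting
  `len` bases from `ref-seq` starting at 0-based index `start`."
  [ref-seq start len]
  (loop [s start]
    (let [e (+ s len)]                     ; index of the first base after the deletion
      (if (and (< e (count ref-seq))       ; stop at the end of the sequence
               (= (nth ref-seq s) (nth ref-seq e)))
        ;; The base leaving the window equals the base entering it,
        ;; so shifting by one gives the same resulting sequence.
        (recur (inc s))
        s))))

;; Deleting one T from the T-run in GATTTTC (first T is at index 2)
(shift-deletion-right "GATTTTC" 2 1)   ; => 5

;; Deleting CG from the CG repeat in ACGCGCGT
(shift-deletion-right "ACGCGCGT" 1 2)  ; => 5
```

In both examples, deleting at the returned index produces the same sequence as deleting at the original index; only the reported position changes.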