Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lcaid and lcataxid options fail #119

Closed
mbrockman1 opened this issue Apr 11, 2018 · 5 comments
Closed

lcaid and lcataxid options fail #119

mbrockman1 opened this issue Apr 11, 2018 · 5 comments
Assignees
Labels
Milestone

Comments

@mbrockman1
Copy link

Versions:

Linux 4.13.0-38-generic #43~16.04.1-Ubuntu SMP Wed Mar 14 17:48:43 UTC 2018
x86_64 x86_64 x86_64 GNU/Linux

GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)

lambda2 version: 1.9.4
SeqAn version: 2.4.0

After running:

lambda2 searchp --query pro_test.fa --index ./protein.lambda/ --output-columns 'std lcataxid'

I receive:

Reading index properties... done.
Detecting query alphabet... aminoacid detected.
Checking memory requirements... met.
Loading Subj Sequences... done.
Loading Subj Ids... done.
Loading Database Index... done.
Loading Subject Taxonomy IDs... done.
Loading Subject Taxonomic Tree... done.
done.
Loading Query Sequences and Ids... done.
Searching and extending hits on-line...progress:
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
|...~/lambda/src/search_misc.hpp:272 FAILED! (One of the paths didn't lead to root.)

stack trace:
0 [0x4c4a89] lambda2()
1 [0x47b306] lambda2()
2 [0x6662a8] lambda2()
3 [0x55680b] lambda2()
4 [0x7f5dbd0c9cbf] GOMP_parallel + 0x3f
5 [0x5ebbd8] lambda2()
6 [0x4c9436] argConv0(LambdaOptions&) + 0xfa6
7 [0x4e470a] searchMain(int, char const**) + 0x31a
8 [0x44d632] main + 0x1e2
9 [0x7f5dbdfbd830] __libc_start_main + 0xf0
10 [0x44e229] _start + 0x29

Aborted (core dumped)

@h-2
Copy link
Member

h-2 commented Apr 11, 2018

Thanks for your detailed bug-report!

Can you make the files available to me for debugging? This doesn't need to be on GitHub in case the data sensitive, but it would really help! This is my email-address: hannes.hauswedell@fu-berlin.de

@mbrockman1
Copy link
Author

mbrockman1 commented Apr 11, 2018

I used the master branch to build from source.

I am using a list of data sensitive sequences. It seems as if one sequence is messing it up for the rest. I need to do further investigation.

The database I used was Uniprot Sprot 20171205.

Maybe it is because I am using a significantly small database, compared to the larger ones.

@h-2
Copy link
Member

h-2 commented May 9, 2018

Maybe it is because I am using a significantly small database, compared to the larger ones.

No, I am pretty sure it's a bug in my routines for compressing the lca-tree. I will be on vacation for the next two weeks, but I will look into it immediately after that. Sorry it has taken me so long to respond to this.

@h-2 h-2 self-assigned this May 9, 2018
@h-2 h-2 added the bug label May 9, 2018
@h-2 h-2 added this to the 1.9.5 milestone May 9, 2018
@h-2
Copy link
Member

h-2 commented May 25, 2018

I have reproduced the issue and this patch fixes it for me:

--- a/src/search_algo.hpp
+++ b/src/search_algo.hpp
@@ -1387,7 +1387,7 @@ _writeRecord(TBlastRecord & record,
             record.lcaTaxId = 0;
             for (auto const & bm : record.matches)
             {
-                if (length(lH.gH.sTaxIds[bm._n_sId]) > 0)
+                if ((length(lH.gH.sTaxIds[bm._n_sId]) > 0) && (lH.gH.taxParents[lH.gH.sTaxIds[bm._n_sId][0]] != 0))
                 {
                     record.lcaTaxId = lH.gH.sTaxIds[bm._n_sId][0];
                     break;
@@ -1397,7 +1397,7 @@ _writeRecord(TBlastRecord & record,
             if (record.lcaTaxId != 0)
                 for (auto const & bm : record.matches)
                     for (uint32_t const sTaxId : lH.gH.sTaxIds[bm._n_sId])
-                        if (sTaxId != 0) // TODO do we want to skip unassigned subjects
+                        if (lH.gH.taxParents[sTaxId] != 0) // TODO do we want to skip unassigned subjects
                             record.lcaTaxId = computeLCA(lH.gH.taxParents, lH.gH.taxHeights, sTaxId, record.lcaTaxId);

             record.lcaId = lH.gH.taxNames[record.lcaTaxId];

Could you give it a try and report back?

Thanks a lot!

@h-2
Copy link
Member

h-2 commented May 28, 2018

This is now also in the master branch and will soon be released as 1.9.5. Please re-open this issue if it still doesn't work for you!

@h-2 h-2 closed this as completed May 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants