checkv database version #73
Comments
Hi @apcamargo!
Hi @efratmuller, thank you! Although I should say that @snayfach did most of the heavy lifting to generate the data resource :) Any reason you are using version 1.0 of the database? The latest version is 1.5, and we used version 1.4 to get the estimates that are listed in the metadata file.
Thanks @apcamargo for your quick reply! I have re-run CheckV with db version 1.4, but for some reason I'm still getting weird discrepancies, and I was wondering whether you have any idea what the reason could be. A few concrete examples:
(1) "UHGV-0889122" (representative of vOTU-155218) is reported to have an estimated completeness of 76.09 (as listed in your metadata file), but when I ran CheckV I got a completeness estimate of 43.07. The overall quality category in your metadata was "medium quality" while mine came out "low-quality". Genome lengths and cds_count were the same (just as a sanity check). Notably, the "checkv completeness method" in my run was "HMM-based (lower-bound)" and in yours it was "AAI-based".
(2) "UHGV-0404930" (rep of vOTU-052718) has 100% completeness in your metadata (quality = "Complete"), but only 82.72% completeness in my run (quality = "Medium-quality"). Again, genome length and cds_count are the same. The completeness method in my run was "AAI-based (high-confidence)" and in your metadata table it is "DTR".
Overall, ~25% of the genomes (in the MQ version) seem to have different quality categories in your CheckV run vs. mine. Any help figuring out what we are doing differently will be greatly appreciated!! Many thanks in advance (and Merry Christmas),
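For anyone who wants to run the same kind of comparison, here is a minimal sketch of how the quality categories from a metadata table and from a local CheckV run could be tallied against each other. It is not the script used in this thread. The rerun column names (`contig_id`, `checkv_quality`, `completeness_method`) follow CheckV's `quality_summary.tsv`, while the metadata column names (`genome_id`, `checkv_completeness_method`) and the helper names `load_quality` and `compare_quality` are assumptions for illustration. The inline rows only restate the two genomes quoted above.

```python
import csv
import io
from collections import Counter


def load_quality(handle, id_col, quality_col, method_col):
    """Map genome ID -> (normalized quality tier, completeness method)."""
    table = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        # Normalize "Medium-quality" and "medium quality" to the same label.
        tier = row[quality_col].strip().lower().replace("-", " ")
        table[row[id_col]] = (tier, row[method_col].strip())
    return table


def compare_quality(reference, rerun):
    """Return genomes whose tier differs, plus a tally of (old, new) tiers."""
    shared = sorted(reference.keys() & rerun.keys())
    mismatches = {
        genome: (reference[genome], rerun[genome])
        for genome in shared
        if reference[genome][0] != rerun[genome][0]
    }
    transitions = Counter((ref[0], new[0]) for ref, new in mismatches.values())
    return mismatches, transitions


# Hypothetical metadata layout (column names are assumptions).
METADATA_TSV = (
    "genome_id\tcheckv_quality\tcheckv_completeness_method\n"
    "UHGV-0889122\tmedium quality\tAAI-based\n"
    "UHGV-0404930\tComplete\tDTR\n"
)
# Layout modeled on CheckV's quality_summary.tsv (only the columns used here).
RERUN_TSV = (
    "contig_id\tcheckv_quality\tcompleteness_method\n"
    "UHGV-0889122\tLow-quality\tHMM-based (lower-bound)\n"
    "UHGV-0404930\tMedium-quality\tAAI-based (high-confidence)\n"
)

reference = load_quality(
    io.StringIO(METADATA_TSV), "genome_id", "checkv_quality",
    "checkv_completeness_method",
)
rerun = load_quality(
    io.StringIO(RERUN_TSV), "contig_id", "checkv_quality", "completeness_method",
)
mismatches, transitions = compare_quality(reference, rerun)

for genome, (ref, new) in mismatches.items():
    print(f"{genome}: {ref[0]} ({ref[1]}) -> {new[0]} ({new[1]})")
```

With real files, the `io.StringIO` handles would be replaced by `open(path, newline="")`. Printing the completeness method next to each tier makes it easier to see whether a disagreement comes from a different estimation method rather than a different input genome.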
Hi @efratmuller, sorry for the late reply (and thanks for all the details!) It looks like the discrepancy arises because the complete genomes of UHGV were added to the CheckV database before the completeness of the other genomes was estimated. I’ll update the CheckV database before the preprint is out to make sure everything is reproducible. In the meantime, you can set up a custom database on your end if you need it.
Thanks @apcamargo, I appreciate the clarification!
Could you tell me which CheckV database version was used? I found that the results also differ between CheckV database versions.
Thank you!