Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] add picklists to selector protocol and provide initial Index support #1588

Merged
merged 47 commits into from
Jun 18, 2021

Conversation

ctb
Copy link
Contributor

@ctb ctb commented Jun 12, 2021

This PR adds picklists (#1587) into the selector protocol so that Index.signatures(...) and Index.find(...) both respect picklists, and also introduces it throughout the command-line interface.

In detail, this PR:

  • adjusts Index.select(...) to support a picklist keyword argument and adds picklist support to the main Index subclasses;
  • adds picklist support to SBT.select(...) and LCA_Database.select(...) as well;
  • adds picklist support to load_file_as_signatures(...)
  • refactors sourmash sig extract to pass its picklist into load_file_as_signatures(...), rather than do ad hoc picklisting;
  • adds --picklist support to compare, gather, index, lca index, prefetch, and search

Note, I decided against adding picklists to lca summarize and lca classify for now, since I think we may be end-of-life-ing LCA databases soon.

This does not implement any optimizations for picklist-intersected database searches yet, but they are coming - see #1590!

TODO:

  • add docs
  • verify that prefetch CSV can be used as a picklist
  • verify that 'index' with picklist == 'index' then 'search' with picklist

@codecov
Copy link

codecov bot commented Jun 12, 2021

Codecov Report

Merging #1588 (8e5fb8d) into add/picklist (14b87d4) will decrease coverage by 0.17%.
The diff coverage is 66.66%.

Impacted file tree graph

@@               Coverage Diff                @@
##           add/picklist    #1588      +/-   ##
================================================
- Coverage         89.34%   89.16%   -0.18%     
================================================
  Files                76       76              
  Lines              6717     6739      +22     
  Branches           1198     1207       +9     
================================================
+ Hits               6001     6009       +8     
- Misses              507      514       +7     
- Partials            209      216       +7     
Flag Coverage Δ
python 89.16% <66.66%> (-0.18%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/sourmash/sig/picklist.py 87.17% <40.00%> (-7.35%) ⬇️
src/sourmash/lca/lca_db.py 90.21% <60.00%> (-1.01%) ⬇️
src/sourmash/sbt.py 80.75% <63.63%> (-0.30%) ⬇️
src/sourmash/index.py 99.03% <100.00%> (+<0.01%) ⬆️
src/sourmash/sig/__main__.py 91.64% <100.00%> (-0.04%) ⬇️
src/sourmash/sourmash_args.py 93.71% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 14b87d4...8e5fb8d. Read the comment docs.

@ctb ctb changed the title [EXP] add picklists to selector protocol and provide initial Index support [WIP] add picklists to selector protocol and provide initial Index support Jun 17, 2021
@codecov-commenter
Copy link

codecov-commenter commented Jun 17, 2021

Codecov Report

Merging #1588 (565428b) into latest (a2d438d) will increase coverage by 0.10%.
The diff coverage is 99.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           latest    #1588      +/-   ##
==========================================
+ Coverage   81.19%   81.29%   +0.10%     
==========================================
  Files         103      103              
  Lines       10423    10485      +62     
  Branches     1198     1217      +19     
==========================================
+ Hits         8463     8524      +61     
- Misses       1751     1753       +2     
+ Partials      209      208       -1     
Flag Coverage Δ
python 89.42% <99.11%> (+0.08%) ⬆️
rust 66.43% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/sourmash/commands.py 86.46% <95.23%> (+0.21%) ⬆️
src/sourmash/cli/compare.py 100.00% <100.00%> (ø)
src/sourmash/cli/gather.py 100.00% <100.00%> (ø)
src/sourmash/cli/index.py 100.00% <100.00%> (ø)
src/sourmash/cli/lca/index.py 100.00% <100.00%> (ø)
src/sourmash/cli/prefetch.py 100.00% <100.00%> (ø)
src/sourmash/cli/search.py 100.00% <100.00%> (ø)
src/sourmash/cli/sig/extract.py 100.00% <100.00%> (ø)
src/sourmash/cli/utils.py 100.00% <100.00%> (ø)
src/sourmash/index.py 99.03% <100.00%> (+<0.01%) ⬆️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a2d438d...565428b. Read the comment docs.

@ctb ctb changed the title [WIP] add picklists to selector protocol and provide initial Index support [MRG] add picklists to selector protocol and provide initial Index support Jun 17, 2021
@ctb
Copy link
Contributor Author

ctb commented Jun 17, 2021

Ready for review @bluegenes. There's a few minor issues to do with our test infrastructure that I'll deal with before considering merge, but the PR is feature complete ;)

@ctb
Copy link
Contributor Author

ctb commented Jun 17, 2021

ok, tests are actually passing now, too! Ready for review and maybe merge!

Copy link
Contributor

@bluegenes bluegenes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! looking forward to actually playing around with picklists soon :)

@ctb ctb merged commit 74de59a into latest Jun 18, 2021
@ctb ctb deleted the add/picklist_selectors branch June 18, 2021 02:23
@ctb
Copy link
Contributor Author

ctb commented Jun 18, 2021

thank you!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants