[MRG] substantial refactoring of CounterGather and related Index code. #2116

Merged
merged 45 commits into from
Jul 16, 2022
Changes from 42 commits
Commits
241dbc5
move most CounterGather tests over to index protocol tests
ctb Jul 8, 2022
66490a4
add LinearIndex wrapper
ctb Jul 8, 2022
ebb00ea
getting closer
ctb Jul 8, 2022
a8a4dd9
fix a bunch of the tests
ctb Jul 8, 2022
fdc8d4f
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Jul 8, 2022
ba114dd
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Jul 9, 2022
b444a68
fix call to 'peek'
ctb Jul 9, 2022
f87c9d4
adjust 'counter.add' call signature
ctb Jul 9, 2022
68458cf
add CounterGather_LCA
ctb Jul 9, 2022
b835c96
move CounterGather.calc_threshold into search.py
ctb Jul 9, 2022
1903920
minor refactoring
ctb Jul 9, 2022
5099d5a
resolve downsampling for linear index wrapper
ctb Jul 9, 2022
a8125b4
fix downsampling for LCA-based CounterGather
ctb Jul 9, 2022
1760ada
fix location foo
ctb Jul 9, 2022
5c9748a
fix remaining test
ctb Jul 9, 2022
c2d2637
minor cleanup
ctb Jul 10, 2022
6f9eb78
add doc
ctb Jul 10, 2022
f82e1d7
test multiple identical matches
ctb Jul 10, 2022
d9472ed
adjust LinearIndex implementation to skip identical matches
ctb Jul 10, 2022
3e1c1ae
switch to dictionaries for CounterGather
ctb Jul 11, 2022
4c14e01
cleanup protocol tests
ctb Jul 11, 2022
3df8c66
revert LCA_Database fix
ctb Jul 11, 2022
36d4c2c
Merge branch 'latest' into refactor/counter_gather_tests
ctb Jul 11, 2022
846c0ba
Merge branch 'refactor/counter_gather_tests' into update/counter_gather
ctb Jul 11, 2022
39835fc
restore CounterGather_LCA
ctb Jul 11, 2022
1a4e01b
cleanup
ctb Jul 11, 2022
ee0fd18
Merge branch 'refactor/counter_gather_tests' of https://github.com/so…
ctb Jul 11, 2022
dbabfe9
Merge branch 'latest' into refactor/counter_gather_tests
ctb Jul 11, 2022
b7c37bd
Merge branch 'refactor/counter_gather_tests' into update/counter_gather
ctb Jul 11, 2022
a676a69
Merge branch 'refactor/counter_gather_tests' into update/counter_gather
ctb Jul 11, 2022
402dbc6
fix or ignore most errors ;)
ctb Jul 11, 2022
0e4ca95
rename make_gather_query to make_containment_query
ctb Jul 12, 2022
9f7a20e
rename Index.gather to Index.best_containment
ctb Jul 12, 2022
cb2efd7
consolidate threshold_bp => threshold calc code
ctb Jul 12, 2022
22aa74c
change best_containment to return None or a result object, not a list
ctb Jul 12, 2022
e9022c7
add flatten_and_* utility functions
ctb Jul 13, 2022
2ace44f
add .signatures() method to CounterGather class
ctb Jul 13, 2022
430ef9d
change CounterGather to take SourmashSignature instead of Minhash
ctb Jul 13, 2022
f97b8e8
fix test_index tests for counter
ctb Jul 13, 2022
c6078a6
Merge branch 'latest' into refactor/counter_gather_tests
ctb Jul 13, 2022
db87d5e
lightly clean up LCA_Database based counter
ctb Jul 13, 2022
889e731
Merge branch 'refactor/counter_gather_tests' into update/counter_gather
ctb Jul 13, 2022
b5e497d
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Jul 16, 2022
f8e2edc
add comment and test re duplicate signatures, per @bluegenes
ctb Jul 16, 2022
61624fc
fix typo
ctb Jul 16, 2022
6 changes: 3 additions & 3 deletions src/sourmash/commands.py
@@ -733,9 +733,9 @@ def gather(args):
     else:
         raise # re-raise other errors, if no picklist.

-    save_prefetch.add_many(counter.siglist)
+    save_prefetch.add_many(counter.signatures())
     # subtract found hashes as we can.
-    for found_sig in counter.siglist:
+    for found_sig in counter.signatures():
         noident_mh.remove_many(found_sig.minhash)

     # optionally calculate and save prefetch csv
@@ -935,7 +935,7 @@ def multigather(args):
     counters = []
     for db in databases:
         counter = db.counter_gather(prefetch_query, args.threshold_bp)
-        for found_sig in counter.siglist:
+        for found_sig in counter.signatures():
             noident_mh.remove_many(found_sig.minhash)
         counters.append(counter)
