Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Updating sourmash multigather to save hash abundances to .unassigned.sig #1720

Merged
merged 11 commits into from
Jan 17, 2022

Conversation

keyabarve
Copy link
Contributor

Fixes #1707

TODO:

  • Start by writing a test for multigather based on the gather tests for inflation, and make sure that fails on multigather;
  • Fix multigather's implementation to pass the test by using the inflate method

@codecov
Copy link

codecov bot commented Aug 18, 2021

Codecov Report

Merging #1720 (5b2d7aa) into latest (61e9534) will increase coverage by 6.75%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           latest    #1720      +/-   ##
==========================================
+ Coverage   83.42%   90.18%   +6.75%     
==========================================
  Files         113       87      -26     
  Lines       12154     8458    -3696     
  Branches     1620     1621       +1     
==========================================
- Hits        10140     7628    -2512     
+ Misses       1754      572    -1182     
+ Partials      260      258       -2     
Flag Coverage Δ
python 90.18% <100.00%> (+0.08%) ⬆️
rust ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/sourmash/cli/multigather.py 100.00% <100.00%> (ø)
src/sourmash/commands.py 88.22% <100.00%> (+0.92%) ⬆️
src/core/src/index/storage.rs
src/core/src/ffi/hyperloglog.rs
src/core/tests/test.rs
src/core/src/ffi/utils.rs
src/core/src/ffi/minhash.rs
src/core/tests/minhash.rs
src/core/src/index/sbt/mod.rs
src/core/src/sketch/nodegraph.rs
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 61e9534...5b2d7aa. Read the comment docs.

@keyabarve
Copy link
Contributor Author

@ctb Could you please provide some guidance on how to get started?

@ctb
Copy link
Contributor

ctb commented Aug 21, 2021

hi @keyabarve the first bullet point is:

  • Start by writing a test for multigather based on the gather tests for inflation, and make sure that fails on multigather;

can you identify a test or tests that you would start by copying and editing to work with multigather intead of gather?

@keyabarve
Copy link
Contributor Author

hi @keyabarve the first bullet point is:

  • Start by writing a test for multigather based on the gather tests for inflation, and make sure that fails on multigather;

can you identify a test or tests that you would start by copying and editing to work with multigather intead of gather?

Do you mean from the minhash tests in tests/test_minhash.py? I couldn't find any specific tests there that use the gather command.

@ctb
Copy link
Contributor

ctb commented Aug 22, 2021

ISTR that you had a number of failing tests wrt sourmash gather when you were working on adding the inflate method to MinHash. Maybe choose one of those? I think there's even one with abund and unassigned in the name which would probably be the one to use here.

@keyabarve
Copy link
Contributor Author

Do you mean this test: tests/test_sourmash.py::test_gather_output_unassigned_with_abundance?

@ctb
Copy link
Contributor

ctb commented Aug 23, 2021 via email

@keyabarve
Copy link
Contributor Author

@ctb I have written the function for multigather and it fails. What should be my next step? To do this: "Fix multigather's implementation to pass the test by using the inflate method", should I fix the implementation in def multigather in src/sourmash/commands.py?

@keyabarve
Copy link
Contributor Author

@bluegenes Could you provide some suggestions on this?

@bluegenes
Copy link
Contributor

should I fix the implementation in def multigather in src/sourmash/commands.py?

yes

@keyabarve
Copy link
Contributor Author

@ctb I have added the code in the multigather function but the test I wrote still fails. I'm not sure how to fix that and whether what I'm doing is correct. Could you provide any hints/suggestions?

tests/test_sourmash.py Outdated Show resolved Hide resolved
tests/test_sourmash.py Outdated Show resolved Hide resolved
@keyabarve
Copy link
Contributor Author

@ctb I have changed the command, but the test still fails. I can't seem to figure out what the issue is. Could you provide any hints/suggestions?

@ctb
Copy link
Contributor

ctb commented Sep 18, 2021

Hi @keyabarve the tests now pass. What's next?

@keyabarve
Copy link
Contributor Author

I was thinking of writing a few more tests, maybe specifically those which use the inflate method.

@keyabarve keyabarve changed the title [WIP] Updating sourmash multigather to save hash abundances to .unassigned.sig [MRG] Updating sourmash multigather to save hash abundances to .unassigned.sig Nov 20, 2021
@keyabarve
Copy link
Contributor Author

@ctb @mr-eyes I believe this is ready to review and merge?

@ctb
Copy link
Contributor

ctb commented Nov 20, 2021

I was thinking of writing a few more tests, maybe specifically those which use the inflate method.

just to confirm - did you end up doing this, or deciding it wasn't needed?

@ctb
Copy link
Contributor

ctb commented Nov 20, 2021

note that the failing test is not your fault, @keyabarve - it's the one that has been failing on latest due to a Python version upgrade. Still can't replicate it, will work on it over the next week or two.

If you decide not to add any more tests for inflate as part of this PR, please add a new issue referencing the changes in this PR saying "add additional tests to MinHash.inflate(...)". thanks!

src/sourmash/minhash.py Outdated Show resolved Hide resolved
@ctb ctb merged commit 91ed12f into latest Jan 17, 2022
@ctb ctb deleted the KB_1707 branch January 17, 2022 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

update sourmash multigather to save hash abundances to .unassigned.sig
4 participants