FIX: remove original sequences after gzipping #152

Merged

misialq
Collaborator

@misialq misialq commented Jan 25, 2023

Closes #151.
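
For readers without the diff at hand, the idea named in the title can be sketched roughly as below. This is a minimal illustration under assumptions, not the actual change in q2_fondue/sequences.py: the helper name `gzip_and_remove` is hypothetical, and it assumes each sequence file is compressed individually.

```python
import gzip
import os
import shutil


def gzip_and_remove(path: str) -> str:
    """Compress `path` to `path + '.gz'`, then delete the uncompressed file.

    Hypothetical helper for illustration only; not the q2-fondue code.
    """
    gz_path = f"{path}.gz"
    # Stream the file through gzip so large FASTQ files are never
    # loaded into memory in one piece.
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Remove the original only after the compressed copy is fully written,
    # so TMPDIR never holds both copies for longer than needed.
    os.remove(path)
    return gz_path
```

Deleting each original right after its compressed copy is closed keeps the peak disk usage close to the compressed size plus one uncompressed file, rather than the sum of both.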

To test, fetch a sizeable dataset and check the size of TMPDIR towards the end of the sequence fetch (but before the action finishes). With the changes in this PR, it should be significantly smaller than on main.
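
One way to take that measurement is a small script run in a second terminal while the fetch is in progress. This is only a sketch: the function name `dir_size_gb` is made up here, and it assumes the temporary files live under the directory given by the `TMPDIR` environment variable (falling back to Python's default temp directory).

```python
import os
import tempfile


def dir_size_gb(root: str) -> float:
    """Return the total size of regular files under `root` in GB (10^9 bytes)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if os.path.islink(file_path):
                continue  # do not count symlink targets
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # Files can disappear mid-walk while the fetch is running.
                pass
    return total / 1e9


if __name__ == "__main__":
    tmp_root = os.environ.get("TMPDIR", tempfile.gettempdir())
    print(f"{tmp_root}: {dir_size_gb(tmp_root):.2f} GB")
```

Running it once on main and once on this branch, at a comparable point in the fetch, gives the two numbers to compare.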

@misialq misialq requested a review from adamovanja January 25, 2023 10:04

codecov bot commented Jan 25, 2023

Codecov Report

Merging #152 (2af7060) into main (704bd98) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #152   +/-   ##
=======================================
  Coverage   98.62%   98.62%           
=======================================
  Files          29       29           
  Lines        2974     2980    +6     
=======================================
+ Hits         2933     2939    +6     
  Misses         41       41           
Impacted Files                     Coverage Δ
q2_fondue/sequences.py             98.48% <100.00%> (+0.01%) ⬆️
q2_fondue/tests/test_get_all.py    98.68% <100.00%> (+0.07%) ⬆️

Contributor

@adamovanja adamovanja left a comment

Thanks for fixing this @misialq ⭐. The code changes look good to me.

The TMPDIR usage also decreased significantly. Here is the space occupied in TMPDIR (in GB) when testing get-sequences for accession ID PRJEB11697:

# fetched run IDs    MAIN (GB)    FIX (GB)
42                   3.15         0.52
76                   5.27         0.86
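
To put those numbers in relative terms, the reduction can be worked out with a few lines of Python. The figures are taken from the measurements above; the function name `reduction_pct` is only illustrative.

```python
# TMPDIR usage in GB, keyed by number of fetched run IDs: (main, fix)
measurements = {42: (3.15, 0.52), 76: (5.27, 0.86)}


def reduction_pct(before_gb: float, after_gb: float) -> float:
    """Percentage by which `after_gb` is smaller than `before_gb`."""
    return (before_gb - after_gb) / before_gb * 100


for n_runs, (main_gb, fix_gb) in measurements.items():
    print(f"{n_runs} runs: {reduction_pct(main_gb, fix_gb):.1f}% smaller")
# 42 runs: 83.5% smaller
# 76 runs: 83.7% smaller
```

In both cases the fix leaves roughly one sixth of the space used on main.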

@misialq misialq merged commit 71727c1 into bokulich-lab:main Jan 26, 2023
@misialq misialq deleted the issue-151-remove-tmp-seqs-after-download branch January 26, 2023 11:59
Successfully merging this pull request may close these issues.

Delete raw sequences after compression
2 participants