Output for multiple Input file with same filename is confusing for different SeqFilter runs #12

Closed
greatfireball opened this issue Feb 12, 2019 · 4 comments

Comments

@greatfireball
Member

greatfireball commented Feb 12, 2019

Because only the bare filename is printed in the output table when SeqFilter is called with a single input file, the output becomes confusing when SeqFilter is run several times, each time with a single input file that has the same filename.

Currently I am comparing different assemblies. To get results faster, I am running multiple SeqFilter calls. Each assembly's output file is named transcripts.fasta:

mkdir run_a run_b && \
echo -e ">blub\nACGT" >run_a/transcripts.fasta && \
echo -e ">bla\nACGT" >run_b/transcripts.fasta

If I use a single SeqFilter call, a unique path is printed for each file:

SeqFilter run_a/transcripts.fasta run_b/transcripts.fasta 2>/dev/null | column -t
#source                  state  reads  bases  max  min  N50  N90
run_a/transcripts.fasta  RAW    1      15     15   15   15   15
run_b/transcripts.fasta  RAW    1      15     15   15   15   15
TOTAL                    RAW    2      30     15   15   15   15

but two separate SeqFilter calls result in an ambiguous list:

(SeqFilter run_a/transcripts.fasta 2>/dev/null; \
SeqFilter run_b/transcripts.fasta 2>/dev/null) | column -t | sed '2,$s/^#.*$//' | sed '/^$/d'
#source            state  reads  bases  max  min  N50  N90
transcripts.fasta  RAW    1      15     15   15   15   15
transcripts.fasta  RAW    1      15     15   15   15   15

Tested with SeqFilter 2.1.7 and 2.1.9; column and sed commands are not required and will only reformat the output.

The missing TOTAL line is expected, but unfortunately it is hard to tell which row belongs to which input file.

This might be solved by:

  1. printing the whole filename parameter as given, instead of only the file name
  2. enabling printing of whole filenames via a command line switch
  3. something different :)
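
As one possible shape for "something different", the ambiguity can also be worked around on the caller side by rewriting the source column after each run. The following is only a minimal sketch: `relabel_source` is a hypothetical helper, not part of SeqFilter, and it assumes that the table is whitespace-separated with the source label in the first column, as in the output shown above.

```shell
# relabel_source LABEL
# Hypothetical helper (not part of SeqFilter): reads a summary table on stdin
# and rewrites the first column of every data row to LABEL.
# Assumption: the first whitespace-separated field is the source label.
relabel_source() {
    awk -v label="$1" '
        BEGIN { OFS = "\t" }        # rebuilt rows are tab-separated
        /^#/  { print; next }       # header lines pass through unchanged
        NF    { $1 = label; print } # replace the source column on data rows
    '
}

# Intended use, one call per input file (requires SeqFilter on the PATH):
#   for f in run_a/transcripts.fasta run_b/transcripts.fasta; do
#       SeqFilter "$f" 2>/dev/null | relabel_source "$f"
#   done | column -t
```

Each run still emits its own header line, so the sed filtering from the example above would still be needed to drop the repeated headers.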
@thackl
Contributor

thackl commented Feb 12, 2019

The feature was actually already implemented, but I forgot to document it: --no-smart-labels

@greatfireball
Member Author

Thanks @thackl You are the best!

@thackl
Contributor

thackl commented Feb 12, 2019

Ah, I also got confused. I already added minimal documentation for #12 in 7933dc5. Do you think it needs more? Since it was only a doc fix, I didn't think about bumping the version, but #7 warrants a bump, I guess.

@greatfireball
Member Author

Agree... See #14 for further discussion :)

greatfireball added a commit that referenced this issue Feb 12, 2019
Fix #14 

Includes a fix for #7 and #12