add find/unpigz module #7383
LGTM, would be good to get a second 👀 again, maybe from @maxulysse as he looked at the other?
Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
We might be able to trim this. Let me have a think.
I think this is over-complicating things a lot.
Can files be staged in a directory or is this breaking things?
See if this works for your use case:
process TASK {
    input:
    path files_in, stageAs: 'gzipped/*'

    // Declared before script: Nextflow expects the script block to come last.
    output:
    path "${prefix}.*"

    script:
    prefix = task.ext.prefix ?: 'meta'
    def args = task.ext.args ?: ''
    // Reject any input that does not carry the .gz extension.
    if( files_in.any{ file -> !file.name.endsWith('.gz') } ) {
        error("All files provided to this module must be gzipped (and have the .gz extension).")
    }
    // Guard against inputs whose names start with the output prefix.
    if( files_in.any{ file -> file.baseName.startsWith(prefix) } ) {
        error("No input files can start with the same name as the output prefix in the module FIND_UNPIGZ (currently '${prefix}'). Please choose a different one.")
    }
    """
    while IFS= read -r -d \$'\\0' file; do
        unpigz \\
            ${args} \\
            -cd \\
            --processes ${task.cpus} \\
            "\$file" \\
            > "${prefix}.\$( basename "\$file" .gz )"
    done < <( find gzipped/ -name '*.gz' -print0 )
    """
}
Edit: I just noticed there's a problem with file.name in this context. It's not stripping the directory name like it should to check the prefix. Use baseName instead.
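Outside of Nextflow, the loop in the suggestion above boils down to the Bash below. This is a minimal sketch, not the module itself. The temporary directory, the file names and the `meta` prefix are made up for illustration, and `gzip -cd` stands in for `unpigz -cd` so that it runs where pigz is not installed.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch area standing in for the Nextflow task directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir gzipped

# Two small gzipped inputs, one with a space in its name.
printf 'alpha\n' | gzip -c > 'gzipped/a.txt.gz'
printf 'beta\n'  | gzip -c > 'gzipped/b c.txt.gz'

prefix='meta'   # assumed value of task.ext.prefix

# find emits NUL-terminated paths and read -d '' consumes them one at a
# time, so no command line ever carries the full list of files.
# In Bash, -d '' behaves the same as the $'\0' used in the suggestion.
while IFS= read -r -d '' file; do
    # gzip -cd is a stand-in for: unpigz -cd --processes N
    gzip -cd "$file" > "${prefix}.$(basename "$file" .gz)"
done < <(find gzipped/ -name '*.gz' -print0)

ls -1 "${prefix}".*
```

Quoting `"$file"` is what lets the name containing a space survive. The NUL delimiter alone only protects the `read` step.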
Thanks for the suggestions @mahesh-panchal, that's far more elegant!
Co-authored-by: Mahesh Binzer-Panchal <mahesh.binzer-panchal@nbis.se>
Add a module for decompressing a large number of gzipped files, theoretically without hitting the shell's command-line argument limit
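For context, the limit in question is the system's cap on the combined size of the arguments and environment passed to a new process, which `getconf ARG_MAX` reports. The sketch below uses made-up file names in a temporary directory. It compares that cap with the bytes a `*.gz` glob would place on one command line, then counts the files through a NUL-delimited stream that never builds such a command line.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory; nothing here comes from the module.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/gzipped"

# 200 empty placeholder files; the names are illustrative only.
for i in $(seq 1 200); do
    : > "$demo_dir/gzipped/sample_${i}.fastq.gz"
done

# Upper bound (bytes) for arguments plus environment of one exec call.
arg_max=$(getconf ARG_MAX)

# Bytes the expanded paths would occupy as arguments (each path + NUL).
path_bytes=$(find "$demo_dir/gzipped" -name '*.gz' -print0 | wc -c | tr -d ' ')

# Count files by counting the NUL terminators in the stream.
n_files=$(find "$demo_dir/gzipped" -name '*.gz' -print0 | tr -cd '\0' | wc -c | tr -d ' ')

echo "ARG_MAX=${arg_max} path_bytes=${path_bytes} n_files=${n_files}"
```

With a few hundred short names the total stays far below the cap. The streamed approach matters once `path_bytes` approaches `ARG_MAX`, for example with tens of thousands of long staged paths.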
PR checklist
Emit the versions.yml file.
Add a resource label
nf-core modules test <MODULE> --profile docker
nf-core modules test <MODULE> --profile singularity
nf-core modules test <MODULE> --profile conda
nf-core subworkflows test <SUBWORKFLOW> --profile docker
nf-core subworkflows test <SUBWORKFLOW> --profile singularity
nf-core subworkflows test <SUBWORKFLOW> --profile conda