Dynamic listing of citations/references of used tools in MultiQC methods section #308

Merged: 26 commits, Jul 17, 2023
Changes from 4 commits
Commits (26):
17d163a
Start adding dynamic citation insertion into methods description
jfy133 Jun 17, 2023
b4420eb
Add bibliography insert as well!
jfy133 Jun 17, 2023
d5b34f5
Update changelog
jfy133 Jun 17, 2023
4a0cae4
phrasing
jfy133 Jun 17, 2023
5c77e37
Apply suggestions from code review
jfy133 Jun 18, 2023
8359b06
Update methods_description_template.yml
jfy133 Jun 18, 2023
7742ab5
Update methods_description_template.yml
jfy133 Jun 18, 2023
b634084
Fix filtlong formating in citations
jfy133 Jun 25, 2023
462a752
Merge branch 'dev' into print-tool-citations
jfy133 Jul 7, 2023
7493860
Fix comma space dot issue
jfy133 Jul 12, 2023
eac28ba
Add check for database header to close #310
jfy133 Jun 29, 2023
5eef75d
Remove empty lines
jfy133 Jun 29, 2023
662b532
No need to check outdir for existence - Nextflow can create it on demand
robsyme Jul 7, 2023
f44f110
Add support for virus -e
jfy133 Jul 10, 2023
2e8a836
Update taxpasta version and add ganon support
jfy133 Jul 10, 2023
d72f2ba
Update CHANGELOG
jfy133 Jul 10, 2023
1b790e5
Merge branch 'dev' into print-tool-citations
jfy133 Jul 12, 2023
e5f7cce
Fix changelog
jfy133 Jul 12, 2023
dfff309
Start conmversoipn to lists
jfy133 Jul 12, 2023
d32774f
Add filtlong citation
jfy133 Jul 13, 2023
929fc4d
Switch to list based with better replaceAll
jfy133 Jul 13, 2023
bc5debb
Finalise text with DOI links!
jfy133 Jul 13, 2023
54b60ea
Merge branch 'dev' into print-tool-citations
jfy133 Jul 17, 2023
ee80255
Standardise and sync citation styles between citations.md and toolBib…
jfy133 Jul 17, 2023
4b5d82a
Update Mp3 to MP4 citation
jfy133 Jul 17, 2023
a19951d
And with citations
jfy133 Jul 17, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#276](https://github.com/nf-core/taxprofiler/pull/276) Implemented batching in the KrakenUniq samples processing. (added by @Midnighter)
- [#272](https://github.com/nf-core/taxprofiler/pull/272) Add saving of final 'analysis-ready-reads' to dedicated directory. (❤️ to @alexhbnr for reporting, added by @jfy133)
- [#303](https://github.com/nf-core/taxprofiler/pull/303) Add support for taxpasta profile standardisation in single sample pipeline runs (❤️ to @artur-matysik for reporting, added by @jfy133)
- [#308](https://github.com/nf-core/taxprofiler/pull/308) Add citations and bibliographic information for the tools used in a given pipeline run to the MultiQC methods text (added by @jfy133)

### `Fixed`

2 changes: 2 additions & 0 deletions assets/methods_description_template.yml
@@ -10,10 +10,12 @@ data: |
<p>Data was processed using nf-core/taxprofiler v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (<a href="https://doi.org/10.1038/s41587-020-0439-x">Ewels <em>et al.</em>, 2020</a>).</p>
<p>The pipeline was executed with Nextflow v${workflow.nextflow.version} (<a href="https://doi.org/10.1038/nbt.3820">Di Tommaso <em>et al.</em>, 2017</a>) with the following command:</p>
<pre><code>${workflow.commandLine}</code></pre>
<p>${tool_citations}</p>
<h4>References</h4>
<ul>
<li>Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316-319. <a href="https://doi.org/10.1038/nbt.3820">https://doi.org/10.1038/nbt.3820</a></li>
<li>Ewels, P. A., Peltzer, A., Fillinger, S., Patel, H., Alneberg, J., Wilm, A., Garcia, M. U., Di Tommaso, P., & Nahnsen, S. (2020). The nf-core framework for community-curated bioinformatics pipelines. Nature Biotechnology, 38(3), 276-278. <a href="https://doi.org/10.1038/s41587-020-0439-x">https://doi.org/10.1038/s41587-020-0439-x</a></li>
${tool_bibliography}
</ul>
<div class="alert alert-info">
<h5>Notes:</h5>
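The two new placeholders, `${tool_citations}` and `${tool_bibliography}`, are filled in by Groovy's `SimpleTemplateEngine` (the engine instantiated in `methodsDescriptionText`). The following is a minimal sketch of that substitution, assuming a trimmed-down stand-in for the template and hand-written binding values; `templateText` and `binding` are hypothetical names used only for illustration.

```groovy
import groovy.text.SimpleTemplateEngine

// Hypothetical, trimmed-down stand-in for assets/methods_description_template.yml.
// Single-quoted triple strings keep ${...} literal until the engine expands it.
def templateText = '''<p>${tool_citations}</p>
<h4>References</h4>
<ul>
${tool_bibliography}
</ul>'''

// Hypothetical values; in the pipeline these would come from
// toolCitationText(params) and toolBibliographyText(params).
def binding = [
    tool_citations   : 'Tools used in the workflow included FastQC (Andrews 2010).',
    tool_bibliography: '<li>Andrews S, (2010) FastQC, URL: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/</li>'
]

// createTemplate() parses the text, make() binds the map, toString() renders it.
def rendered = new SimpleTemplateEngine()
    .createTemplate(templateText)
    .make(binding)
    .toString()

println rendered
```

Because the bibliography is injected inside the existing `<ul>`, every entry returned for `tool_bibliography` needs its own `<li>` wrapper to keep the rendered HTML well-formed.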
97 changes: 96 additions & 1 deletion lib/WorkflowTaxprofiler.groovy
@@ -46,15 +46,110 @@ class WorkflowTaxprofiler {
return yaml_file_text
}

- public static String methodsDescriptionText(run_workflow, mqc_methods_yaml) {
///
/// Automatic publication methods text generation
///

public static String toolCitationText(params) {

// TODO consider how to do the same for the references themselves, include in the same if/else statements somehow?
// TODO add biocontainers/bioconda/singularity to default HTML text!
def citation_text = [
"Tools used in the workflow included",
params["preprocessing_qc_tool"] == "falco" ? "Falco (de Sena Brandine and Smith 2021)." : "FastQC (Andrews 2010).",

params["perform_shortread_qc"] ? ". Short read preprocessing was carried out with" : "",
params["perform_shortread_qc"] && params["shortread_qc_tool"] == "adapterremoval" ? "AdapterRemoval (Schubert et al. 2016)." : "",
params["perform_shortread_qc"] && params["shortread_qc_tool"] == "fastp" ? "fastp (Chen et al. 2018)." : "",

params["perform_longread_qc"] ? ". Long read preprocessing was carried out with" : "",
params["perform_longread_qc"] && !params["longread_qc_skipadaptertrim"] ? "Porechop (Wick 2018)," : "",
params["perform_longread_qc"] && !params["longread_qc_skipqualityfilter"] ? "Filtlong (Wick 2021)," : "",

params["perform_shortread_complexityfilter"] ? ". Complexity filtering was performed using" : "",
params["perform_shortread_complexityfilter"] && params["shortread_complexityfilter_tool"] == "bbduk" ? "BBDuk (Bushnell 2022)," : "",
params["perform_shortread_complexityfilter"] && params["shortread_complexityfilter_tool"] == "prinseqplusplus" ? "PRINSEQ++ (Cantu et al. 2019)," : "",
params["perform_shortread_complexityfilter"] && params["shortread_complexityfilter_tool"] == "fastp" ? "fastp (Chen et al. 2018)," : "",

params["perform_shortread_hostremoval"] ? ". Host read removal was carried out for short reads with Bowtie2 (Langmead and Salzberg 2012)" : "",
params["perform_longread_hostremoval"] ? ". Host read removal was carried out for long reads with minimap2 (Li 2018)" : "",
params["perform_shortread_hostremoval"] || params["perform_longread_hostremoval"] ? "and SAMtools (Danecek et al. 2021)." : "",

". Taxonomic classification or profiling was performed with",
params["run_bracken"] ? "Bracken (Lu et al. 2017)," : "",
params["run_kraken2"] ? "Kraken2 (Wood et al. 2019)," : "",
params["run_krakenuniq"] ? "KrakenUniq (Breitwieser et al. 2018)," : "",
params["run_metaphlan3"] ? "MetaPhlAn3 (Beghini et al. 2021)," : "",
params["run_malt"] ? "MALT (Vågene et al. 2018) and MEGAN6 CE (Huson et al. 2016)," : "",
params["run_diamond"] ? "DIAMOND (Buchfink et al. 2015)," : "",
params["run_centrifuge"] ? "Centrifuge (Kim et al. 2016)," : "",
params["run_kaiju"] ? "Kaiju (Menzel et al. 2016)," : "",
params["run_motus"] ? "mOTUs (Ruscheweyh et al. 2022)," : "",

params["run_krona"] ? ". Results for some tools were visualised with Krona (Ondov et al. 2011)." : "",

". Pipeline results statistics were summarised with MultiQC (Ewels et al. 2016)."
].join(' ').trim()

return citation_text
}

public static String toolBibliographyText(params) {

// TODO consider how to do the same for the references themselves, include in the same if/else statements somehow?
def reference_text = [
params["preprocessing_qc_tool"] == "falco" ? "<li>de Sena Brandine G and Smith AD. Falco: high-speed FastQC emulation for quality control of sequencing data. F1000Research 2021, 8:1874</li>" : "<li>Andrews S, (2010) FastQC, URL: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/</li>",

params["perform_shortread_qc"] && params["shortread_qc_tool"] == "adapterremoval" ? "<li>Schubert, Mikkel, Stinus Lindgreen, and Ludovic Orlando. 2016. AdapterRemoval v2: Rapid Adapter Trimming, Identification, and Read Merging. BMC Research Notes 9 (February): 88. doi:10.1186/s13104-016-1900-2.</li>" : "",
params["perform_shortread_qc"] && params["shortread_qc_tool"] == "fastp" ? "<li>Chen, Shifu, Yanqing Zhou, Yaru Chen, and Jia Gu. 2018. Fastp: An Ultra-Fast All-in-One FASTQ Preprocessor. Bioinformatics 34 (17): i884-90. doi: 10.1093/bioinformatics/bty560.</li>" : "",

params["perform_longread_qc"] && !params["longread_qc_skipadaptertrim"] ? "<li>Wick R (2018) Porechop, URL: https://github.com/rrwick/Porechop</li>" : "",
params["perform_longread_qc"] && !params["longread_qc_skipqualityfilter"] ? "<li>Wick R (2021) Filtlong, URL: https://github.com/rrwick/Filtlong</li>" : "",

params["perform_shortread_complexityfilter"] && params["shortread_complexityfilter_tool"] == "bbduk" ? "<li>Bushnell B (2022) BBMap, URL: http://sourceforge.net/projects/bbmap/</li>" : "",
params["perform_shortread_complexityfilter"] && params["shortread_complexityfilter_tool"] == "prinseqplusplus" ? "<li>Cantu, Vito Adrian, Jeffrey Sadural, and Robert Edwards. 2019. PRINSEQ++, a Multi-Threaded Tool for Fast and Efficient Quality Control and Preprocessing of Sequencing Datasets. e27553v1. PeerJ Preprints. doi: 10.7287/peerj.preprints.27553v1.</li>" : "",
params["perform_shortread_complexityfilter"] && params["shortread_complexityfilter_tool"] == "fastp" ? "<li>Chen, Shifu, Yanqing Zhou, Yaru Chen, and Jia Gu. 2018. Fastp: An Ultra-Fast All-in-One FASTQ Preprocessor. Bioinformatics 34 (17): i884-90. doi: 10.1093/bioinformatics/bty560.</li>" : "",

params["perform_shortread_hostremoval"] ? "<li>Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357–359. doi: 10.1038/nmeth.1923</li>" : "",
params["perform_longread_hostremoval"] ? "<li>Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34(18), 3094–3100. doi: 10.1093/bioinformatics/bty191</li>" : "",
params["perform_shortread_hostremoval"] || params["perform_longread_hostremoval"] ? "<li>Danecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., Keane, T., McCarthy, S. A., Davies, R. M., & Li, H. (2021). Twelve years of SAMtools and BCFtools. GigaScience, 10(2). doi: 10.1093/gigascience/giab008</li>" : "",

params["run_bracken"] ? "<li>Lu, J., Breitwieser, F. P., Thielen, P., & Salzberg, S. L. (2017). Bracken: Estimating species abundance in metagenomics data. PeerJ Computer Science, 3, e104. doi: 10.7717/peerj-cs.104</li>" : "",
params["run_kraken2"] ? "<li>Wood, Derrick E., Jennifer Lu, and Ben Langmead. 2019. Improved Metagenomic Analysis with Kraken 2. Genome Biology 20 (1): 257. doi: 10.1186/s13059-019-1891-0.</li>" : "",
params["run_krakenuniq"] ? "<li>Breitwieser, Florian P., Daniel N. Baker, and Steven L. Salzberg. 2018. KrakenUniq: confident and fast metagenomics classification using unique k-mer counts. Genome Biology 19 (1): 198. doi: 10.1186/s13059-018-1568-0</li>" : "",
params["run_metaphlan3"] ? "<li>Beghini, Francesco, Lauren J McIver, Aitor Blanco-Míguez, Leonard Dubois, Francesco Asnicar, Sagun Maharjan, Ana Mailyan, et al. 2021. “Integrating Taxonomic, Functional, and Strain-Level Profiling of Diverse Microbial Communities with BioBakery 3.” Edited by Peter Turnbaugh, Eduardo Franco, and C Titus Brown. ELife 10 (May): e65088. doi: 10.7554/eLife.65088</li>" : "",
params["run_malt"] ? "<li>Vågene, Åshild J., Alexander Herbig, Michael G. Campana, Nelly M. Robles García, Christina Warinner, Susanna Sabin, Maria A. Spyrou, et al. 2018. Salmonella Enterica Genomes from Victims of a Major Sixteenth-Century Epidemic in Mexico. Nature Ecology & Evolution 2 (3): 520-28. doi: 10.1038/s41559-017-0446-6.</li>" : "",
params["run_malt"] ? "<li>Huson, Daniel H., Sina Beier, Isabell Flade, Anna Górska, Mohamed El-Hadidi, Suparna Mitra, Hans-Joachim Ruscheweyh, and Rewati Tappu. 2016. “MEGAN Community Edition - Interactive Exploration and Analysis of Large-Scale Microbiome Sequencing Data.” PLoS Computational Biology 12 (6): e1004957. doi: 10.1371/journal.pcbi.1004957.</li>" : "",
params["run_diamond"] ? "<li>Buchfink, Benjamin, Chao Xie, and Daniel H. Huson. 2015. “Fast and Sensitive Protein Alignment Using DIAMOND.” Nature Methods 12 (1): 59-60. doi: 10.1038/nmeth.3176.</li>" : "",
params["run_centrifuge"] ? "<li>Kim, Daehwan, Li Song, Florian P. Breitwieser, and Steven L. Salzberg. 2016. “Centrifuge: Rapid and Sensitive Classification of Metagenomic Sequences.” Genome Research 26 (12): 1721-29. doi: 10.1101/gr.210641.116.</li>" : "",
params["run_kaiju"] ? "<li>Menzel, P., Ng, K. L., & Krogh, A. (2016). Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nature Communications, 7, 11257. doi: 10.1038/ncomms11257</li>" : "",
params["run_motus"] ? "<li>Ruscheweyh, H.-J., Milanese, A., Paoli, L., Karcher, N., Clayssen, Q., Keller, M. I., Wirbel, J., Bork, P., Mende, D. R., Zeller, G., & Sunagawa, S. (2022). Cultivation-independent genomes greatly expand taxonomic-profiling capabilities of mOTUs across various environments. Microbiome, 10(1), 212. doi: 10.1186/s40168-022-01410-z</li>" : "",
params["run_krona"] ? "<li>Ondov, Brian D., Nicholas H. Bergman, and Adam M. Phillippy. 2011. Interactive metagenomic visualization in a Web browser. BMC Bioinformatics 12 (1): 385. doi: 10.1186/1471-2105-12-385.</li>" : "",

"<li>Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.</li>"
].join(' ').trim()

return reference_text
}

public static String methodsDescriptionText(run_workflow, mqc_methods_yaml, params) {
// Convert to a named map so it can be used with the familiar NXF ${workflow} variable syntax in the MultiQC YML file
def meta = [:]

meta.workflow = run_workflow.toMap()
meta["manifest_map"] = run_workflow.manifest.toMap()

meta["doi_text"] = meta.manifest_map.doi ? "(doi: <a href=\'https://doi.org/${meta.manifest_map.doi}\'>${meta.manifest_map.doi}</a>)" : ""
meta["nodoi_text"] = meta.manifest_map.doi ? "": "<li>If available, make sure to update the text to include the Zenodo DOI of the version of the pipeline used.</li>"

meta["tool_citations"] = ""
meta["tool_bibliography"] = ""
/*
TODO Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
*/
// Tolerate the runs of spaces that join(' ') leaves behind when optional fragments are empty
meta["tool_citations"] = toolCitationText(params).replaceAll(",\\s*\\.", ".").replaceAll("\\.\\s*\\.", ".")
meta["tool_bibliography"] = toolBibliographyText(params)

def methods_text = mqc_methods_yaml.text

def engine = new SimpleTemplateEngine()
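The pattern used in `toolCitationText`, a list of optional sentence fragments joined with spaces and then cleaned of stray punctuation, can be tried in isolation. The sketch below is not the pipeline's code: `citationSketch` is a hypothetical helper, it covers only two of the parameters, and unlike the hunk above it drops empty fragments before joining, so that the cleanup only has to handle a single space between punctuation marks.

```groovy
// Hypothetical helper, for illustration only (not part of WorkflowTaxprofiler).
def citationSketch(Map params) {
    def fragments = [
        "Tools used in the workflow included",
        params.preprocessing_qc_tool == "falco" ? "Falco (de Sena Brandine and Smith 2021)." : "FastQC (Andrews 2010).",
        params.run_kraken2 ? ". Taxonomic classification or profiling was performed with" : "",
        params.run_kraken2 ? "Kraken2 (Wood et al. 2019)," : "",
        ". Pipeline results statistics were summarised with MultiQC (Ewels et al. 2016)."
    ]
    return fragments
        .findAll { it }                  // drop empty strings from disabled steps
        .join(' ')
        .replaceAll(/,\s*\./, '.')       // "Kraken2 (...), . Pipeline" becomes "Kraken2 (...). Pipeline"
        .replaceAll(/\.\s*\./, '.')      // "(Andrews 2010). . Taxonomic" becomes "(Andrews 2010). Taxonomic"
        .trim()
}

println citationSketch([run_kraken2: true])
println citationSketch([preprocessing_qc_tool: "falco"])
```

Filtering out empty fragments first is one way to avoid the multi-space gaps that would otherwise sit between a trailing comma or full stop and the next sentence's leading ". ".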
2 changes: 1 addition & 1 deletion workflows/taxprofiler.nf
@@ -264,7 +264,7 @@ workflow TAXPROFILER {
workflow_summary = WorkflowTaxprofiler.paramsSummaryMultiqc(workflow, summary_params)
ch_workflow_summary = Channel.value(workflow_summary)

- methods_description = WorkflowTaxprofiler.methodsDescriptionText(workflow, ch_multiqc_custom_methods_description)
+ methods_description = WorkflowTaxprofiler.methodsDescriptionText(workflow, ch_multiqc_custom_methods_description, params)
ch_methods_description = Channel.value(methods_description)

ch_multiqc_files = Channel.empty()