Dynamic listing of citations/references of used tools in MultiQC methods section #308

Merged
merged 26 commits into from
Jul 17, 2023
Changes from 22 commits
Commits
17d163a
Start adding dynamic citation insertion into methods description
jfy133 Jun 17, 2023
b4420eb
Add bibliography insert as well!
jfy133 Jun 17, 2023
d5b34f5
Update changelog
jfy133 Jun 17, 2023
4a0cae4
phrasing
jfy133 Jun 17, 2023
5c77e37
Apply suggestions from code review
jfy133 Jun 18, 2023
8359b06
Update methods_description_template.yml
jfy133 Jun 18, 2023
7742ab5
Update methods_description_template.yml
jfy133 Jun 18, 2023
b634084
Fix filtlong formating in citations
jfy133 Jun 25, 2023
462a752
Merge branch 'dev' into print-tool-citations
jfy133 Jul 7, 2023
7493860
Fix comma space dot issue
jfy133 Jul 12, 2023
eac28ba
Add check for database header to close #310
jfy133 Jun 29, 2023
5eef75d
Remove empty lines
jfy133 Jun 29, 2023
662b532
No need to check outdir for existence - Nextflow can create it on demand
robsyme Jul 7, 2023
f44f110
Add support for virus -e
jfy133 Jul 10, 2023
2e8a836
Update taxpasta version and add ganon support
jfy133 Jul 10, 2023
d72f2ba
Update CHANGELOG
jfy133 Jul 10, 2023
1b790e5
Merge branch 'dev' into print-tool-citations
jfy133 Jul 12, 2023
e5f7cce
Fix changelog
jfy133 Jul 12, 2023
dfff309
Start conmversoipn to lists
jfy133 Jul 12, 2023
d32774f
Add filtlong citation
jfy133 Jul 13, 2023
929fc4d
Switch to list based with better replaceAll
jfy133 Jul 13, 2023
bc5debb
Finalise text with DOI links!
jfy133 Jul 13, 2023
54b60ea
Merge branch 'dev' into print-tool-citations
jfy133 Jul 17, 2023
ee80255
Standardise and sync citation styles between citations.md and toolBib…
jfy133 Jul 17, 2023
4b5d82a
Update Mp3 to MP4 citation
jfy133 Jul 17, 2023
a19951d
And with citations
jfy133 Jul 17, 2023
16 changes: 8 additions & 8 deletions CHANGELOG.md
@@ -7,14 +7,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### `Added`

- New Classifers/Profilers
- [#298](https://github.com/nf-core/taxprofiler/pull/298) Added [ganon](https://pirovc.github.io/ganon/) (added by @jfy133)
- Additional functionality
- [#276](https://github.com/nf-core/taxprofiler/pull/276) Implemented batching in the KrakenUniq samples processing (added by @Midnighter)
- [#272](https://github.com/nf-core/taxprofiler/pull/272) Add saving of final 'analysis-ready-reads' to dedicated directory (❤️ to @alexhbnr for request, added by @jfy133)
- [#303](https://github.com/nf-core/taxprofiler/pull/303) Add support for taxpasta profile standardisation in single sample pipeline runs (❤️ to @artur-matysik for request, added by @jfy133)
- [#315](https://github.com/nf-core/taxprofiler/pull/315) Updated to nf-core pipeline template v2.9 (added by @sofstam & @jfy133)
- [#319](https://github.com/nf-core/taxprofiler/pull/319) Added support for virus hit expansion in Kaiju (❤️ to @dnlrxn for requesting, added by @jfy133)
- [#298](https://github.com/nf-core/taxprofiler/pull/298) **New classifier** [ganon](https://pirovc.github.io/ganon/) (added by @jfy133)
- [#276](https://github.com/nf-core/taxprofiler/pull/276) Implemented batching in the KrakenUniq samples processing (added by @Midnighter)
- [#272](https://github.com/nf-core/taxprofiler/pull/272) Add saving of final 'analysis-ready-reads' to dedicated directory (❤️ to @alexhbnr for request, added by @jfy133)
- [#303](https://github.com/nf-core/taxprofiler/pull/303) Add support for taxpasta profile standardisation in single sample pipeline runs (❤️ to @artur-matysik for request, added by @jfy133)
- [#315](https://github.com/nf-core/taxprofiler/pull/315) Updated to nf-core pipeline template v2.9 (added by @sofstam & @jfy133)
- [#308](https://github.com/nf-core/taxprofiler/pull/308) Add citations and bibliographic information to the MultiQC methods text of tools used in a given pipeline run (added by @jfy133)
- [#319](https://github.com/nf-core/taxprofiler/pull/319) Added support for virus hit expansion in Kaiju (❤️ to @dnlrxn for requesting, added by @jfy133)

### `Fixed`

8 changes: 7 additions & 1 deletion CITATIONS.md
@@ -32,7 +32,9 @@

- [Porechop](https://github.com/rrwick/Porechop)

- [FILTLONG](https://github.com/rrwick/Filtlong)
> Wick, R. R., Judd, L. M., Gorrie, C. L., & Holt, K. E. (2017). Completing bacterial genome assemblies with multiplex MinION sequencing. Microbial Genomics, 3(10), e000132. https://doi.org/10.1099/mgen.0.000132

- [Filtlong](https://github.com/rrwick/Filtlong)

- [BBTools](http://sourceforge.net/projects/bbmap/)

@@ -100,6 +102,10 @@

> Ondov, Brian D., Nicholas H. Bergman, and Adam M. Phillippy. 2011. Interactive metagenomic visualization in a Web browser. BMC Bioinformatics 12 (1): 385. doi: 10.1186/1471-2105-12-385.

- [TAXPASTA](https://doi.org/10.21105/joss.05627)

> Beber, M. E., Borry, M., Stamouli, S., & Fellows Yates, J. A. (2023). TAXPASTA: TAXonomic Profile Aggregation and STAndardisation. Journal of Open Source Software, 8(87), 5627. https://doi.org/10.21105/joss.05627

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)
170 changes: 148 additions & 22 deletions lib/WorkflowTaxprofiler.groovy
@@ -47,56 +47,182 @@ class WorkflowTaxprofiler {
return yaml_file_text
}

//
// Generate methods description for MultiQC
//
///
/// Automatic publication methods text generation
///

public static String toolCitationText(params) {

// TODO Optionally add in-text citation tools to this list.
// Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "Tool (Foo et al. 2023)" : "",
// Uncomment function in methodsDescriptionText to render in MultiQC report
def citation_text = [
"Tools used in the workflow included:",
"FastQC (Andrews 2010),",
"MultiQC (Ewels et al. 2016)",
def text_seq_qc = [
"Sequencing quality control was carried out with:",
params.preprocessing_qc_tool == "falco" ? "Falco (de Sena Brandine and Smith 2021)." : "FastQC (Andrews 2010)."
].join(' ').trim()


def text_shortread_qc = [
"Short read preprocessing was performed with:",
params.shortread_qc_tool == "adapterremoval" ? "AdapterRemoval (Schubert et al. 2016)." : "",
params.shortread_qc_tool == "fastp" ? "fastp (Chen et al. 2018)." : "",
].join(' ').trim()

def text_longread_qc = [
"Long read preprocessing was performed with:",
!params.longread_qc_skipadaptertrim ? "Porechop (Wick et al. 2017)," : "",
!params.longread_qc_skipqualityfilter ? "Filtlong (Wick 2021)," : "",
"."
].join(' ').trim()

def text_shortreadcomplexity = [
"Low-complexity sequence filtering was carried out with:",
params.shortread_complexityfilter_tool == "bbduk" ? "BBDuk (Bushnell 2022)." : "",
params.shortread_complexityfilter_tool == "prinseqplusplus" ? "PRINSEQ++ (Cantu et al. 2019)." : "",
params.shortread_complexityfilter_tool == "fastp" ? "fastp (Chen et al. 2018)." : "",
].join(' ').trim()

def text_shortreadhostremoval = [
"Host read removal was performed for short reads with Bowtie2 (Langmead and Salzberg 2012) and SAMtools (Danecek et al. 2021)."
].join(' ').trim()

def text_longreadhostremoval = [
"Host read removal was performed for long reads with minimap2 (Li 2018) and SAMtools (Danecek et al. 2021)."
].join(' ').trim()


def text_classification = [
"Taxonomic classification or profiling was carried out with:",
params.run_bracken ? "Bracken (Lu et al. 2017)," : "",
params.run_kraken2 ? "Kraken2 (Wood et al. 2019)," : "",
params.run_krakenuniq ? "KrakenUniq (Breitwieser et al. 2018)," : "",
params.run_metaphlan3 ? "MetaPhlAn3 (Beghini et al. 2021)," : "",
params.run_malt ? "MALT (Vågene et al. 2018) and MEGAN6 CE (Huson et al. 2016)," : "",
params.run_diamond ? "DIAMOND (Buchfink et al. 2015)," : "",
params.run_centrifuge ? "Centrifuge (Kim et al. 2016)," : "",
params.run_kaiju ? "Kaiju (Menzel et al. 2016)," : "",
params.run_motus ? "mOTUs (Ruscheweyh et al. 2022)," : "",
params.run_ganon ? "ganon (Piro et al. 2020)," : "",
"."
].join(' ').trim()

def text_visualisation = [
"Visualisation of results, where supported, was performed with Krona (Ondov et al. 2011)."
].join(' ').trim()

def text_postprocessing = [
"Standardisation of taxonomic profiles was carried out with TAXPASTA (Beber et al. 2023).",
].join(' ').trim()

def citation_text = [
text_seq_qc,
params.perform_shortread_qc ? text_shortread_qc : "",
params.perform_longread_qc ? text_longread_qc : "",
params.perform_shortread_complexityfilter ? text_shortreadcomplexity : "",
params.perform_shortread_hostremoval ? text_shortreadhostremoval : "",
params.perform_longread_hostremoval ? text_longreadhostremoval : "",
text_classification,
params.run_krona ? text_visualisation : "",
params.run_profile_standardisation ? text_postprocessing : "",
"Pipeline results statistics were summarised with MultiQC (Ewels et al. 2016)."
].join(' ').trim().replaceAll("[,|.] +\\.", ".")

return citation_text
}
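
A reviewer's sketch of the list, join and `replaceAll` pattern used in `toolCitationText` above. It is not code from this PR: `joinCitations` and its `flags` map are hypothetical names chosen for illustration, and only two classifiers are shown. It assumes every optional entry ends in a comma, so that the final `"."` element always follows a comma or a dot.

```groovy
// Hypothetical helper for illustration only; not part of WorkflowTaxprofiler.
def joinCitations(Map flags) {
    def parts = [
        "Taxonomic classification or profiling was carried out with:",
        flags.run_kraken2 ? "Kraken2 (Wood et al. 2019)," : "",
        flags.run_kaiju   ? "Kaiju (Menzel et al. 2016)," : "",
        "."
    ]
    // join(' ') keeps empty entries, which leaves runs of spaces and a dangling ", ." at the end.
    // The regex replaces a comma, pipe or dot followed by one or more spaces and a dot with a single dot.
    return parts.join(' ').trim().replaceAll("[,|.] +\\.", ".")
}

println joinCitations([run_kraken2: true, run_kaiju: false])
println joinCitations([run_kraken2: true, run_kaiju: true])
```

Inside a character class the `|` is a literal pipe rather than an alternation, so `[,.]` would match the same citation text. If no tool in a block is enabled, the text before the final dot ends in a colon, which the pattern does not match, so the stray spaces remain in that case.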

public static String toolBibliographyText(params) {

// TODO Optionally add bibliographic entries to this list.
// Can use ternary operators to dynamically construct based conditions, e.g. params["run_xyz"] ? "<li>Author (2023) Pub name, Journal, DOI</li>" : "",
// Uncomment function in methodsDescriptionText to render in MultiQC report
def reference_text = [
"<li>Andrews S, (2010) FastQC, URL: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/).</li>",
"<li>Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. doi: /10.1093/bioinformatics/btw354</li>"
def text_seq_qc = [
params.preprocessing_qc_tool == "falco" ? "<li>de Sena Brandine G and Smith AD. Falco: high-speed FastQC emulation for quality control of sequencing data. F1000Research 2021, 8:1874 doi: <a href=\"https://doi.org/10.12688/f1000research.21142.2\">10.12688/f1000research.21142.2</a></li>" : "",
params.preprocessing_qc_tool == "fastqc" ? "<li>Andrews S, (2010) FastQC, URL: <a href=\"https://www.bioinformatics.babraham.ac.uk/projects/fastqc/\">https://www.bioinformatics.babraham.ac.uk/projects/fastqc/</a></li>" : "",
].join(' ').trim()


def text_shortread_qc = [
params.shortread_qc_tool == "adapterremoval" ? "<li>Schubert, Mikkel, Stinus Lindgreen, and Ludovic Orlando. 2016. AdapterRemoval v2: Rapid Adapter Trimming, Identification, and Read Merging. BMC Research Notes 9 (February): 88. doi: <a href=\"https://doi.org/10.1186/s13104-016-1900-2\">10.1186/s13104-016-1900-2</a></li>" : "",
].join(' ').trim()

def text_longread_qc = [
!params.longread_qc_skipadaptertrim ? "<li>Wick, R. R., Judd, L. M., Gorrie, C. L., & Holt, K. E. (2017). Completing bacterial genome assemblies with multiplex MinION sequencing. Microbial Genomics, 3(10), e000132. doi: <a href=\"https://doi.org/10.1099/mgen.0.000132\">10.1099/mgen.0.000132</a></li>" : "",
!params.longread_qc_skipqualityfilter ? "<li>Wick R (2021) Filtlong, URL: <a href=\"https://github.com/rrwick/Filtlong\">https://github.com/rrwick/Filtlong</a></li>" : ""
].join(' ').trim()

// TODO FASTP DUPLICATE
def text_shortreadcomplexity = [
params.shortread_complexityfilter_tool == "bbduk" ? "<li>Bushnell B (2022) BBMap, URL: <a href=\"http://sourceforge.net/projects/bbmap/\">http://sourceforge.net/projects/bbmap/</a></li>" : "",
params.shortread_complexityfilter_tool == "prinseqplusplus" ? "<li>Cantu, Vito Adrian, Jeffrey Sadural, and Robert Edwards. 2019. PRINSEQ++, a Multi-Threaded Tool for Fast and Efficient Quality Control and Preprocessing of Sequencing Datasets. e27553v1. PeerJ Preprints. doi: <a href=\"https://doi.org/10.7287/peerj.preprints.27553v1\">10.7287/peerj.preprints.27553v1</a></li>" : "",
].join(' ').trim()

def text_shortreadhostremoval = [
"<li>Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357–359. <a href=\"https://doi.org/10.1038/nmeth.1923\">10.1038/nmeth.1923</a></li>",
].join(' ').trim()

def text_longreadhostremoval = [
"<li>Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics , 34(18), 3094–3100. doi: <a href=\"https://doi.org/10.1093/bioinformatics/bty191\">10.1093/bioinformatics/bty191</a></li>",
].join(' ').trim()


def text_classification = [
params.run_bracken ? "<li>Lu, J., Breitwieser, F. P., Thielen, P., & Salzberg, S. L. (2017). Bracken: Estimating species abundance in metagenomics data. PeerJ Computer Science, 3, e104. doi: <a href=\"https://doi.org/10.7717/peerj-cs.104\">10.7717/peerj-cs.104</a></li>" : "",
params.run_kraken2 ? "<li>Wood, Derrick E., Jennifer Lu, and Ben Langmead. 2019. Improved Metagenomic Analysis with Kraken 2. Genome Biology 20 (1): 257. doi: <a href=\"https://doi.org/10.1186/s13059-019-1891-0\">10.1186/s13059-019-1891-0</a></li>" : "",
params.run_krakenuniq ? "<li>Breitwieser, Florian P., Daniel N. Baker, and Steven L. Salzberg. 2018. KrakenUniq: confident and fast metagenomics classification using unique k-mer counts. Genome Biology 19 (1): 198. doi: <a href=\"https://doi.org/10.1186/s13059-018-1568-0\">10.1186/s13059-018-1568-0</a></li>" : "",
params.run_metaphlan3 ? "<li>Beghini, Francesco, Lauren J McIver, Aitor Blanco-Míguez, Leonard Dubois, Francesco Asnicar, Sagun Maharjan, Ana Mailyan, et al. 2021. “Integrating Taxonomic, Functional, and Strain-Level Profiling of Diverse Microbial Communities with BioBakery 3.” ELife 10 (May): e65088. doi: <a href=\"https://doi.org/10.7554/eLife.65088\">10.7554/eLife.65088</a></li>" : "",
params.run_malt ? "<li>Vågene, Åshild J., Alexander Herbig, Michael G. Campana, Nelly M. Robles García, Christina Warinner, Susanna Sabin, Maria A. Spyrou, et al. 2018. Salmonella Enterica Genomes from Victims of a Major Sixteenth-Century Epidemic in Mexico. Nature Ecology & Evolution 2 (3): 520-28. doi: <a href=\"https://doi.org/10.1038/s41559-017-0446-6\">10.1038/s41559-017-0446-6</a></li>" : "",
params.run_malt ? "<li>Huson, Daniel H., Sina Beier, Isabell Flade, Anna Górska, Mohamed El-Hadidi, Suparna Mitra, Hans-Joachim Ruscheweyh, and Rewati Tappu. 2016. “MEGAN Community Edition - Interactive Exploration and Analysis of Large-Scale Microbiome Sequencing Data.” PLoS Computational Biology 12 (6): e1004957. doi: <a href=\"https://doi.org/10.1371/journal.pcbi.1004957\">10.1371/journal.pcbi.1004957</a></li>" : "",
params.run_diamond ? "<li>Buchfink, Benjamin, Chao Xie, and Daniel H. Huson. 2015. “Fast and Sensitive Protein Alignment Using DIAMOND.” Nature Methods 12 (1): 59-60. doi: <a href=\"https://doi.org/10.1038/nmeth.3176\">10.1038/nmeth.3176</a></li>" : "",
params.run_centrifuge ? "<li>Kim, Daehwan, Li Song, Florian P. Breitwieser, and Steven L. Salzberg. 2016. “Centrifuge: Rapid and Sensitive Classification of Metagenomic Sequences.” Genome Research 26 (12): 1721-29. doi: <a href=\"https://doi.org/10.1101/gr.210641.116\">10.1101/gr.210641.116</a></li>" : "",
params.run_kaiju ? "<li>Menzel, P., Ng, K. L., & Krogh, A. (2016). Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nature Communications, 7, 11257. doi: <a href=\"https://doi.org/10.1038/ncomms11257\">10.1038/ncomms11257</a></li>" : "",
params.run_motus ? "<li>Ruscheweyh, H.-J., Milanese, A., Paoli, L., Karcher, N., Clayssen, Q., Keller, M. I., Wirbel, J., Bork, P., Mende, D. R., Zeller, G., & Sunagawa, S. (2022). Cultivation-independent genomes greatly expand taxonomic-profiling capabilities of mOTUs across various environments. Microbiome, 10(1), 212. doi: <a href=\"https://doi.org/10.1186/s40168-022-01410-z\">10.1186/s40168-022-01410-z</a></li>" : "",
params.run_ganon ? "<li>Piro, V. C., Dadi, T. H., Seiler, E., Reinert, K., & Renard, B. Y. (2020). Ganon: Precise metagenomics classification against large and up-to-date sets of reference sequences. Bioinformatics (Oxford, England), 36(Suppl_1), i12–i20. <a href=\"https://doi.org/10.1093/bioinformatics/btaa458\">10.1093/bioinformatics/btaa458</a></li>" : "",
].join(' ').trim()

def text_visualisation = [
"<li>Ondov, Brian D., Nicholas H. Bergman, and Adam M. Phillippy. 2011. Interactive metagenomic visualization in a Web browser. BMC Bioinformatics 12 (1): 385. doi: <a href=\"https://doi.org/10.1186/1471-2105-12-385\">10.1186/1471-2105-12-385</a></li>"
].join(' ').trim()

def text_postprocessing = [
"<li>Beber, M. E., Borry, M., Stamouli, S., & Fellows Yates, J. A. (2023). TAXPASTA: TAXonomic Profile Aggregation and STAndardisation. Journal of Open Source Software, 8(87), 5627. <a href=\"https://doi.org/10.21105/joss.05627\">10.21105/joss.05627</a></li>",
].join(' ').trim()

def text_extras = [
// fastp shortread qc / complexity filtering
( params.perform_shortread_qc && params.shortread_qc_tool == "fastp" ) || ( params.perform_shortread_complexityfilter && params.shortread_complexityfilter_tool == "fastp" ) ? "<li>Chen, Shifu, Yanqing Zhou, Yaru Chen, and Jia Gu. 2018. Fastp: An Ultra-Fast All-in-One FASTQ Preprocessor. Bioinformatics 34 (17): i884-90. <a href=\"https://doi.org/10.1093/bioinformatics/bty560\">10.1093/bioinformatics/bty560</a></li>" : "",
// samtools long / short hostremoval
params.perform_shortread_hostremoval || params.perform_longread_hostremoval ? "<li>Danecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., Keane, T., McCarthy, S. A., Davies, R. M., & Li, H. (2021). Twelve years of SAMtools and BCFtools. GigaScience, 10(2). <a href=\"https://doi.org/10.1093/gigascience/giab008\">10.1093/gigascience/giab008</a></li>" : "",
].join(' ').trim()

def reference_text = [
text_seq_qc,
params.perform_shortread_qc ? text_shortread_qc : "",
params.perform_longread_qc ? text_longread_qc : "",
params.perform_shortread_complexityfilter ? text_shortreadcomplexity : "",
params.perform_shortread_hostremoval ? text_shortreadhostremoval : "",
params.perform_longread_hostremoval ? text_longreadhostremoval : "",
text_extras,
text_classification,
params.run_krona ? text_visualisation : "",
params.run_profile_standardisation ? text_postprocessing : "",
"<li>Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: <a href=\"https://doi.org/10.1093/bioinformatics/btw354\">10.1093/bioinformatics/btw354</a></li>"
].join(' ').trim().replaceAll("[,|.] +\\.", ".")

return reference_text
}
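
A reviewer's sketch of the idea behind `text_extras` in `toolBibliographyText` above: a tool that can be used by more than one step gets one bibliography entry, driven by a combined condition. `sharedReferences` is a hypothetical name, the `<li>` strings are shortened placeholders rather than the real references, and the sketch assumes the complexity-filter condition is meant to test `perform_shortread_complexityfilter`, the flag used elsewhere in this file.

```groovy
// Hypothetical helper for illustration only; not part of WorkflowTaxprofiler.
def sharedReferences(Map params) {
    // fastp can act as the short-read QC tool and as the complexity filter,
    // so its entry is emitted once if either step uses it.
    def usesFastp = (params.perform_shortread_qc && params.shortread_qc_tool == "fastp") ||
                    (params.perform_shortread_complexityfilter && params.shortread_complexityfilter_tool == "fastp")
    // SAMtools is used by both the short-read and the long-read host removal steps.
    def usesSamtools = params.perform_shortread_hostremoval || params.perform_longread_hostremoval

    return [
        usesFastp    ? "<li>Chen et al. 2018 (fastp)</li>" : "",
        usesSamtools ? "<li>Danecek et al. 2021 (SAMtools)</li>" : "",
    ].findAll { it }.join(' ')   // drop empty entries before joining
}

println sharedReferences([
    perform_shortread_qc: true, shortread_qc_tool: "fastp",
    perform_shortread_complexityfilter: true, shortread_complexityfilter_tool: "fastp"
])
```

Dropping empty entries with `findAll` before joining is a design choice of this sketch only; the PR joins the full list, empty strings included.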

public static String methodsDescriptionText(run_workflow, mqc_methods_yaml, params) {
// Convert to a named map so it can be used with the familiar NXF ${workflow} variable syntax in the MultiQC YML file
def meta = [:]

meta.workflow = run_workflow.toMap()
meta["manifest_map"] = run_workflow.manifest.toMap()

// Pipeline DOI
meta["doi_text"] = meta.manifest_map.doi ? "(doi: <a href=\'https://doi.org/${meta.manifest_map.doi}\'>${meta.manifest_map.doi}</a>)" : ""
meta["nodoi_text"] = meta.manifest_map.doi ? "": "<li>If available, make sure to update the text to include the Zenodo DOI of the version of the pipeline used.</li>"

// Tool references
meta["tool_citations"] = ""
meta["tool_bibliography"] = ""

// TODO Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
//meta["tool_citations"] = toolCitationText(params).replaceAll(", \\.", ".").replaceAll("\\. \\.", ".").replaceAll(", \\.", ".")
//meta["tool_bibliography"] = toolBibliographyText(params)

/*
TODO Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
*/
meta["tool_citations"] = toolCitationText(params)
meta["tool_bibliography"] = toolBibliographyText(params)

def methods_text = mqc_methods_yaml.text
