nhhaidee/scovtree

Phylogenetic analysis for SARS-CoV-2.

Introduction

nhhaidee/scovtree is a bioinformatics pipeline for SARS-CoV-2 phylogenetic analysis. Given a set of consensus sequences, the workflow outputs a phylogenetic tree and SNP information. The pipeline can also filter GISAID data to find the sequences most closely related to the input. The GISAID filtering workflow outputs the filtered sequences and metadata in the old metadata format (GISAID recently changed its metadata format), so the output can then be used with Nextstrain locally.

The pipeline is built using Nextflow, a workflow tool for running tasks across multiple compute infrastructures in a very portable manner. It comes with Docker containers, making installation trivial and results highly reproducible.

Quick Start

  1. Install Nextflow

  2. Install either Docker or Singularity for full pipeline reproducibility (please only use Conda as a last resort; see docs)

  3. Download the pipeline and test it on a minimal dataset with a single command:

    nextflow run nhhaidee/scovtree -profile test_gisaid_full,<docker/singularity/conda>
    nextflow run nhhaidee/scovtree -profile test_gisaid_drop_columns,<docker/singularity/conda>
    nextflow run nhhaidee/scovtree -profile test,<docker/singularity/conda>
  4. Start running your own analysis!

    • A typical command for phylogenetic analysis is as follows:

      nextflow run nhhaidee/scovtree -profile <docker/singularity/conda> \
          --filter_gisaid false \
          --input '/path/to/consensus/consensus_sequences.fasta'
    • A typical command for phylogenetic analysis with GISAID sequences is as follows:

      nextflow run nhhaidee/scovtree -profile <docker/singularity/conda> \
          --filter_gisaid true \
          --gisaid_sequences /path/to/sequences.fasta \
          --gisaid_metadata /path/to/metadata.tsv \
          --input '/path/to/consensus/consensus_sequences.fasta'
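
As an alternative to passing every option on the command line, the same parameters can be kept in a file. The sketch below assumes Nextflow's standard `-params-file` option and reuses the parameter names from the GISAID command above. The file name `params.yaml` and all paths are placeholders, and this usage is an assumption rather than something this README documents.

```shell
# Write the GISAID run parameters to a YAML file.
# Parameter names are taken from the command above; the paths are placeholders.
cat > params.yaml <<'EOF'
filter_gisaid: true
gisaid_sequences: /path/to/sequences.fasta
gisaid_metadata: /path/to/metadata.tsv
input: /path/to/consensus/consensus_sequences.fasta
EOF

# Show the file so the values can be checked before launching the pipeline.
cat params.yaml
```

The run would then be launched with `nextflow run nhhaidee/scovtree -profile <docker/singularity/conda> -params-file params.yaml`. Keeping the parameters in a file makes it easier to record exactly which inputs were used for a given tree.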

Credits

nhhaidee/scovtree was originally written by Hai Nguyen.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #scovtree channel (you can join with this invite).

Citations

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

In addition, references of tools and data used in this pipeline are as follows:
