Juno 🦟🦠🧬📊 - A Nextflow Pipeline for Reference-Based Assembly of Oropouche Virus (OROV) Genomes

Juno processes Illumina paired-end metagenomic sequencing data against OROV reference genomes, performing QC, taxonomic classification, alignment, variant calling, and consensus generation.

⚡ Usage

$ nextflow run juno.nf -profile singularity -params-file params.yaml

🐊 HiPerGator Usage

$ sbatch ./juno.sh

📦 Dependencies

Nextflow 23.04.0+
Singularity or Docker
Python 3.6+
Slurm (only needed when running on HiPerGator)

⚙️ Configuration

1. Clone this repository

git clone https://github.com/BPHL-Molecular/Juno.git
cd Juno

2. Create a directory for input FASTQ files

mkdir fastq
# move or copy your FASTQ files into this directory
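
Because Juno expects paired-end reads, it can help to confirm that every sample in the input directory has both mates before launching a run. The sketch below is a hypothetical pre-flight check and not part of Juno. The file-naming pattern it matches (`<sample>_R1` / `<sample>_R2` or `<sample>_1` / `<sample>_2`, with an optional `_001`-style suffix) is an assumption, since the naming convention the pipeline requires is not stated here.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention (not confirmed by the Juno documentation):
# <sample>_R1[_001].fastq[.gz] and <sample>_R2[_001].fastq[.gz],
# or the shorter <sample>_1 / <sample>_2 form.
PAIR_PATTERN = re.compile(
    r"^(?P<sample>.+?)_R?(?P<read>[12])(?:_\d+)?\.(?:fastq|fq)(?:\.gz)?$"
)


def find_unpaired(fastq_dir):
    """Return the sorted sample names that lack either an R1 or an R2 file."""
    reads = defaultdict(set)
    for path in Path(fastq_dir).iterdir():
        match = PAIR_PATTERN.match(path.name)
        if match:
            # Record which mates (1 and/or 2) were seen for this sample.
            reads[match.group("sample")].add(match.group("read"))
    return sorted(name for name, found in reads.items() if found != {"1", "2"})
```

Calling `find_unpaired("fastq")` returns an empty list when every matched sample has both mates. Files that do not match the assumed pattern are ignored, so adjust `PAIR_PATTERN` if your files are named differently.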

3. Set required parameters:

Important: All pipeline parameters must be set in the params.yaml file. Make sure you edit this file to provide the correct paths and values before running the pipeline.

You will also need to download the Kraken2/Bracken viral database from the BenLangmead Index zone and point kraken2_db at its location.

# Input/Output paths
input_dir: "/path/to/fastq"
output_dir: "/path/to/output_dir"

# References path, default reference directory, DO NOT change.
refs_dir: "${projectDir}/references"

# Database path
kraken2_db: "/path/to/kraken2_db"

# Resource configuration, default number of threads per process
threads: 32

# Human scrubber processing option, set to true for HPC environments
parallel_hrrt: false

# Quality control thresholds
qc_thresholds:
    min_coverage: 90
    min_depth: 15

Please see the notes on the reference sequences used in this pipeline.

🛠️ Pipeline Steps

Quality Control
- Human Read Removal - sra-human-scrubber
- Read QC and trimming - fastp
Taxonomic Classification
- Read classification - kraken2
Assembly
- Reference alignment - bwa
- SAM/BAM processing - samtools
- Variant calling & consensus - ivar
Quality Assessment
- Assembly evaluation - quast
- Report generation - multiqc

📂 Output Structure

output_dir/
├── dehosted/         # Cleaned reads
├── trimmed/          # Trimmed reads
├── kraken2/          # Classification results
├── alignments/       # SAM/BAM files & indices
├── stats/            # Alignment statistics
├── variants/         # Variant calls
├── consensus/        # Consensus sequences
├── quast/            # Assembly metrics
├── multiqc/          # Combined QC report
└── summary_report.tsv

📋 Summary Report Metrics

Sample and reference identifiers
Cleaned read counts
Classification read counts
Mapping statistics
Coverage metrics
Variant counts
Assembly quality metrics
Overall QC status
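
The summary report is a tab-separated file, so it can be inspected with Python's standard library. The sketch below loads it into a list of dictionaries without assuming any particular column names, since the exact headers are not listed here. The helper name `load_summary` is hypothetical.

```python
import csv


def load_summary(tsv_path):
    """Read a tab-separated report into a list of {column: value} dicts."""
    with open(tsv_path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

For example, `rows = load_summary("output_dir/summary_report.tsv")` followed by `print(list(rows[0]))` lists the column names actually present in your report. All values are returned as strings.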

🐛 Troubleshooting

Pipeline Errors:
Check Nextflow execution logs in .nextflow.log

Low Coverage Regions:
Regions with low coverage (<10x) will be filled with 'N' in consensus sequences.

Quality Thresholds:
Default quality thresholds can be modified in params.yaml as needed.
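
Since low-coverage positions are written as 'N' in the consensus, one quick way to gauge how much of a genome was masked is to count the N bases per FASTA record. The following is a minimal sketch for that purpose; the function name `n_fraction` is hypothetical and the script is not part of Juno.

```python
def n_fraction(fasta_path):
    """Map each FASTA record ID to the fraction of its bases that are N."""
    counts = {}  # record id -> [n_count, total_length]
    current = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the record ID.
                parts = line[1:].split()
                current = parts[0] if parts else ""
                counts[current] = [0, 0]
            elif current is not None:
                counts[current][0] += line.upper().count("N")
                counts[current][1] += len(line)
    return {
        record: (n / total if total else 0.0)
        for record, (n, total) in counts.items()
    }
```

A record with a high N fraction points to a sample or segment with poor coverage, which is worth cross-checking against the files in `stats/`.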

🤝 Contributing

We welcome contributions to make Juno better! Feel free to open issues or submit pull requests to suggest any additional features or enhancements!

📧 Contact

Email: bphl-sebioinformatics@flhealth.gov

⚖️ License

Juno is licensed under the MIT License.
