Updated citation, description and installation guide. #145

Merged 2 commits on Apr 7, 2022
12 changes: 6 additions & 6 deletions README.md
@@ -4,9 +4,9 @@
[![CI Status](https://github.com/qiyunzhu/woltka/actions/workflows/main.yml/badge.svg)](https://github.com/qiyunzhu/woltka/actions)
[![Coverage Status](https://coveralls.io/repos/github/qiyunzhu/woltka/badge.svg?branch=master)](https://coveralls.io/github/qiyunzhu/woltka?branch=master)

-**Woltka** (Web of Life Toolkit App), is a bioinformatics package for shotgun metagenome data analysis. It takes full advantage of, and is not limited by, the [WoL](https://biocore.github.io/wol/) reference phylogeny. It bridges first-pass sequence aligners with advanced analytical platforms (such as QIIME 2). Highlights of this program include:
+**Woltka** is a versatile program for determining the composition and functional capacity of microbiomes. It mainly works with shotgun metagenomic data. It bridges first-pass sequence aligners with advanced analytical platforms (such as QIIME 2). It takes full advantage of, and is not limited by, the [WoL](https://biocore.github.io/wol/) reference database. Highlights of this program include:

-- OGU: fine-grained community ecology.
+- [OGU](https://journals.asm.org/doi/10.1128/msystems.00167-22): fine-grained community ecology.
- Tree-based, rank-free classification.
- Combined taxonomic & functional analysis.

@@ -62,7 +62,7 @@ Woltka does NOT **analyze** profiles. We recommend using [QIIME 2](https://qiime

## Installation

-Requirement: [Python](https://www.python.org/) 3.6 or above, with Python package [biom-format](http://biom-format.org/) installed.
+Requirement: [Python](https://www.python.org/) 3.6 or above.

```bash
pip install woltka
@@ -133,11 +133,11 @@ One can also combine taxonomic and functional profilings in a **stratification**

## Citation

-The first manuscript describing Woltka has been preprinted at:
+The first paper describing Woltka was published at:

-- Zhu Q, Huang S, Gonzalez A, McGrath I, McDonald D, Haiminen N, et al. [OGUs enable effective, phylogeny-aware analysis of even shallow metagenome community structures.](https://www.biorxiv.org/content/10.1101/2021.04.04.438427v1) _bioRxiv_. 2021. doi: https://doi.org/10.1101/2021.04.04.438427.
+- Zhu Q, Huang S, Gonzalez A, McGrath I, McDonald M, Haiminen N, Armstrong G, et al. [Phylogeny-aware analysis of metagenome community ecology based on matched reference genomes while bypassing taxonomy.](https://journals.asm.org/doi/10.1128/msystems.00167-22) _mSystems_. 2022. e00167-22.

-Note: This manuscript focuses on the [OGU analysis](doc/ogu.md). Although it does not discuss other functions of Woltka, it is so far the only citable article if you use Woltka in your studies.
+Note: This paper focuses on the [OGU analysis](doc/ogu.md). Although it does not discuss other functions of Woltka, it is so far the only citable paper if you use Woltka in your studies.


## Contact
1 change: 0 additions & 1 deletion doc/install.md
@@ -13,7 +13,6 @@ We recommend [Conda](https://docs.conda.io/en/latest/) for managing Python versi
```bash
conda create -n woltka python=3
conda activate woltka
-conda install -c conda-forge cython biom-format
```

If you already have a [QIIME 2](https://qiime2.org/) environment, these steps can be omitted as the dependencies are already included. See [details](../woltka/q2).
7 changes: 4 additions & 3 deletions doc/ogu.md
@@ -2,11 +2,12 @@

The notion of “**OGU**” (operational genomic unit) is the minimal unit for community ecology studies based on shotgun metagenome or other forms of whole-genome microbiome data. OGUs are simply the reference genomes to which input sequences are aligned. There is no need to assign taxonomy to them. This is in contrast to conventional practices, in which analyses are based on taxonomic units such as genera or species. Therefore, OGU is analogous to ASV in 16S rRNA studies.

-The advantage of using OGUs includes 1) highest-possible resolution, 2) independent from taxonomy which is coarse and error-prone as a classification system. 3) allowing for phylogeny-based analysis such as Faith’s PD and UniFrac. The last part is enhanced by the Web of Life ([WoL](https://biocore.github.io/wol/)) reference phylogeny.
+The advantage of using OGUs includes 1) highest-possible resolution, 2) independent from taxonomy which is coarse and error-prone as a classification system. 3) allowing for phylogeny-based analysis such as Faith’s PD and UniFrac. The last part is enhanced by the "Web of Life" ([WoL](https://biocore.github.io/wol/)) reference phylogeny.

-Our manuscript introducing the OGU analysis has been preprinted at:
+The OGU analysis was explained, benchmarked and discussed in:

+- Zhu Q, Huang S, Gonzalez A, McGrath I, McDonald M, Haiminen N, Armstrong G, et al. [Phylogeny-aware analysis of metagenome community ecology based on matched reference genomes while bypassing taxonomy.](https://journals.asm.org/doi/10.1128/msystems.00167-22) _mSystems_. 2022. e00167-22.

-- Zhu Q, Huang S, Gonzalez A, McGrath I, McDonald D, Haiminen N, et al. [OGUs enable effective, phylogeny-aware analysis of even shallow metagenome community structures.](https://www.biorxiv.org/content/10.1101/2021.04.04.438427v1) _bioRxiv_. 2021. doi: https://doi.org/10.1101/2021.04.04.438427.


## Contents
2 changes: 1 addition & 1 deletion doc/wol.md
@@ -70,7 +70,7 @@ fastp -l 100 -i forward.fastq -I reverse.fastq -o forward_trimmed.fastq -O rever

## The OGU analysis

-**OGU** (operational genomic unit) ([Zhu et al., 2021](https://www.biorxiv.org/content/10.1101/2021.04.04.438427v1)) is a notion we proposed to define the minimum unit of microbiome composition allowed by shotgun metagenomic data. OGUs are reference genomes to which any input sequences have matches. This maximizes the resolution of microbiome composition, and allows for phylogeny-aware analyses using the WoL reference phylogeny. See [details](ogu.md).
+**OGU** (operational genomic unit) ([Zhu et al., 2022](https://journals.asm.org/doi/10.1128/msystems.00167-22)) is a notion we proposed to define the minimum unit of microbiome composition allowed by shotgun metagenomic data. OGUs are reference genomes to which any input sequences have matches. This maximizes the resolution of microbiome composition, and allows for phylogeny-aware analyses using the WoL reference phylogeny. See [details](ogu.md).

In Woltka, an OGU analysis is simply a classification process without a classification system. In such a case, the output features will just be the subjects in the alignment file, namely reference genomes.
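
To make the idea of "subjects as features" concrete, here is a minimal Python sketch. It is not Woltka's code. It only tallies the subject column of a toy query-to-subject map. The function name `ogu_counts`, the genome IDs, and the choice to split a read evenly among all subjects it matches are assumptions made for this illustration.

```python
from collections import defaultdict


def ogu_counts(mapping_lines):
    """Tally reads per subject (reference genome) from query<TAB>subject lines.

    Hypothetical helper for illustration only; not part of Woltka.
    """
    # Collect the set of subjects matched by each query sequence.
    hits = defaultdict(set)
    for line in mapping_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        query, subject = line.split("\t")[:2]
        hits[query].add(subject)

    # Assumption of this sketch: each query contributes one count in total,
    # divided evenly among the subjects it matches.
    counts = defaultdict(float)
    for subjects in hits.values():
        share = 1.0 / len(subjects)
        for subject in subjects:
            counts[subject] += share
    return dict(counts)


# Toy alignment map: read r3 matches two reference genomes.
toy_map = [
    "r1\tG000001",
    "r2\tG000001",
    "r3\tG000002",
    "r3\tG000001",
]
profile = ogu_counts(toy_map)
print(profile)
```

No taxonomy or other classification system appears anywhere in the sketch: the keys of `profile` are just the subject IDs found in the alignment, which is the point the paragraph above makes.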
