diff --git a/README.md b/README.md
index a2b6c52..0569fc2 100644
--- a/README.md
+++ b/README.md
@@ -8,9 +8,17 @@ Canonically pronouced *nice* compositional statistics and visualization toolbox
+gneiss is a compositional statistics and visualization toolbox.
+
+# Examples
+
+IPython notebooks demonstrating some of the modules in gneiss can be found as follows
+
+* [What are balances](https://github.com/biocore/gneiss/blob/master/ipynb/balance_trees.ipynb)
+* [Linear regression on balances in the 88 soils](https://github.com/biocore/gneiss/blob/master/ipynb/88soils.ipynb)
+
 Note that gneiss is not compatible with python 2, and is compatible with Python 3.4 or later.
-gneiss is currently in alpha. We are actively developing it, and __backward-incompatible interface changes can and will arise__.
+gneiss is currently in alpha. We are actively developing it, and __backward-incompatible interface changes may arise__.
 # Installation
@@ -33,11 +41,3 @@ source activate gneiss
 conda install seaborn h5py
 pip install biom-format
 ```
-
-# Examples
-
-IPython notebooks demonstrating some of the modules in gneiss can be found as follows
-
-* [What are balances](https://github.com/biocore/gneiss/blob/master/ipynb/balance_trees.ipynb)
-* [Linear regression on balances in the 88 soils](https://github.com/biocore/gneiss/blob/master/ipynb/88soils.ipynb)
-
diff --git a/ipynb/88soils.ipynb b/ipynb/88soils.ipynb
index ff47983..a10a5de 100644
--- a/ipynb/88soils.ipynb
+++ b/ipynb/88soils.ipynb
@@ -65,8 +65,8 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
- "Now we will be filtering out OTUs that are lower abundance. We'll set the threshold to be 100 reads.\n",
- "This is because there is going to be a lot of garbage OTUs due to contamination, sequencing error, or clustering errors. We feel that the 100 read filter is conservative enough to demonstrate the utility of this tool."
+ "Now we will be filtering out OTUs that are lower abundance. We'll set the threshold to be __100__ reads.\n",
+ "This is because there is going to be a lot of garbage OTUs due to contamination, sequencing error, or clustering errors. We feel that the 100 read filter is conservative enough to demonstrate the utility of this tool. Of course, different filtering criteria will be carefully chosen by the user, as an appropriate choice of filter will depend on the study."
 ]
 },
 {
@@ -191,8 +191,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
- "We'll want to fit a quartic linear model to each of the balances individually with respect to pH. For now, we'll\n",
- "define 3 more variables encoding different powers of pH as follows."
+ "We'll want to fit a quartic linear model to each of the balances individually with respect to pH. This model was chosen, because in the original paper, there was a horseshoe shape observed. So, it would make sense to use a parabolic function to fit each balance. Empirically, we found that a 4th degree polynomial gave the best results. For now, we'll define 3 more variables encoding different powers of pH as follows."
 ]
 },
 {
@@ -230,7 +229,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
- "Finally, we'll want to sort both the rows and columns the original table according to the pH gradient. We've defined\n",
+ "Finally, we'll want to __sort__ both the rows and columns the original table according to the pH gradient. We've defined\n",
 "the function [`niche_sort`](https://github.com/biocore/gneiss/blob/master/gneiss/sort.py#L67) to handle this."
 ]
 },
@@ -329,6 +328,23 @@
 "source": [
 "sns.heatmap(predicted_table.T, robust=True)"
 ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "From this, it is clear that the linear regression on balances can capture the overall trends of OTUs vs pH. \n",
+ "Ecologically, this makes sense. There cannot exist bacteria that are optimized to live in every possible pH environment. So it isn't entirely surprising that microbial abundances can be predicted from pH. At the same time, the pattern was not obviously apparent until linear regressions on balances were applied."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "collapsed": true
+ },
+ "outputs": [],
+ "source": []
 }
 ],
 "metadata": {