MRG: fix tutorial notebook links #2633

Merged 9 commits on Jun 22, 2023
4 changes: 3 additions & 1 deletion binder/environment.yml
@@ -3,10 +3,12 @@ channels:
- bioconda
- defaults
dependencies:
- sourmash
- python>=3.9
- sourmash>=4.8.2
- screed
- matplotlib
- pandas
- pip
- pip:
- matplotlib_venn
- mmh3
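
The hunk above raises the floor to `python>=3.9` and `sourmash>=4.8.2`. A minimal sketch of checking those two pins from inside a running notebook, assuming the package is installed under the distribution name `sourmash`; the helper names `release_tuple` and `environment_ok` are hypothetical and not part of this PR:

```python
import re
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)       # from "python>=3.9" in binder/environment.yml
MIN_SOURMASH = (4, 8, 2)  # from "sourmash>=4.8.2" in binder/environment.yml


def release_tuple(version: str) -> tuple:
    """Return the leading major.minor.patch numbers of a version string.

    Pre-release and local suffixes (for example "a4.dev12+g31c5eda2") are
    ignored, so this is a coarse comparison, not full version parsing.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def environment_ok() -> bool:
    """True if the interpreter and the installed sourmash meet the pins."""
    if sys.version_info[:2] < MIN_PYTHON:
        return False
    try:
        installed = metadata.version("sourmash")
    except metadata.PackageNotFoundError:
        # sourmash is not installed in this environment.
        return False
    return release_tuple(installed) >= MIN_SOURMASH


print("environment satisfies the binder pins:", environment_ok())
```

Because the suffix is dropped, a development build string such as the old `4.0.0a4.dev12+g31c5eda2` is compared as `(4, 0, 0)`, which is below the new minimum.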
2 changes: 1 addition & 1 deletion doc/index.md
@@ -79,7 +79,7 @@ be stored, searched, explored, and taxonomically annotated.
You can take a look at sourmash analyses on real data
[in a saved Jupyter notebook](https://github.com/sourmash-bio/sourmash/blob/latest/doc/sourmash-examples.ipynb),
and experiment with it yourself
[interactively in a Jupyter Notebook](https://mybinder.org/v2/gh/sourmash-bio/sourmash/latest?filepath=doc%2Fsourmash-examples.ipynb)
[interactively in a Jupyter Notebook](https://mybinder.org/v2/gh/sourmash-bio/sourmash/latest?labpath=doc%2Fsourmash-examples.ipynb)
at [mybinder.org](http://mybinder.org).

## Installing sourmash
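
The link fix in this file, repeated in the notebooks further down, swaps the `filepath` query parameter for `labpath`, which as far as I understand is the parameter Binder uses to open a file in JupyterLab. A small sketch of building such a link programmatically; `binder_url` is a hypothetical helper, and the repository, ref, and path are simply the values from the hunk above:

```python
from urllib.parse import urlencode

BINDER_BASE = "https://mybinder.org/v2/gh"


def binder_url(repo: str, ref: str, notebook_path: str) -> str:
    """Build a mybinder.org launch URL that points at one notebook."""
    # urlencode percent-encodes "/" as %2F, matching the links in this diff.
    query = urlencode({"labpath": notebook_path})
    return f"{BINDER_BASE}/{repo}/{ref}?{query}"


print(binder_url("sourmash-bio/sourmash", "latest", "doc/sourmash-examples.ipynb"))
```

Generating the links this way keeps the `%2F` escaping consistent across notebooks.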
28 changes: 14 additions & 14 deletions doc/kmers-and-minhash.ipynb

Large diffs are not rendered by default.

114 changes: 76 additions & 38 deletions doc/plotting-compare.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion doc/sourmash-collections.ipynb
@@ -9,7 +9,7 @@
"### Running this notebook.\n",
"\n",
"You can run this notebook interactively via mybinder; click on this button:\n",
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?filepath=doc%2Fsourmash-collections.ipynb)\n",
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?labpath=doc%2Fsourmash-collections.ipynb)\n",
"\n",
"A rendered version of this notebook is available at [sourmash.readthedocs.io](https://sourmash.readthedocs.io) under \"Tutorials and notebooks\".\n",
"\n",
6 changes: 3 additions & 3 deletions doc/sourmash-examples.ipynb
@@ -17,7 +17,7 @@
"### Running this notebook.\n",
"\n",
"You can run this notebook interactively via mybinder; click on this button:\n",
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?filepath=doc%2Fsourmash-examples.ipynb)\n",
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?labpath=doc%2Fsourmash-examples.ipynb)\n",
"\n",
"A rendered version of this notebook is available at [sourmash.readthedocs.io](https://sourmash.readthedocs.io) under \"Tutorials and notebooks\".\n",
"\n",
@@ -510,9 +510,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python (myenv)",
"display_name": "smash-notebooks",
"language": "python",
"name": "myenv"
"name": "smash-notebooks"
},
"language_info": {
"codemirror_mode": {
16 changes: 8 additions & 8 deletions doc/tutorials.md
@@ -13,22 +13,22 @@ X and Linux. They require about 5 GB of disk space and 5 GB of RAM.

## Background and details

These next four tutorials are all notebooks that you can view, run
These next three tutorials are all notebooks that you can view, run
yourself, or run interactively online via the
[binder](https://mybinder.org) service.

* [An introduction to k-mers for genome comparison and analysis](kmers-and-minhash.md)
* [An introduction to k-mers for genome comparison and analysis.](kmers-and-minhash.ipynb)

* [Some sourmash command line examples!](sourmash-examples.md)
* [Some sourmash command line examples!](sourmash-examples.ipynb)

* [Working with private collections of signatures.](sourmash-collections.md)
* [Working with private collections of signatures.](sourmash-collections.ipynb)

* [Using `sourmash taxonomy` with the LIN taxonomic framework.](tutorial-lin-taxonomy.md)

## More information
## Advanced tutorials and more information

For more information on analyzing sequencing data with sourmash, check out our [longer tutorial](tutorial-long.md).

Read [using `sourmash taxonomy` with the Life Identification Number (LIN) taxonomic framework](tutorial-lin-taxonomy.md) for some of our newer taxonomic features.

If you are a Python programmer, you might also be interested in our [API examples](api-example.md) as well as a short guide to [Using the `LCA_Database` API.](using-LCA-database-API.ipynb)

If you prefer R, we have [a short guide to using sourmash output with R](other-languages.md).
@@ -37,7 +37,7 @@ If you prefer R, we have [a short guide to using sourmash output with R](other-l

If you're interested in customizing the output of `sourmash plot`,
which produces comparison matrices and dendrograms, please see
[Building plots from `sourmash compare` output](plotting-compare.md).
[Building plots from `sourmash compare` output](plotting-compare.ipynb).

## Contents:

62 changes: 27 additions & 35 deletions doc/using-LCA-database-API.ipynb
@@ -24,7 +24,7 @@
"### Running this notebook.\n",
"\n",
"You can run this notebook interactively via mybinder; click on this button:\n",
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?filepath=doc%2Fusing-LCA-database-API.ipynb)\n",
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?labpath=doc%2Fusing-LCA-database-API.ipynb)\n",
"\n",
"A rendered version of this notebook is available at [sourmash.readthedocs.io](https://sourmash.readthedocs.io) under \"Tutorials and notebooks\".\n",
"\n",
@@ -74,14 +74,14 @@
"text": [
"\r",
"\u001b[K\r\n",
"== This is sourmash version 4.0.0a4.dev12+g31c5eda2. ==\r\n",
"== This is sourmash version 4.8.2. ==\r\n",
"\r",
"\u001b[K== Please cite Brown and Irber (2016), doi:10.21105/joss.00027. ==\r\n",
"\r\n",
"\r",
"\u001b[Kcomputing signatures for files: genomes/akkermansia.fa, genomes/shew_os185.fa, genomes/shew_os223.fa\r\n",
"\r",
"\u001b[KComputing a total of 1 signature(s).\r\n",
"\u001b[KComputing a total of 1 signature(s) for each input.\r\n",
"\r",
"\u001b[Kskipping genomes/akkermansia.fa - already done\r\n",
"\r",
@@ -144,9 +144,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[(1.0,\n",
" SourmashSignature('CP001071.1 Akkermansia muciniphila ATCC BAA-835, complete genome', 6822e0b7),\n",
" None)]\n"
"[Result(score=1.0, signature=SourmashSignature('CP001071.1 Akkermansia muciniphila ATCC BAA-835, complete genome', 6822e0b7), location=None)]\n"
]
}
],
@@ -164,12 +162,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[(1.0,\n",
" SourmashSignature('NC_009665.1 Shewanella baltica OS185, complete genome', b47b13ef),\n",
" None),\n",
" (0.22846441947565543,\n",
" SourmashSignature('NC_011663.1 Shewanella baltica OS223, complete genome', ae6659f6),\n",
" None)]\n"
"[Result(score=1.0, signature=SourmashSignature('NC_009665.1 Shewanella baltica OS185, complete genome', b47b13ef), location=None),\n",
" Result(score=0.22846441947565543, signature=SourmashSignature('NC_011663.1 Shewanella baltica OS223, complete genome', ae6659f6), location=None)]\n"
]
}
],
@@ -186,14 +180,12 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[(1.0,\n",
" SourmashSignature('NC_011663.1 Shewanella baltica OS223, complete genome', ae6659f6),\n",
" None)]\n"
"Result(score=1.0, signature=SourmashSignature('NC_011663.1 Shewanella baltica OS223, complete genome', ae6659f6), location=None)\n"
]
}
],
"source": [
"pprint(db.gather(sig3))"
"pprint(db.best_containment(sig3))"
]
},
{
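
In the three output hunks above, search results change from bare `(score, signature, location)` tuples to `Result(...)` records, and the last call changes from `db.gather(sig3)`, which printed a list, to `db.best_containment(sig3)`, which prints a single record. A sketch of what that means for calling code, assuming `Result` behaves like a `collections.namedtuple`; the class below is a stand-in defined here, not the sourmash class, and the signatures are plain strings:

```python
from collections import namedtuple

# Stand-in for the Result records shown in the notebook output. The real
# class is defined inside sourmash and may differ in detail.
Result = namedtuple("Result", ["score", "signature", "location"])

search_results = [
    Result(1.0, "NC_009665.1 Shewanella baltica OS185", None),
    Result(0.22846441947565543, "NC_011663.1 Shewanella baltica OS223", None),
]

# Named access reads more clearly than indexing into a bare tuple.
for result in search_results:
    print(f"{result.score:.3f}  {result.signature}")

# Positional unpacking, as older code written against tuples would do,
# still works with a namedtuple.
score, signature, location = search_results[0]

# A single-record return value is used directly instead of being indexed.
best = search_results[0]
print(best.signature)
```

If the real `Result` is not tuple-like, the unpacking line is the part that would need to change.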
@@ -251,7 +243,7 @@
}
],
"source": [
"pprint(db.ident_to_idx.keys())"
"pprint(db._ident_to_idx.keys())"
]
},
{
@@ -271,7 +263,7 @@
}
],
"source": [
"pprint(db.ident_to_name)"
"pprint(db._ident_to_name)"
]
},
{
@@ -285,7 +277,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The attribute `hashval_to_idx` contains a mapping from individual hash values to sets of `idx` indices.\n",
"The attribute `_hashval_to_idx` contains a mapping from individual hash values to sets of `idx` indices.\n",
"\n",
"See the method `_find_signatures()` for an example of how this is used in `search` and `gather`."
]
@@ -304,7 +296,7 @@
}
],
"source": [
"print('{} hash values total in this database'.format(len(db.hashval_to_idx)))"
"print('{} hash values total in this database'.format(len(db._hashval_to_idx)))"
]
},
{
@@ -322,7 +314,7 @@
],
"source": [
"all_idx = set()\n",
"for idx_set in db.hashval_to_idx.values():\n",
"for idx_set in db._hashval_to_idx.values():\n",
" all_idx.update(idx_set)\n",
"print('belonging to signatures with idx {}'.format(all_idx))"
]
@@ -333,7 +325,7 @@
"metadata": {},
"outputs": [],
"source": [
"first_three_hashvals = list(db.hashval_to_idx)[:3]"
"first_three_hashvals = list(db._hashval_to_idx)[:3]"
]
},
{
@@ -353,7 +345,7 @@
],
"source": [
"for hashval in first_three_hashvals:\n",
" print('hashval {} belongs to idxs {}'.format(hashval, db.hashval_to_idx[hashval]))"
" print('hashval {} belongs to idxs {}'.format(hashval, db._hashval_to_idx[hashval]))"
]
},
{
@@ -374,16 +366,16 @@
"source": [
"query_idx = 2\n",
"hashval_set = set()\n",
"for hashval, idx_set in db.hashval_to_idx.items():\n",
"for hashval, idx_set in db._hashval_to_idx.items():\n",
" if query_idx in idx_set:\n",
" hashval_set.add(hashval)\n",
" \n",
"print('{} hashvals belong to query idx {}'.format(len(hashval_set), query_idx))\n",
"\n",
"ident = db.idx_to_ident[query_idx]\n",
"ident = db._idx_to_ident[query_idx]\n",
"print('query idx {} matches to ident {}'.format(query_idx, ident))\n",
"\n",
"name = db.ident_to_name[ident]\n",
"name = db._ident_to_name[ident]\n",
"print('query idx {} matches to name {}'.format(query_idx, name))"
]
},
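
The cell in the hunk above walks the database's inverted index backwards: it scans `_hashval_to_idx` for every hash value whose idx set contains the query, then resolves the idx to an identifier and a name. The same logic can be tried without a database using plain dictionaries. Everything below is toy data made up for illustration, and the unprefixed variable names are local stand-ins for the private attributes:

```python
# Toy stand-ins for the private LCA_Database mappings; all values are invented.
hashval_to_idx = {
    1001: {0},
    1002: {1, 2},
    1003: {2},
    1004: {0, 2},
}
idx_to_ident = {0: "ident-A", 1: "ident-B", 2: "ident-C"}
ident_to_name = {
    "ident-A": "toy genome A",
    "ident-B": "toy genome B",
    "ident-C": "toy genome C",
}


def hashvals_for_idx(query_idx):
    """Collect every hash value whose idx set contains query_idx."""
    return {
        hashval
        for hashval, idx_set in hashval_to_idx.items()
        if query_idx in idx_set
    }


query_idx = 2
hashval_set = hashvals_for_idx(query_idx)
ident = idx_to_ident[query_idx]
name = ident_to_name[ident]

print(f"{len(hashval_set)} hashvals belong to query idx {query_idx}")
print(f"query idx {query_idx} matches to name {name}")
```

The scan is linear in the number of hash values, which is fine for exploration but is presumably why the database keeps the forward mapping for actual searches.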
@@ -400,7 +392,7 @@
"metadata": {},
"outputs": [],
"source": [
"from sourmash.lca import LineagePair"
"from sourmash.lca.lca_utils import LineagePair"
]
},
{
@@ -455,13 +447,13 @@
"source": [
"# by default, the identifier is the signature name --\n",
"ident = sig1.name\n",
"idx = db.ident_to_idx[ident]\n",
"idx = db._ident_to_idx[ident]\n",
"print(\"ident '{}' has idx {}\".format(ident, idx))\n",
"\n",
"lid = db.idx_to_lid[idx]\n",
"lid = db._idx_to_lid[idx]\n",
"print(\"lid for idx {} is {}\".format(idx, lid))\n",
"\n",
"lineage = db.lid_to_lineage[lid]\n",
"lineage = db._lid_to_lineage[lid]\n",
"display = sourmash.lca.display_lineage(lineage)\n",
"print(\"lineage for lid {} is {}\".format(lid, display))"
]
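
The cell in the hunk above resolves a lineage in three hops: identifier to `idx`, `idx` to lineage id (`lid`), and `lid` to a tuple of `LineagePair` entries. A toy sketch of that chain, assuming `LineagePair` is a simple `(rank, name)` record; the namedtuple and the `show_lineage` helper below are stand-ins defined here, not the sourmash implementations, and the numeric ids are invented:

```python
from collections import namedtuple

# Stand-in for sourmash's LineagePair; assumed to carry a rank and a name.
LineagePair = namedtuple("LineagePair", ["rank", "name"])

# Invented mappings that mirror the shape of the private attributes.
ident_to_idx = {"toy genome": 0}
idx_to_lid = {0: 7}
lid_to_lineage = {
    7: (
        LineagePair("superkingdom", "Bacteria"),
        LineagePair("phylum", "Verrucomicrobia"),
    ),
}


def show_lineage(lineage):
    """Join the names of a lineage with ';' (hypothetical display helper)."""
    return ";".join(pair.name for pair in lineage)


def lineage_for_ident(ident):
    """Follow ident -> idx -> lid -> lineage and return a display string."""
    idx = ident_to_idx[ident]
    lid = idx_to_lid[idx]
    return show_lineage(lid_to_lineage[lid])


print(lineage_for_ident("toy genome"))
```

Keeping lineages behind an integer `lid` means many signatures can share one stored lineage tuple.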
@@ -725,8 +717,8 @@
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/t/miniconda3/envs/py37/lib/python3.7/site-packages/ipykernel_launcher.py:1: DeprecatedWarning: get_mins is deprecated as of 3.5 and will be removed in 5.0. Use .hashes property instead.\n",
" \"\"\"Entry point for launching an IPython kernel.\n"
"/var/folders/6s/_f373w1d6hdfjc2kjstq97s80000gp/T/ipykernel_3384/490137846.py:1: DeprecatedWarning: get_mins is deprecated as of 3.5 and will be removed in 5.0. Use .hashes property instead.\n",
" assignments = sourmash.lca.gather_assignments(sig2.minhash.get_mins(), [db])\n"
]
}
],
@@ -834,9 +826,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python (myenv)",
"display_name": "smash-notebooks",
"language": "python",
"name": "myenv"
"name": "smash-notebooks"
},
"language_info": {
"codemirror_mode": {
@@ -848,7 +840,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.11.3"
}
},
"nbformat": 4,