MRG: fix tutorial notebook links (#2633)
Fixes #2604

This PR fixes links to binder tutorials that apparently have been broken
since 2020 😱, when the docs were switched over to myst-parser in
#1021.

To be fair, it's the kind of subtle bug that would leave users
scratching their heads going "maybe it's me?" - the links simply went
back to the tutorial page... not sure why they weren't flagged by Sphinx
or myst-parser as being bad!?

This PR also updates environment.yml so that it builds properly; it now:
* includes pip
* specifies minimum python version
* specifies minimum sourmash version

See #2503 for why specifying the minimum sourmash and/or Python versions
can be important!
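
As a rough illustration of why a minimum version pin matters, the sketch below compares version strings by their numeric release components. The helper names `release_tuple` and `meets_minimum` are hypothetical (they are not part of sourmash or conda), and the comparison deliberately ignores pre-release suffixes, so treat it as a simplification and not a full version parser.

```python
import re


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release components, e.g. '4.8.2' -> (4, 8, 2).

    Simplification: anything after the numeric prefix (such as 'a4.dev12')
    is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Check whether `installed` is at least `minimum`, comparing numerically."""
    have = release_tuple(installed)
    want = release_tuple(minimum)
    # Pad the shorter tuple with zeros so '3.9' compares like '3.9.0'.
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return have >= want


# Version strings taken from the notebook output and metadata in this commit.
for installed, minimum in [
    ("4.0.0a4.dev12+g31c5eda2", "4.8.2"),
    ("4.8.2", "4.8.2"),
    ("3.7.6", "3.9"),
    ("3.11.3", "3.9"),
]:
    print(installed, ">=", minimum, "?", meets_minimum(installed, minimum))
```

Comparing tuples of integers avoids the trap of plain string comparison, where "3.11.3" would sort before "3.9".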

And, finally, I ran and fixed all the notebooks...

Note: I confirmed that this branch launches properly in binder. The URL
is: https://mybinder.org/v2/gh/sourmash-bio/sourmash/fix/tutorial-nb
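
The notebook links updated in this commit swap Binder's `filepath` query parameter for `labpath`. A minimal sketch of how such a link can be assembled is shown here; the function name `binder_lab_url` is hypothetical, and the URL layout is simply copied from the links in the diff.

```python
from urllib.parse import quote


def binder_lab_url(org: str, repo: str, ref: str, notebook_path: str) -> str:
    """Build a mybinder.org link that opens `notebook_path` via `labpath`."""
    # safe="" makes quote() percent-encode "/" as %2F, matching the diff links.
    encoded_path = quote(notebook_path, safe="")
    return f"https://mybinder.org/v2/gh/{org}/{repo}/{ref}?labpath={encoded_path}"


url = binder_lab_url(
    "sourmash-bio", "sourmash", "latest", "doc/sourmash-examples.ipynb"
)
print(url)
```

With these arguments the result matches the updated link in doc/index.md.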

---------

Co-authored-by: ccbaumler <63077899+ccbaumler@users.noreply.github.com>
ctb and ccbaumler authored Jun 22, 2023
1 parent c81d7ea commit 81a4a30
Showing 8 changed files with 133 additions and 101 deletions.
4 changes: 3 additions & 1 deletion binder/environment.yml
@@ -3,10 +3,12 @@ channels:
   - bioconda
   - defaults
 dependencies:
-  - sourmash
+  - python>=3.9
+  - sourmash>=4.8.2
   - screed
   - matplotlib
   - pandas
+  - pip
   - pip:
     - matplotlib_venn
     - mmh3
2 changes: 1 addition & 1 deletion doc/index.md
@@ -79,7 +79,7 @@ be stored, searched, explored, and taxonomically annotated.
You can take a look at sourmash analyses on real data
[in a saved Jupyter notebook](https://github.com/sourmash-bio/sourmash/blob/latest/doc/sourmash-examples.ipynb),
and experiment with it yourself
- [interactively in a Jupyter Notebook](https://mybinder.org/v2/gh/sourmash-bio/sourmash/latest?filepath=doc%2Fsourmash-examples.ipynb)
+ [interactively in a Jupyter Notebook](https://mybinder.org/v2/gh/sourmash-bio/sourmash/latest?labpath=doc%2Fsourmash-examples.ipynb)
at [mybinder.org](http://mybinder.org).

## Installing sourmash
28 changes: 14 additions & 14 deletions doc/kmers-and-minhash.ipynb

Large diffs are not rendered by default.

114 changes: 76 additions & 38 deletions doc/plotting-compare.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion doc/sourmash-collections.ipynb
@@ -9,7 +9,7 @@
"### Running this notebook.\n",
"\n",
"You can run this notebook interactively via mybinder; click on this button:\n",
- "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?filepath=doc%2Fsourmash-collections.ipynb)\n",
+ "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?labpath=doc%2Fsourmash-collections.ipynb)\n",
"\n",
"A rendered version of this notebook is available at [sourmash.readthedocs.io](https://sourmash.readthedocs.io) under \"Tutorials and notebooks\".\n",
"\n",
6 changes: 3 additions & 3 deletions doc/sourmash-examples.ipynb
@@ -17,7 +17,7 @@
"### Running this notebook.\n",
"\n",
"You can run this notebook interactively via mybinder; click on this button:\n",
- "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?filepath=doc%2Fsourmash-examples.ipynb)\n",
+ "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?labpath=doc%2Fsourmash-examples.ipynb)\n",
"\n",
"A rendered version of this notebook is available at [sourmash.readthedocs.io](https://sourmash.readthedocs.io) under \"Tutorials and notebooks\".\n",
"\n",
@@ -510,9 +510,9 @@
],
"metadata": {
"kernelspec": {
- "display_name": "Python (myenv)",
+ "display_name": "smash-notebooks",
"language": "python",
- "name": "myenv"
+ "name": "smash-notebooks"
},
"language_info": {
"codemirror_mode": {
16 changes: 8 additions & 8 deletions doc/tutorials.md
@@ -13,22 +13,22 @@ X and Linux. They require about 5 GB of disk space and 5 GB of RAM.

## Background and details

- These next four tutorials are all notebooks that you can view, run
+ These next three tutorials are all notebooks that you can view, run
yourself, or run interactively online via the
[binder](https://mybinder.org) service.

- * [An introduction to k-mers for genome comparison and analysis](kmers-and-minhash.md)
+ * [An introduction to k-mers for genome comparison and analysis.](kmers-and-minhash.ipynb)

- * [Some sourmash command line examples!](sourmash-examples.md)
+ * [Some sourmash command line examples!](sourmash-examples.ipynb)

- * [Working with private collections of signatures.](sourmash-collections.md)
+ * [Working with private collections of signatures.](sourmash-collections.ipynb)

- * [Using `sourmash taxonomy` with the LIN taxonomic framework.](tutorial-lin-taxonomy.md)

- ## More information
+ ## Advanced tutorials and more information

For more information on analyzing sequencing data with sourmash, check out our [longer tutorial](tutorial-long.md).

+ Read [using `sourmash taxonomy` with the Life Identification Number (LIN) taxonomic framework](tutorial-lin-taxonomy.md) for some of our newer taxonomic features.

If you are a Python programmer, you might also be interested in our [API examples](api-example.md) as well as a short guide to [Using the `LCA_Database` API.](using-LCA-database-API.ipynb)

If you prefer R, we have [a short guide to using sourmash output with R](other-languages.md).
@@ -37,7 +37,7 @@ If you prefer R, we have [a short guide to using sourmash output with R](other-l

If you're interested in customizing the output of `sourmash plot`,
which produces comparison matrices and dendrograms, please see
- [Building plots from `sourmash compare` output](plotting-compare.md).
+ [Building plots from `sourmash compare` output](plotting-compare.ipynb).

## Contents:

62 changes: 27 additions & 35 deletions doc/using-LCA-database-API.ipynb
@@ -24,7 +24,7 @@
"### Running this notebook.\n",
"\n",
"You can run this notebook interactively via mybinder; click on this button:\n",
- "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?filepath=doc%2Fusing-LCA-database-API.ipynb)\n",
+ "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dib-lab/sourmash/latest?labpath=doc%2Fusing-LCA-database-API.ipynb)\n",
"\n",
"A rendered version of this notebook is available at [sourmash.readthedocs.io](https://sourmash.readthedocs.io) under \"Tutorials and notebooks\".\n",
"\n",
@@ -74,14 +74,14 @@
"text": [
"\r",
"\u001b[K\r\n",
- "== This is sourmash version 4.0.0a4.dev12+g31c5eda2. ==\r\n",
+ "== This is sourmash version 4.8.2. ==\r\n",
"\r",
"\u001b[K== Please cite Brown and Irber (2016), doi:10.21105/joss.00027. ==\r\n",
"\r\n",
"\r",
"\u001b[Kcomputing signatures for files: genomes/akkermansia.fa, genomes/shew_os185.fa, genomes/shew_os223.fa\r\n",
"\r",
- "\u001b[KComputing a total of 1 signature(s).\r\n",
+ "\u001b[KComputing a total of 1 signature(s) for each input.\r\n",
"\r",
"\u001b[Kskipping genomes/akkermansia.fa - already done\r\n",
"\r",
@@ -144,9 +144,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "[(1.0,\n",
- " SourmashSignature('CP001071.1 Akkermansia muciniphila ATCC BAA-835, complete genome', 6822e0b7),\n",
- " None)]\n"
+ "[Result(score=1.0, signature=SourmashSignature('CP001071.1 Akkermansia muciniphila ATCC BAA-835, complete genome', 6822e0b7), location=None)]\n"
]
}
],
@@ -164,12 +162,8 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "[(1.0,\n",
- " SourmashSignature('NC_009665.1 Shewanella baltica OS185, complete genome', b47b13ef),\n",
- " None),\n",
- " (0.22846441947565543,\n",
- " SourmashSignature('NC_011663.1 Shewanella baltica OS223, complete genome', ae6659f6),\n",
- " None)]\n"
+ "[Result(score=1.0, signature=SourmashSignature('NC_009665.1 Shewanella baltica OS185, complete genome', b47b13ef), location=None),\n",
+ " Result(score=0.22846441947565543, signature=SourmashSignature('NC_011663.1 Shewanella baltica OS223, complete genome', ae6659f6), location=None)]\n"
]
}
],
@@ -186,14 +180,12 @@
"name": "stdout",
"output_type": "stream",
"text": [
- "[(1.0,\n",
- " SourmashSignature('NC_011663.1 Shewanella baltica OS223, complete genome', ae6659f6),\n",
- " None)]\n"
+ "Result(score=1.0, signature=SourmashSignature('NC_011663.1 Shewanella baltica OS223, complete genome', ae6659f6), location=None)\n"
]
}
],
"source": [
- "pprint(db.gather(sig3))"
+ "pprint(db.best_containment(sig3))"
]
},
{
@@ -251,7 +243,7 @@
}
],
"source": [
- "pprint(db.ident_to_idx.keys())"
+ "pprint(db._ident_to_idx.keys())"
]
},
{
@@ -271,7 +263,7 @@
}
],
"source": [
- "pprint(db.ident_to_name)"
+ "pprint(db._ident_to_name)"
]
},
{
@@ -285,7 +277,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "The attribute `hashval_to_idx` contains a mapping from individual hash values to sets of `idx` indices.\n",
+ "The attribute `_hashval_to_idx` contains a mapping from individual hash values to sets of `idx` indices.\n",
"\n",
"See the method `_find_signatures()` for an example of how this is used in `search` and `gather`."
]
@@ -304,7 +296,7 @@
}
],
"source": [
- "print('{} hash values total in this database'.format(len(db.hashval_to_idx)))"
+ "print('{} hash values total in this database'.format(len(db._hashval_to_idx)))"
]
},
{
@@ -322,7 +314,7 @@
],
"source": [
"all_idx = set()\n",
- "for idx_set in db.hashval_to_idx.values():\n",
+ "for idx_set in db._hashval_to_idx.values():\n",
" all_idx.update(idx_set)\n",
"print('belonging to signatures with idx {}'.format(all_idx))"
]
@@ -333,7 +325,7 @@
"metadata": {},
"outputs": [],
"source": [
- "first_three_hashvals = list(db.hashval_to_idx)[:3]"
+ "first_three_hashvals = list(db._hashval_to_idx)[:3]"
]
},
{
@@ -353,7 +345,7 @@
],
"source": [
"for hashval in first_three_hashvals:\n",
- " print('hashval {} belongs to idxs {}'.format(hashval, db.hashval_to_idx[hashval]))"
+ " print('hashval {} belongs to idxs {}'.format(hashval, db._hashval_to_idx[hashval]))"
]
},
{
@@ -374,16 +366,16 @@
"source": [
"query_idx = 2\n",
"hashval_set = set()\n",
- "for hashval, idx_set in db.hashval_to_idx.items():\n",
+ "for hashval, idx_set in db._hashval_to_idx.items():\n",
" if query_idx in idx_set:\n",
" hashval_set.add(hashval)\n",
" \n",
"print('{} hashvals belong to query idx {}'.format(len(hashval_set), query_idx))\n",
"\n",
- "ident = db.idx_to_ident[query_idx]\n",
+ "ident = db._idx_to_ident[query_idx]\n",
"print('query idx {} matches to ident {}'.format(query_idx, ident))\n",
"\n",
- "name = db.ident_to_name[ident]\n",
+ "name = db._ident_to_name[ident]\n",
"print('query idx {} matches to name {}'.format(query_idx, name))"
]
},
@@ -400,7 +392,7 @@
"metadata": {},
"outputs": [],
"source": [
- "from sourmash.lca import LineagePair"
+ "from sourmash.lca.lca_utils import LineagePair"
]
},
{
@@ -455,13 +447,13 @@
"source": [
"# by default, the identifier is the signature name --\n",
"ident = sig1.name\n",
- "idx = db.ident_to_idx[ident]\n",
+ "idx = db._ident_to_idx[ident]\n",
"print(\"ident '{}' has idx {}\".format(ident, idx))\n",
"\n",
- "lid = db.idx_to_lid[idx]\n",
+ "lid = db._idx_to_lid[idx]\n",
"print(\"lid for idx {} is {}\".format(idx, lid))\n",
"\n",
- "lineage = db.lid_to_lineage[lid]\n",
+ "lineage = db._lid_to_lineage[lid]\n",
"display = sourmash.lca.display_lineage(lineage)\n",
"print(\"lineage for lid {} is {}\".format(lid, display))"
]
@@ -725,8 +717,8 @@
"name": "stderr",
"output_type": "stream",
"text": [
- "/Users/t/miniconda3/envs/py37/lib/python3.7/site-packages/ipykernel_launcher.py:1: DeprecatedWarning: get_mins is deprecated as of 3.5 and will be removed in 5.0. Use .hashes property instead.\n",
- " \"\"\"Entry point for launching an IPython kernel.\n"
+ "/var/folders/6s/_f373w1d6hdfjc2kjstq97s80000gp/T/ipykernel_3384/490137846.py:1: DeprecatedWarning: get_mins is deprecated as of 3.5 and will be removed in 5.0. Use .hashes property instead.\n",
+ " assignments = sourmash.lca.gather_assignments(sig2.minhash.get_mins(), [db])\n"
]
}
],
@@ -834,9 +826,9 @@
],
"metadata": {
"kernelspec": {
- "display_name": "Python (myenv)",
+ "display_name": "smash-notebooks",
"language": "python",
- "name": "myenv"
+ "name": "smash-notebooks"
},
"language_info": {
"codemirror_mode": {
@@ -848,7 +840,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.7.6"
+ "version": "3.11.3"
}
},
"nbformat": 4,
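
The notebook cells above walk from hash values to `idx`, then to `ident`, then to a name. The sketch below mirrors that lookup chain with plain dictionaries so it runs without sourmash installed. All data and the helper names `hashvals_for_idx` and `name_for_idx` are hypothetical stand-ins; only the shape of the mappings (hash value to set of `idx`, `idx` to `ident`, `ident` to name) follows the notebook text.

```python
# Hypothetical stand-in data shaped like the mappings the notebook describes.
hashval_to_idx = {
    1001: {0},
    1002: {1, 2},
    1003: {2},
    1004: {2},
}
idx_to_ident = {0: "ident-A", 1: "ident-B", 2: "ident-C"}
ident_to_name = {
    "ident-A": "genome A",
    "ident-B": "genome B",
    "ident-C": "genome C",
}


def hashvals_for_idx(mapping, query_idx):
    """Collect every hash value whose idx set contains query_idx."""
    return {hashval for hashval, idx_set in mapping.items() if query_idx in idx_set}


def name_for_idx(query_idx):
    """Follow idx -> ident -> name, as the notebook cell does."""
    ident = idx_to_ident[query_idx]
    return ident_to_name[ident]


query_idx = 2
hashval_set = hashvals_for_idx(hashval_to_idx, query_idx)
print("{} hashvals belong to query idx {}".format(len(hashval_set), query_idx))
print("query idx {} matches to name {}".format(query_idx, name_for_idx(query_idx)))
```

Because the real attributes now carry a leading underscore, code outside the notebook that reads them directly may need updating if they change again.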
