v4.2.0
This release adds several significant features. First, we've added taxonomy command-line functionality for combining sourmash gather output with taxonomy databases. Second, we've added a new "picklist" feature that enables flexible selection of subsets of databases. Finally, we've added manifests to databases to support picklists as well as faster database loading and signature selection.
As of this release, we've also formally moved development over to the sourmash-bio organization on GitHub, and we've created a new gitter support channel, sourmash-bio/community. Please join us there if you have any questions, comments, or feature requests!
Major new features:
- add tax/taxonomy submodule (#1543, #1628, #1630, #1648)
- add picklists for subsetting databases and results (#1587, #1588, #1623, #1590, #1639)
- Add manifests to support fast Index.select(...) and lazy loading (#1590)
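As a rough illustration of how picklists and manifests fit together, here is a minimal conceptual sketch in plain Python. It does not use the sourmash API: the manifest columns, the picklist CSV, and the helper names `load_pick_values` and `select_rows` are all hypothetical, chosen only to show the idea of selecting signatures by metadata without loading them.

```python
import csv
import io

# Hypothetical manifest: one row of metadata per signature in a database.
# Column names are illustrative, not sourmash's actual manifest schema.
MANIFEST = [
    {"md5": "aaa111", "name": "genome A", "ksize": 31},
    {"md5": "bbb222", "name": "genome B", "ksize": 31},
    {"md5": "ccc333", "name": "genome C", "ksize": 21},
]

# Hypothetical picklist: a CSV with a column of values to select on.
PICKLIST_CSV = "md5,comment\naaa111,keep\nccc333,keep\n"


def load_pick_values(csv_text, column):
    """Collect the set of values found in one column of a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader}


def select_rows(manifest, column, pick_values):
    """Return manifest rows whose `column` value is in the picklist.

    Only metadata is consulted here, so signatures that are not
    picked would never need to be loaded from disk.
    """
    return [row for row in manifest if row[column] in pick_values]


picked = select_rows(MANIFEST, "md5", load_pick_values(PICKLIST_CSV, "md5"))
print([row["name"] for row in picked])
```

In this sketch the selection touches only the manifest rows, which is the property that makes lazy loading possible: the full signature data would be read only for the rows that survive the filter.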
Documentation updates:
- Add new GTDB databases description to docs and start legacy databases page (#1581)
- Change dib-lab/ URLs to new sourmash-bio/ URLs. (#1629)
- Add notice for sustainable open source study (#1580)
Minor new features:
- alias --nucleotide, --no-nucleotide for moltype args. (#1632)
- add signature names to known/unknown hash sigs output by sourmash prefetch (#1646)
Bug fixes and performance improvements:
- Speed up sourmash gather with prefetch by ignoring unidentifiable hashes (#1613)
- Check for MinHash compatibility in MinHash.intersection_and_union(...) (#1627)
- Fix selection w/abund and manifest column type conversions (#1645)
Refactoring and cleanup: