Releases: sourmash-bio/sourmash

v4.4.2

19 Jul 16:17
2054391

Minor fixes and performance improvements:

  • circumvent a very slow MinHash.remove_many(...) call in sourmash gather (#2123)

Developer updates:

  • substantial refactoring of CounterGather and related Index code. (#2116)
  • update Index protocol tests to include tests for peek and consume (#2111)
  • Bump pypa/cibuildwheel from 2.7.0 to 2.8.0 (#2118)
  • test insert after downsample for LCA_Database (#2117)
  • update release notes & pyproject.toml after v4.4.1 (#2114)

v4.4.1

09 Jul 12:55
@ctb
bdd7b55

Major new features:

  • less stringent size accuracy parameters for ANI accuracy reporting (#2074)
  • only skip dist est if containment/jaccard are 0 or 1 (#2060)
  • emit fewer warnings about potential ANI estimation issues (#2061)

Minor new features:

  • fix lca summarize to support general collections for queries (#2107)
  • add compare --avg-containment (#2056)
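The average containment added to compare can be pictured as the mean of the two directional containments between a pair of sketches. The following is a minimal sketch of that idea on plain Python sets of hashes; the function names are hypothetical and this is not sourmash's implementation.

```python
def containment(a: set, b: set) -> float:
    """Fraction of the hashes in `a` that are also present in `b`."""
    if not a:
        return 0.0
    return len(a & b) / len(a)


def avg_containment(a: set, b: set) -> float:
    """Mean of the two directional containments (symmetric in a and b)."""
    return (containment(a, b) + containment(b, a)) / 2


if __name__ == "__main__":
    small = {1, 2, 3, 4}
    large = {3, 4, 5, 6, 7, 8, 9, 10}
    print(containment(small, large), containment(large, small))
    print(avg_containment(small, large))
```

Unlike plain containment, the averaged value does not depend on which sketch is treated as the query, so it can fill a symmetric comparison matrix.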

Documentation updates:

  • fix search and gather docs (#2105)
  • fix CITATION.cff YAML and add a test for parseability and content. (#2103)

Developer updates:

  • move setup.cfg into pyproject.toml (#2097)
  • Fix downsample_scaled in core (#2108)
  • add picklist tests; support for allow_empty (#2106)
  • remove LazyLoadedIndex (#2104)
  • Bump web-sys from 0.3.57 to 0.3.58 (#2092)
  • Bump getrandom from 0.2.6 to 0.2.7 (#2090)
  • Bump wasm-bindgen-test from 0.3.30 to 0.3.31 (#2093)
  • Bump pypa/cibuildwheel from 2.6.1 to 2.7.0 (#2089)
  • Build: nix updates (#2088)
  • CI: split wheel building (#2087)
  • rust version bumps (#2086)
  • Update sphinx requirement from <5,>=4.4.0 to >=4.4.0,<6 (#2068)
  • Bump actions/setup-python from 3 to 4 (#2080)
  • Bump myst-parser from 0.17.2 to 0.18.0 (#2081)
  • Bump pypa/cibuildwheel from 2.5.0 to 2.6.1 (#2079)
  • remove unnecessary object from class definitions (#2077)

v4.4.0

13 May 16:33
@ctb
851dc2b

This release contains many new features! Of particular note:

  • sourmash now estimates and outputs average nucleotide identity (ANI) based on k-mer measures;
  • sourmash sketch translate is no longer unusably slow;
  • we provide Mac OS 'arm64' wheels for the new M1 Macs;
  • we've added a number of support features for managing large collections of signatures and building very large databases;
  • and we've added support for SQLite databases that can be used for storing and searching signatures and doing Kraken-style LCA analysis of genomes and metagenomes.

In addition, we have built updated Genbank genome databases (with contents from March 2022) as well as GTDB R07-RS207 databases; see the prepared databases page. We've also made some benchmarks available for these databases, so you can get some idea of the necessary computational resources for your searches.

Last but by no means least, we have begun providing a number of examples and recipes for using sourmash - see the new sourmash examples Web site!


Major new features:

  • add ANI output to search, prefetch, and gather (#1934, #1952, #1955, #1966, #1967, #2011, #2031, #2032)
  • new GTDB and Genbank database releases (#2013, #2038)
  • provide macos arm64 wheels (#1935)
  • support for SQLite databases (#1808)
  • implement sourmash sketch fromfile (#1884, #1885, #1886, #2009)
  • add sourmash sig check for comparing picklists and databases (#1907, #1915, #1917)
  • add sig collect command (#2036) for building standalone manifests from many databases
  • Add direct loading of manifest CSVs as sourmash indices (#1891)
  • add -A/--abundance-from to sig subtract & add sig inflate (#1889)
  • advanced database format documentation (#2025)
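The ANI values added to search, prefetch, and gather are estimated from k-mer measures. As a rough illustration only, the sketch below uses the common point-estimate relationship ANI ≈ C^(1/k) for a containment C at k-mer size k, plus the Mash-style conversion 2J/(1+J) for a Jaccard similarity J. The function names are hypothetical, and sourmash's own estimator may differ in detail.

```python
def containment_to_ani(containment: float, ksize: int) -> float:
    """Point estimate of ANI from k-mer containment: C ** (1 / k)."""
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be between 0 and 1")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    return containment ** (1.0 / ksize)


def jaccard_to_ani(jaccard: float, ksize: int) -> float:
    """Point estimate of ANI from Jaccard similarity (Mash-style conversion)."""
    if not 0.0 <= jaccard <= 1.0:
        raise ValueError("jaccard must be between 0 and 1")
    # Convert Jaccard to a containment-like value, then apply the same root.
    return containment_to_ani(2.0 * jaccard / (1.0 + jaccard), ksize)


if __name__ == "__main__":
    for c in (0.1, 0.5, 0.9):
        print(f"containment={c:.1f}  k=31  ANI~{containment_to_ani(c, 31):.4f}")
```

Because of the 1/k exponent, even a modest containment corresponds to a high estimated identity, which is why the estimate is only informative when the underlying containment or Jaccard value is itself reliable.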

Minor new features:

  • add -d/--debug to sourmash sig describe; upgrade output errors. (#1782)
  • add sum_hashes to sourmash sig describe output. (#1882)

Bug fixes:

  • catch TypeError in search w/abund vs flat at the command line (#1928)
  • speed up SeqToHashes translate (#1938, #1946)

Cleanup and documentation fixes:

  • better handle some pickfile errors (#1924)
  • remove unnecessary downsampling warnings (#1971)
  • use same wording for dayhoff/hp as for dna/protein (#1929)
  • rename covered_bp property to better reflect function (#2050)

Developer updates:

v4.3.0

11 Mar 19:28

New features:

  • add sourmash sig grep (#1864)
  • add sourmash sig summarize (#1837, #1863)
  • add --include-db-pattern and --exclude-db-pattern to many commands (#1871)
  • update lca summarize output to output total counts (#1838)

Bug fixes:

  • fix sourmash prefetch to work when db scaled is larger than query scaled (#1870)
  • fix sourmash prefetch for multiple ksizes in database (#1866)
  • allow missing columns in tax CSV files (#1869)
  • fix containment calculation for nodegraphs (#1862)
  • fix tax prepare SQL code for empty/blank taxonomic ranks (#1843)
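The prefetch fix for a database whose scaled value is larger than the query's touches on a general point: two scaled sketches can only be compared after both are downsampled to the larger scaled value. Below is a minimal sketch of that step on plain sets of 64-bit hashes. The helper names, the exact rounding of the threshold, and the strict `<` boundary are assumptions for illustration, not sourmash's code.

```python
HASH_SPACE = 2**64  # size of the 64-bit hash space


def max_hash_for_scaled(scaled: int) -> int:
    """Threshold below which hashes are kept for a given scaled value."""
    if scaled < 1:
        raise ValueError("scaled must be >= 1")
    return HASH_SPACE // scaled  # assumption: integer division, no rounding


def downsample(hashes: set, scaled: int) -> set:
    """Keep only the hashes that fall under the threshold for `scaled`."""
    limit = max_hash_for_scaled(scaled)
    return {h for h in hashes if h < limit}


def containment_at_common_scaled(query, query_scaled, db, db_scaled) -> float:
    """Containment of query in db after downsampling both to the larger scaled."""
    common = max(query_scaled, db_scaled)
    q = downsample(query, common)
    d = downsample(db, common)
    if not q:
        return 0.0
    return len(q & d) / len(q)


if __name__ == "__main__":
    limit_2000 = max_hash_for_scaled(2000)
    query = {10, limit_2000 - 1, limit_2000 + 5}  # sketched at scaled=1000
    db = {10, 20}                                 # sketched at scaled=2000
    print(containment_at_common_scaled(query, 1000, db, 2000))
```

Downsampling only ever discards hashes, so a sketch can move to a larger scaled value but not to a smaller one; the comparison therefore has to happen at the coarser of the two resolutions.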

Cleanup and documentation fixes:

  • clean up 'describe' a little bit, add a test (#1861)
  • add --output-dir as alias for every --outdir (#1817)
  • fix doc titles in command-line.md and update description a bit (#1874)

Developer updates:

  • move greyhound-core into sourmash (#1238)
  • drop Python 3.7, default most of CI to Python 3.10 (#1839)
  • reorganize traits for easier wasm and native compilation (#1836)
  • update asv to newly released version (#1834)
  • pin setuptools < 60 (#1879)

v4.2.4

05 Feb 05:19
e3804f2

Medium bug fixes:

  • fix bug where sourmash sketch ... --singleton -o output.sig drops signatures (#1810)
  • fix sourmash search --containment with two abund signatures (#1780)
  • fix plot/labels/CSV ordering with sourmash plot --csv (#1821)

Small bug fixes:

  • fix Index.search_abund downsampling and filename output (#1820)
  • check to make sure that .zip files exist before trying to load from them (#1777)
  • fix and test and refactor output information during signature creation (#1826)

Minor new functionality:

  • adjust text output of gather to indicate weighted/unweighted results (#1819)
  • update sourmash multigather to save hash abundances to .unassigned.sig (#1720)
  • re-inflate prefetch output sketches (#1827)

Cleanup and documentation fixes:

  • fix 'sketch' output info (#1794)
  • fix PMID for mock metagenome (#1811)
  • check to make sure that = is in param strings where necessary (#1775)

Developer updates:

  • set pickfile on SourmashPicklist.load (#1776)
  • Fix new clippy lints in beta (1.59, next stable) (#1791)
  • Rust updates (clippy, MSRV, CI, wasm-pack) (#1786)
  • disable the fix_lint component of the py38 tests in tox.ini (#1823)

v4.2.3

20 Dec 16:30
@ctb
73aeb15

Minor new features:

  • Save prefetch csv directly from prefetch-gather with --save-prefetch-csv (#1765)
  • Added brief descriptions and -h/--help text to sourmash gather, search, and compare (#1735)
  • Adding bounds checking for --scaled and --num in sourmash sketch (#1711)

Documentation updates:

  • update release notes with -m for git tag (#1754)
  • update coverage from 10x to 20x per description in documentation page (#1736)

Development updates:

  • Update tests to use runtmp fixture instead of utils.TempDirectory() (#1718)
  • Refactor ZipFileLinearCollection and SaveSignatures_ZipFile to use ZipStorage (#1598)
  • Clippy fixes for 1.57 beta (#1760)
  • CI: Update cibuildwheel usage (#1759)
  • Replace notify format usage with f-strings instead (#1723)
  • CI: Fix build errors with cbindgen (#1713)
  • Change sourmash compute to sourmash sketch in test files (#1712)
  • Update tests to use runtmp fixture instead of utils.TempDirectory()

v4.2.2

11 Aug 19:32
f7f2e83

Major new features:

  • added functionality to recover original k-mers given hashes - sourmash sig kmers et al. (#1653, #1695, #1701)
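Since hashing is one-way, recovering the original k-mers for a set of hashes generally means re-scanning candidate sequences, hashing every k-mer, and keeping those whose hash is in the target set. The toy sketch below illustrates that idea under loud assumptions: all function names are hypothetical, and a BLAKE2-based stand-in hash from the standard library is used, so the hash values will not match those in real sourmash signatures.

```python
import hashlib

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def toy_hash(kmer: str) -> int:
    """Stand-in 64-bit hash of the canonical (strand-independent) k-mer."""
    canonical = min(kmer, revcomp(kmer))
    digest = hashlib.blake2b(canonical.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def recover_kmers(sequence: str, ksize: int, target_hashes: set) -> dict:
    """Map each target hash found in `sequence` to the first k-mer producing it."""
    found = {}
    seq = sequence.upper()
    for i in range(len(seq) - ksize + 1):
        kmer = seq[i:i + ksize]
        h = toy_hash(kmer)
        if h in target_hashes:
            found.setdefault(h, kmer)
    return found


if __name__ == "__main__":
    seq = "ACGTTGCAAGGCT"
    targets = {toy_hash("GTTGC"), toy_hash("AAGGC")}
    for h, kmer in recover_kmers(seq, 5, targets).items():
        print(h, kmer)
```

Hashes with no matching k-mer in the scanned sequence are simply absent from the result, which is also how one would notice that the wrong sequences were supplied.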

Documentation updates:

  • Updated picklist docs (#1683)
  • Updated the 'how to release' doc after 4.2.0 release (#1649)

Minor new features:

  • Adjusted dayhoff and hp encodings to tolerate stop codons in the protein sequence (#1673)

Bug fixes and performance improvements:

  • Fixed panic bug in sourmash sketch dna with bad input and --check-sequence (#1702)

Refactoring and cleanup:

  • Changed sourmash compute to sourmash sketch in tests/test_sourmash.py (#1680, #1687)
  • Tested and fixed sourmash_args.load_many_signatures(...) and lca_db.load_single_database (#1684)

v4.2.1

16 Jul 23:24
96920c7

This is a bug-fix and performance release of sourmash.

There are no major new features.

git log --oneline v4.2.0..latest

Minor new features:

  • new picklist coltypes for directly using gather, prefetch, and manifest outputs without specifying column name (#1660)
  • add --from-file to sig cat (#1657)
  • implement a lazy/on-demand Index loading class to support low memory tracking of a large index (#1661)
  • add sourmash tax prepare to build SQLite taxonomy databases for use with tax commands (#1651)
  • Support manifests in MultiIndex (#1654)
  • tax summarization additions and fixes, including reporting bp and unclassified (#1667)
  • add --from-file, improved sig selection to most sig commands (#1672)

Bug fixes and performance improvements:

  • fix bug in gather when run with scaled=1 (#1670)

Documentation updates:

  • Add sourmash-bio/community Gitter badge to README (#1658)

Refactoring and cleanup:

  • add tests for sourmash tax --containment-threshold arg (#1666)
  • fix sourmash tax usage string (#1655)
  • add bounds checking for --scaled (#1650)

Rust interface:

  • Rust Core update (tag: r0.11.0) (#1643)

v4.2.0

01 Jul 13:21
@ctb
21f5e63

This release adds several significant features: first, we've added a set of taxonomy command-line functionality for combining sourmash gather output with taxonomy databases, and we've also added a new "picklist" feature that enables flexible selection of subsets of databases. Finally, we've added manifests to databases to support picklists as well as faster database loading and signature selection.

As of this release, we've also formally moved development over to the sourmash-bio organization on GitHub, and we've created a new gitter support channel, sourmash-bio/community. Please join us there if you have any questions, comments, or feature requests!

Major new features:

Documentation updates:

  • Add new GTDB databases description to docs and start legacy databases page (#1581)
  • Change dib-lab/ URLs to new sourmash-bio/ URLs. (#1629)
  • Add notice for sustainable open source study (#1580)

Minor new features:

  • alias --nucleotide, --no-nucleotide for moltype args. (#1632)
  • add signature names to known/unknown hash sigs output by sourmash prefetch (#1646)

Bug fixes and performance improvements:

  • Speed up sourmash gather with prefetch by ignoring unidentifiable hashes (#1613)
  • Check for MinHash compatibility in MinHash.intersection_and_union(...) (#1627)
  • Fix selection w/abund and manifest column type conversions (#1645)

Refactoring and cleanup:

  • Fix Rust 1.59 lints (#1600)
  • Minor cleanup in sourmash_args & sig submodules (#1586)
  • Minor cleanup in minhash module (#1585)
  • Fix needless borrows as suggested by clippy (#1636)

v4.1.2

07 Jun 21:27
@ctb
6b5806c

This is a bug-fix and performance release of sourmash.

There are no major new features.

Minor new features:

  • add query info to gather CSV output (#1565)

Bug fixes and performance improvements:

  • Improved MinHash.remove_many(...) performance by five orders of magnitude (#1571)
  • Fix SBT index saving bug that arbitrarily replaced names (but not content) of identical signatures in .sbt.zip files (#1568)
  • Empty zipfiles should not cause AssertionError (#1546)

Major refactoring and new internal functionality:

  • update MinHash.set_abundances to remove hash if 0 abund; handle negative abundances (#1575)

Refactoring and cleanup:

  • Fix tests that fail to close files that they open (#1550)
  • Add "&" and "|" as alternate syntax for MinHash intersection and merge (#1533)
  • Fix missing bracket in docs (#1566)
  • Updates for coverage tracking (#1558)
  • Provide a .copy() method for both SourmashSignature() and MinHash (#1551, #1570)
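The "&" and "|" syntax mentioned above maps naturally onto Python's operator-overloading hooks. The toy class below is a hypothetical stand-in, not sourmash's MinHash: it only shows how `__and__` and `__or__` can be aliased to intersection and merge methods on a sketch-like object backed by a set of hashes.

```python
class ToySketch:
    """Hypothetical sketch-like object holding an immutable set of hashes."""

    def __init__(self, hashes=()):
        self.hashes = frozenset(hashes)

    def intersection(self, other: "ToySketch") -> "ToySketch":
        """Return a new sketch with the hashes shared by both sketches."""
        return ToySketch(self.hashes & other.hashes)

    def merge(self, other: "ToySketch") -> "ToySketch":
        """Return a new sketch with the hashes from either sketch."""
        return ToySketch(self.hashes | other.hashes)

    # Operator aliases: `a & b` and `a | b` call the methods above.
    __and__ = intersection
    __or__ = merge

    def __len__(self) -> int:
        return len(self.hashes)


if __name__ == "__main__":
    a = ToySketch({1, 2, 3})
    b = ToySketch({3, 4})
    print(sorted((a & b).hashes), sorted((a | b).hashes))
```

In this toy version both operators return new objects and leave their operands unchanged; whether a real library's merge works in place is a separate design choice.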