v0.16.0
What's Changed
Breaking Changes 🛠
- feat!: simplify take row api by @eddyxu in #2664
- feat!: distinguishable scalar index types by @eddyxu in #2671
New Features 🎉
- feat: standalone vector transform stage by @westonpace in #2566
- feat: coalesce scheduling of reads to speed up random access by @raunaks13 in #2636
- feat: add version tags by @dsgibbons in #2482
- feat: object store registry for custom object store providers by @maxburke in #2513
- feat: merge_insert update subcolumns by @wjones127 in #2639
- feat: implement disk-based inverted index by @BubbleCal in #2643
- feat: allow round-tripping of dictionary data through the v2 format by @westonpace in #2656
- feat: support transforming selected fragments in vector transform stage for ivf_pq index by @raunaks13 in #2657
- feat: expand tags api by @dsgibbons in #2679
- feat: add standalone shuffle for transformed ivf-pq vectors file by @raunaks13 in #2670
- feat: support bitpacking for signed types by @albertlockett in #2662
- feat: support loading Hugging Face image datasets and converting images to PIL by @eddyxu in #2684
- feat: return BM25 scores for FTS by @BubbleCal in #2687
- feat: add data file format / version information to manifest by @westonpace in #2673
- feat: add support for the null data type to v2 by @westonpace in #2685
- feat: add backpressure to v2 I/O scheduler by @westonpace in #2683
Bug Fixes 🐛
- fix: correctly encode a list type when all items are empty by @westonpace in #2653
- fix: improve error message when PQ can't be trained because the dataset is too small by @albertlockett in #2644
- fix: slight cleanups to path handling so that the indices builder tool properly supports Windows by @westonpace in #2689
Documentation 📚
- docs: schema evolution by @wjones127 in #1911
- docs: reorganize the scalar index Python docstring to make the index type clear by @eddyxu in #2678
Performance Improvements 🚀
- perf: add v2 fragment file metadata to the FileMetadataCache by @jiachengdb in #2647
- perf: add random take benchmark by @chebbyChefNEQ in #2654
- perf: benchmark lance vs parquet read time, write time, and compressed size by @raunaks13 in #2383
Other Changes
- refactor: new buffer abstractions in decoders by @westonpace in #2648
New Contributors
- @dsgibbons made their first contribution in #2482
- @maxburke made their first contribution in #2513
Full Changelog: v0.15.0...v0.16.0