This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit 43cf8a1

committed Mar 13, 2025
Merge branch 'main' of https://github.com/lancedb/lance into add-avg-loss
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2 parents 52b27c7 + 15420d5 commit 43cf8a1


51 files changed, +3063 -1606 lines changed
 

‎Cargo.lock

+39 -25
Some generated files are not rendered by default.

‎README.md

+13 -8
@@ -3,13 +3,13 @@
 
 <img width="257" alt="Lance Logo" src="https://user-images.githubusercontent.com/917119/199353423-d3e202f7-0269-411d-8ff2-e747e419e492.png">
 
-**Modern columnar data format for ML. Convert from Parquet in 2-lines of code for 100x faster random access, a vector index, data versioning, and more.<br/>**
-**Compatible with pandas, DuckDB, Polars, and pyarrow with more integrations on the way.**
+**Modern columnar data format for ML. Convert from Parquet in 2-lines of code for 100x faster random access, zero-cost schema evolution, rich secondary indices, versioning, and more.<br/>**
+**Compatible with Pandas, DuckDB, Polars, Pyarrow, and Ray with more integrations on the way.**
 
 <a href="https://lancedb.github.io/lance/">Documentation</a> •
 <a href="https://blog.lancedb.com/">Blog</a> •
 <a href="https://discord.gg/zMM32dvNtd">Discord</a> •
-<a href="https://twitter.com/lancedb">Twitter</a>
+<a href="https://x.com/lancedb">X</a>
 
 [CI]: https://github.com/lancedb/lance/actions/workflows/rust.yml
 [CI Badge]: https://github.com/lancedb/lance/actions/workflows/rust.yml/badge.svg
@@ -44,7 +44,7 @@ The key features of Lance include:
 
 * **Zero-copy, automatic versioning:** manage versions of your data without needing extra infrastructure.
 
-* **Ecosystem integrations:** Apache Arrow, Pandas, Polars, DuckDB and more on the way.
+* **Ecosystem integrations:** Apache Arrow, Pandas, Polars, DuckDB, Ray, Spark and more on the way.
 
 > [!TIP]
 > Lance is in active development and we welcome contributions. Please see our [contributing guide](docs/contributing.rst) for more information.
@@ -66,7 +66,7 @@ pip install --pre --extra-index-url https://pypi.fury.io/lancedb/ pylance
 > [!TIP]
 > Preview releases are released more often than full releases and contain the
 > latest features and bug fixes. They receive the same level of testing as full releases.
-> We guarantee they will remain published and available for download for at
+> We guarantee they will remain published and available for download for at
 > least 6 months. When you want to pin to a specific version, prefer a stable release.
 
 **Converting to Lance**
@@ -186,8 +186,8 @@ Support both CPUs (``x86_64`` and ``arm``) and GPU (``Nvidia (cuda)`` and ``Appl
 
 **Fast updates** (ROADMAP): Updates will be supported via write-ahead logs.
 
-**Rich secondary indices** (ROADMAP):
-- Inverted index for fuzzy search over many label / annotation fields.
+**Rich secondary indices**: Support `BTree`, `Bitmap`, `Full text search`, `Label list`,
+`NGrams`, and more.
 
 ## Benchmarks
 
@@ -253,11 +253,16 @@ A comparison of different data formats in each stage of ML development cycle.
 
 Lance is currently used in production by:
 * [LanceDB](https://github.com/lancedb/lancedb), a serverless, low-latency vector database for ML applications
+* [LanceDB Enterprise](https://docs.lancedb.com/enterprise/introduction), hyperscale LanceDB with enterprise SLA.
+* Leading multimodal Gen AI companies for training over petabyte-scale multimodal data.
 * Self-driving car company for large-scale storage, retrieval and processing of multi-modal data.
 * E-commerce company for billion-scale+ vector personalized search.
 * and more.
 
-## Presentations and Talks
+## Presentations, Blogs and Talks
 
+* [Designing a Table Format for ML Workloads](https://blog.lancedb.com/designing-a-table-format-for-ml-workloads/), Feb 2025.
+* [Transforming Multimodal Data Management with LanceDB, Ray Summit](https://www.youtube.com/watch?v=xmTFEzAh8ho), Oct 2024.
+* [Lance v2: A columnar container format for modern data](https://blog.lancedb.com/lance-v2/), Apr 2024.
 * [Lance Deep Dive](https://drive.google.com/file/d/1Orh9rK0Mpj9zN_gnQF1eJJFpAc6lStGm/view?usp=drive_link). July 2023.
 * [Lance: A New Columnar Data Format](https://docs.google.com/presentation/d/1a4nAiQAkPDBtOfXFpPg7lbeDAxcNDVKgoUkw3cUs2rE/edit#slide=id.p), [Scipy 2022, Austin, TX](https://www.scipy2022.scipy.org/posters). July, 2022.
