Commit b66f34e

eddyxu and wjones127 authored
docs: add example of Dataset.insert (#3534)
* Generate API docs automatically from plugin
* Add example of `LanceDataset.insert()` and `write_dataset`

5/N of #2423

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
1 parent 422c38d commit b66f34e

3 files changed, +56 -17 lines changed


docs/api/api.rst

+3 -2
@@ -2,6 +2,7 @@ APIs
 ----
 
 .. toctree::
+    :maxdepth: 1
 
-    Rust <https://docs.rs/crate/lance/latest>
-    Python <./python.rst>
+    Rust <https://docs.rs/crate/lance/latest>
+    Python <./python.rst>

docs/conf.py

+15 -14
@@ -1,18 +1,5 @@
 # Configuration file for the Sphinx documentation builder.
 
-import shutil
-
-
-def run_apidoc(_):
-    from sphinx.ext.apidoc import main
-
-    shutil.rmtree("api/python", ignore_errors=True)
-    main(["-f", "-o", "api/python", "../python/python/lance"])
-
-
-def setup(app):
-    app.connect("builder-inited", run_apidoc)
-
 
 # -- Project information -----------------------------------------------------
 
@@ -29,6 +16,7 @@ def setup(app):
 extensions = [
     "breathe",
     "sphinx_immaterial",
+    "sphinx_immaterial.apidoc.python.apigen",
     "sphinx.ext.autodoc",
     "sphinx.ext.doctest",
     "sphinx.ext.githubpages",
@@ -58,6 +46,19 @@ def setup(app):
     "ray": ("https://docs.ray.io/en/latest/", None),
 }
 
+python_apigen_modules = {
+    "lance": "api/python/",
+}
+object_description_options = [
+    (
+        "py:.*",
+        dict(
+            include_object_type_in_xref_tooltip=False,
+            include_in_toc=False,
+            include_fields_in_toc=False,
+        ),
+    ),
+]
 
 # -- Options for HTML output -------------------------------------------------
 
@@ -96,7 +97,7 @@ def setup(app):
         },
     ],
 }
-include_in_toc = False
+
 
 # -- doctest configuration ---------------------------------------------------
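The `object_description_options` entry added above pairs a pattern with a dict of options. The following is a rough sketch of how such a list could be resolved. It assumes, based on the sphinx-immaterial documentation rather than anything in this commit, that each pattern is a regular expression matched in full against a `domain:objtype` string and that later matching entries override earlier ones. `resolve_options` is a hypothetical helper for illustration, not a Sphinx or sphinx-immaterial API.

```python
import re

# Mirrors the list added to docs/conf.py in this commit.
object_description_options = [
    (
        "py:.*",
        dict(
            include_object_type_in_xref_tooltip=False,
            include_in_toc=False,
            include_fields_in_toc=False,
        ),
    ),
]


def resolve_options(object_type, option_list):
    """Merge the option dicts of every pattern that fully matches object_type.

    Hypothetical helper for illustration only; the full-match and
    later-entries-win rules are assumptions about the plugin's behavior.
    """
    merged = {}
    for pattern, options in option_list:
        if re.fullmatch(pattern, object_type):
            merged.update(options)
    return merged


# Python objects match "py:.*"; objects from other domains do not.
py_method = resolve_options("py:method", object_description_options)
cpp_function = resolve_options("cpp:function", object_description_options)
```

Under these assumptions, every Python object type (`py:class`, `py:method`, and so on) picks up the three flags, while objects from other domains get no overrides from this entry.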

docs/introduction/read_and_write.rst

+38 -1
@@ -50,6 +50,43 @@ You will need to provide a :py:class:`pyarrow.Schema` for the dataset in this ca
 :py:meth:`lance.write_dataset` supports writing :py:class:`pyarrow.Table`, :py:class:`pandas.DataFrame`,
 :py:class:`pyarrow.dataset.Dataset`, and ``Iterator[pyarrow.RecordBatch]``.
 
+Adding Rows
+-----------
+
+To insert data into your dataset, you can use either :py:meth:`LanceDataset.insert <lance.LanceDataset.insert>`
+or :py:meth:`~lance.write_dataset` with ``mode=append``.
+
+.. testsetup::
+
+    shutil.rmtree("./insert_example.lance", ignore_errors=True)
+
+.. doctest::
+
+    >>> import lance
+    >>> import pyarrow as pa
+
+    >>> table = pa.Table.from_pylist([{"name": "Alice", "age": 20},
+    ...                               {"name": "Bob", "age": 30}])
+    >>> ds = lance.write_dataset(table, "./insert_example.lance")
+
+    >>> new_table = pa.Table.from_pylist([{"name": "Carla", "age": 37}])
+    >>> ds.insert(new_table)
+    >>> ds.to_table().to_pandas()
+        name  age
+    0  Alice   20
+    1    Bob   30
+    2  Carla   37
+
+    >>> new_table2 = pa.Table.from_pylist([{"name": "David", "age": 42}])
+    >>> ds = lance.write_dataset(new_table2, ds, mode="append")
+    >>> ds.to_table().to_pandas()
+        name  age
+    0  Alice   20
+    1    Bob   30
+    2  Carla   37
+    3  David   42
+
+
 Deleting rows
 -------------
 
@@ -123,7 +160,7 @@ more efficient to use the merge insert operation described below.
         dataset.update({"age": new_age}, where=f"name='{name}'")
 
 Merge Insert
-~~~~~~~~~~~~
+------------
 
 Lance supports a merge insert operation. This can be used to add new data in bulk
 while also (potentially) matching against existing data. This operation can be used
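The `testsetup` block in the `Adding Rows` hunk removes any leftover `./insert_example.lance` directory before the doctest runs, presumably so that the first `write_dataset` call always starts from an empty path. Below is a minimal stdlib-only sketch of that cleanup pattern. The helper name `reset_example_dir` and the temporary directory are hypothetical and not part of the commit.

```python
import shutil
import tempfile
from pathlib import Path


def reset_example_dir(path):
    """Delete path if it exists; do nothing if it does not."""
    # ignore_errors=True suppresses the FileNotFoundError that rmtree
    # would otherwise raise on the first run, when nothing exists yet.
    shutil.rmtree(path, ignore_errors=True)


# Hypothetical stand-in for "./insert_example.lance".
example_dir = Path(tempfile.mkdtemp()) / "insert_example.lance"

# First run: the path is absent, and no error is raised.
reset_example_dir(example_dir)

# Simulate stale output left behind by a previous doctest run.
example_dir.mkdir()
(example_dir / "leftover.txt").write_text("stale data from a previous run")

# Later run: the stale directory is removed before the example starts.
reset_example_dir(example_dir)
still_exists = example_dir.exists()
```

Because the cleanup succeeds whether or not the directory exists, the doctest can be rerun without the earlier run's rows showing up in the expected output.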
