diff --git a/docs/source/about.md b/docs/source/about.md
index 8c71b05e..935f9240 100644
--- a/docs/source/about.md
+++ b/docs/source/about.md
@@ -22,7 +22,7 @@ for working with OGR vector data sources. It is **awesome**, has highly-dedicate
 maintainers and contributors, and exposes more functionality than Pyogrio ever
 will. This project would not be possible without Fiona having come first.
 
-Pyogrio uses a vectorized (array-oriented) approach for reading and writing
+Pyogrio uses a bulk-oriented approach for reading and writing
 spatial vector file formats, which enables faster I/O operations. It borrows
 from the internal mechanics and lessons learned of Fiona. It uses a stateless
 approach to reading or writing data; all data are read or written in a single
diff --git a/docs/source/index.md b/docs/source/index.md
index 02a81af2..bc008d6f 100644
--- a/docs/source/index.md
+++ b/docs/source/index.md
@@ -1,32 +1,25 @@
-# pyogrio - Vectorized spatial vector file format I/O using GDAL/OGR
-
-Pyogrio provides a
-[GeoPandas](https://github.com/geopandas/geopandas)-oriented API to OGR vector
-data sources, such as ESRI Shapefile, GeoPackage, and GeoJSON. Vector data sources
-have geometries, such as points, lines, or polygons, and associated records
-with potentially many columns worth of data.
-
-Pyogrio uses a vectorized approach for reading and writing GeoDataFrames to and
-from OGR vector data sources in order to give you faster interoperability. It
-uses pre-compiled bindings for GDAL/OGR so that the performance is primarily
-limited by the underlying I/O speed of data source drivers in GDAL/OGR rather
-than multiple steps of converting to and from Python data types within Python.
+# pyogrio - bulk-oriented spatial vector file I/O using GDAL/OGR
+
+Pyogrio provides fast, bulk-oriented read and write access to
+[GDAL/OGR](https://gdal.org/en/latest/drivers/vector/index.html) vector data
+sources, such as ESRI Shapefile, GeoPackage, GeoJSON, and several others.
+Vector data sources typically have geometries, such as points, lines, or
+polygons, and associated records with potentially many columns worth of data.
+
+The typical use is to read or write these data sources to/from
+[GeoPandas](https://github.com/geopandas/geopandas) `GeoDataFrames`. Because
+the geometry column is optional, reading or writing only non-spatial data is
+also possible. Hence, GeoPackage attribute tables, DBF files, or CSV files are
+also supported.
+
+Pyogrio is fast because it uses pre-compiled bindings for GDAL/OGR to read and
+write the data records in bulk. This approach avoids multiple steps of
+converting to and from Python data types within Python, so performance becomes
+primarily limited by the underlying I/O speed of data source drivers in
+GDAL/OGR.
 
 We have seen \>5-10x speedups reading files and \>5-20x speedups writing files
-compared to using non-vectorized approaches (Fiona and current I/O support in
-GeoPandas).
-
-You can read these data sources into
-`GeoDataFrames`, read just the non-geometry columns into Pandas `DataFrames`,
-or even read non-spatial data sources that exist alongside vector data sources,
-such as tables in a ESRI File Geodatabase, or antiquated DBF files.
-
-Pyogrio also enables you to write `GeoDataFrames` to at least a few different
-OGR vector data source formats.
-
-```{warning}
-Pyogrio is still at an early version and the API is subject to substantial change.
-```
+compared to using row-per-row approaches (e.g. Fiona).
 
 ```{toctree}
 ---