diff --git a/README.md b/README.md
index 8b9f8868..9e5e9708 100644
--- a/README.md
+++ b/README.md
@@ -1,35 +1,29 @@
-# pyogrio - Vectorized spatial vector file format I/O using GDAL/OGR
-
-Pyogrio provides a
-[GeoPandas](https://github.com/geopandas/geopandas)-oriented API to OGR vector
-data sources, such as ESRI Shapefile, GeoPackage, and GeoJSON. Vector data sources
-have geometries, such as points, lines, or polygons, and associated records
-with potentially many columns worth of data.
-
-Pyogrio uses a vectorized approach for reading and writing GeoDataFrames to and
-from OGR vector data sources in order to give you faster interoperability. It
-uses pre-compiled bindings for GDAL/OGR so that the performance is primarily
-limited by the underlying I/O speed of data source drivers in GDAL/OGR rather
-than multiple steps of converting to and from Python data types within Python.
+# pyogrio - bulk-oriented spatial vector file I/O using GDAL/OGR
+
+Pyogrio provides fast, bulk-oriented read and write access to
+[GDAL/OGR](https://gdal.org/en/latest/drivers/vector/index.html) vector data
+sources, such as ESRI Shapefile, GeoPackage, GeoJSON, and several others.
+Vector data sources typically have geometries, such as points, lines, or
+polygons, and associated records with potentially many columns worth of data.
+
+The typical use is to read or write these data sources to/from
+[GeoPandas](https://github.com/geopandas/geopandas) `GeoDataFrames`. Because
+the geometry column is optional, reading or writing only non-spatial data is
+also possible. Hence, GeoPackage attribute tables, DBF files, or CSV files are
+also supported.
+
+Pyogrio is fast because it uses pre-compiled bindings for GDAL/OGR to read and
+write the data records in bulk. This approach avoids multiple steps of
+converting to and from Python data types within Python, so performance becomes
+primarily limited by the underlying I/O speed of data source drivers in
+GDAL/OGR.
 
 We have seen \>5-10x speedups reading files and \>5-20x speedups writing files
-compared to using non-vectorized approaches (Fiona and current I/O support in
-GeoPandas).
-
-You can read these data sources into
-`GeoDataFrames`, read just the non-geometry columns into Pandas `DataFrames`,
-or even read non-spatial data sources that exist alongside vector data sources,
-such as tables in a ESRI File Geodatabase, or antiquated DBF files.
-
-Pyogrio also enables you to write `GeoDataFrames` to at least a few different
-OGR vector data source formats.
+compared to using row-per-row approaches (e.g. Fiona).
 
 Read the documentation for more information:
 [https://pyogrio.readthedocs.io](https://pyogrio.readthedocs.io/en/latest/).
 
-WARNING: Pyogrio is still at an early version and the API is subject to
-substantial change. Please see [CHANGES](CHANGES.md).
-
 ## Requirements
 
 Supports Python 3.9 - 3.13 and GDAL 3.4.x - 3.9.x.
@@ -52,9 +46,9 @@ for more information.
 
 ## Supported vector formats
 
-Pyogrio supports some of the most common vector data source formats (provided
-they are also supported by GDAL/OGR), including ESRI Shapefile, GeoPackage,
-GeoJSON, and FlatGeobuf.
+Pyogrio supports most common vector data source formats (provided they are also
+supported by GDAL/OGR), including ESRI Shapefile, GeoPackage, GeoJSON, and
+FlatGeobuf.
 
 Please see the [list of supported formats](https://pyogrio.readthedocs.io/en/latest/supported_formats.html)
 for more information.
@@ -64,7 +58,7 @@ for more information.
 Please read the [introduction](https://pyogrio.readthedocs.io/en/latest/supported_formats.html)
 for more information and examples to get started using Pyogrio.
 
-You can also check out the the [API documentation](https://pyogrio.readthedocs.io/en/latest/api.html)
+You can also check out the [API documentation](https://pyogrio.readthedocs.io/en/latest/api.html)
 for full details on using the API.
 
 ## Credits