[Python] Cannot convert pd.DataFrame with geometry cells to pa.Table #29844

Open
asfimport opened this issue Oct 8, 2021 · 3 comments

@asfimport

Example: 

import geopandas as gpd
import pandas as pd
import pyarrow as pa


path = gpd.datasets.get_path("naturalearth_lowres")
data = gpd.read_file(path)
df = pd.DataFrame(data)
table = pa.Table.from_pandas(df)
print(table)

Throws the following error:

Traceback (most recent call last):
  File "/Users/Henrikh/Desktop/tmp.py", line 8, in <module>
    table = pa.Table.from_pandas(df)
  File "pyarrow/table.pxi", line 1553, in pyarrow.lib.Table.from_pandas
  File "/usr/local/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 594, in dataframe_to_arrays
    arrays = [convert_column(c, f)
  File "/usr/local/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 594, in <listcomp>
    arrays = [convert_column(c, f)
  File "/usr/local/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 581, in convert_column
    raise e
  File "/usr/local/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 575, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
  File "pyarrow/array.pxi", line 302, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array
  File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type
  File "pyarrow/error.pxi", line 120, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column geometry with type geometry')

Reporter: Henrikh Kantuni
Watchers: Rok Mihevc / @rok

Note: This issue was originally created as ARROW-14267. Please see the migration documentation for further details.

@asfimport

Rok Mihevc / @rok:
I believe the geometry column is backed by a pandas extension array (https://jorisvandenbossche.github.io/blog/2019/08/13/geopandas-extension-array-refactor/), which currently cannot be converted automatically to an Arrow extension array. I could be wrong, though; @jorisvandenbossche will definitely know more.

@fgenoese

This issue seems to be connected:
streamlit/streamlit#1002 (comment)

Basically, when loading a GeoJSON file with geopandas, the shape fails to render in Altair / Streamlit:

pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column geometry with type geometry')

Here is a demo with code:
https://share.streamlit.io/fgenoese/st_bugreports/main/geopandas_map.py

Is this a pyarrow or geopandas issue?

@VaasuDevanS

Any update / workaround on this issue?
