-
Notifications
You must be signed in to change notification settings - Fork 794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't call len on DataFrame interchange protocol object #3111
Conversation
Thanks Jon! We're getting closer. Running on this PR I now see no error with import altair as alt
import ibis
import polars as pl
data = ibis.memtable(
{'x': ['A', 'B', 'C', 'D', 'E'],
'y': [5, 3, 6, 7, 2]}
)
chart = alt.Chart(data).mark_bar().encode(
x='x:O',
y='y:Q',
)
chart.save("chart.html") Output: Traceback (most recent call last):
File "/home/jcristharif/Code/altair/test.py", line 15, in <module>
chart.save("chart.html")
File "/home/jcristharif/Code/altair/altair/vegalite/v5/api.py", line 1066, in save
result = save(**kwds)
^^^^^^^^^^^^
File "/home/jcristharif/Code/altair/altair/utils/save.py", line 189, in save
perform_save()
File "/home/jcristharif/Code/altair/altair/utils/save.py", line 127, in perform_save
spec = chart.to_dict(context={"pre_transform": False})
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jcristharif/Code/altair/altair/vegalite/v5/api.py", line 2677, in to_dict
return super().to_dict(
^^^^^^^^^^^^^^^^
File "/home/jcristharif/Code/altair/altair/vegalite/v5/api.py", line 892, in to_dict
copy.data = _prepare_data(original_data, context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jcristharif/Code/altair/altair/vegalite/v5/api.py", line 114, in _prepare_data
data = _pipe(data, data_transformers.get())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jcristharif/miniconda3/envs/altair/lib/python3.11/site-packages/toolz/functoolz.py", line 628, in pipe
data = func(data)
^^^^^^^^^^
File "/home/jcristharif/miniconda3/envs/altair/lib/python3.11/site-packages/toolz/functoolz.py", line 304, in __call__
return self._partial(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jcristharif/Code/altair/altair/vegalite/data.py", line 23, in default_data_transformer
return curried.pipe(data, limit_rows(max_rows=max_rows), to_values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jcristharif/miniconda3/envs/altair/lib/python3.11/site-packages/toolz/functoolz.py", line 628, in pipe
data = func(data)
^^^^^^^^^^
File "/home/jcristharif/miniconda3/envs/altair/lib/python3.11/site-packages/toolz/functoolz.py", line 304, in __call__
return self._partial(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jcristharif/Code/altair/altair/utils/data.py", line 111, in limit_rows
if pa_table.num_rows > max_rows:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'int' and 'NoneType'
|
Tested as follows: import altair as alt
import pyarrow as pa
import polars as pl
import pandas as pd
import vaex
import ibis
def chart(source, title):
return (
alt.Chart(source, height=20, title=title)
.mark_bar()
.encode(x="x:O", y="y:Q")
)
data = {"x": [1, 2, 3], "y": [4, 5, 6]}
pa_table = pa.table(data)
df_polars = pl.DataFrame(data)
df_pandas = pd.DataFrame(data)
df_vaex = vaex.from_pandas(df_pandas)
df_ibis = ibis.memtable(data)
dataframes = {
"pyarrow": pa_table,
"polars": df_polars,
"pandas": df_pandas,
"vaex": df_vaex,
"ibis": df_ibis
}
chart = alt.hconcat(*[chart(dataframes[df], df) for df in dataframes])
# raise MaxRowsError
with alt.data_transformers.enable("default", max_rows=2):
try:
chart.to_dict()
except alt.MaxRowsError as e:
print(e) The number of rows in your dataset is greater than the maximum allowed (2).
Try enabling the VegaFusion data transformer which raises this limit by pre-evaluating data
transformations in Python.
>> import altair as alt
>> alt.data_transformers.enable("vegafusion")
Or, see https://altair-viz.github.io/user_guide/large_datasets.html for additional information
on how to plot large datasets. # render chart with standard `max_rows` (5000) from different sources
chart |
Good catch @jcrist! Something is still missing |
Related vega/vl-convert#31 (comment) and #2724 |
Thanks for checking, I missed a None check. Fixed in 3615dae. For context, when saving to a file we disable the |
Checked again and tests from above work correctly. Thanks @jonmmease for this PR, thanks @jcrist for the review! |
See #3110.
We can't call
len(data)
whendata
is a DataFrame interchange protocol object. It happens to work for PyArrow and Polars, which is why we didn't catch it before, but that's not part of the standard.