Optional switch to non-nullable dtypes in to_dataframe #1345
Comments
To make things worse, not all widely used machine learning libraries can handle pandas extension dtypes. I just found out that training a lightgbm Regressor is not possible with such input.
Thank you so much for the feedback! I hope that the nullable dtypes can stay as the default, as they more accurately reflect the BigQuery data types, but you make some good points that there are still use cases where the old behavior would be desirable.
Related: #954 Support for string[pyarrow] dtype (edit: fixed issue number) I'm thinking we add Alternative 1: expose typemapper directly, but doubtful that a lambda function that inspects an arrow schema would be all that understandable to the average pandas user. Alternative 2: expose |
With #1529, you'll be able to explicitly set int_dtype to |
The changes have been merged.
Release 3.0.0 introduced the use of nullable Int64 and boolean dtypes (pandas extension dtypes) (#786).
However, the pandas extension dtypes are not widely supported across the pandas API. This can lead to issues, for example when using numba in pandas operations.
Please see my bug report in the pandas project for an example:
pandas-dev/pandas#46867
I propose introducing an optional switch to restore the pre-3.0.0 behaviour.
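Until such a switch exists, a client-side workaround along the following lines might help. This is only a minimal sketch using plain pandas: `to_numpy_dtypes` is a hypothetical helper name, not part of google-cloud-bigquery or pandas, and it only approximates the old behaviour (Int64 columns with missing values become float64, boolean columns with missing values become object).

```python
import numpy as np
import pandas as pd


def to_numpy_dtypes(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: convert nullable Int64/boolean columns to NumPy dtypes."""
    out = df.copy()
    for name, dtype in df.dtypes.items():
        has_na = df[name].isna().any()
        if isinstance(dtype, pd.Int64Dtype):
            # NumPy int64 cannot hold missing values, so fall back to float64 (NaN).
            out[name] = df[name].astype("float64" if has_na else "int64")
        elif isinstance(dtype, pd.BooleanDtype):
            # NumPy bool cannot hold missing values, so fall back to object.
            out[name] = df[name].astype("object" if has_na else "bool")
    return out


# Example frame standing in for the result of to_dataframe().
example = pd.DataFrame(
    {
        "ints_with_null": pd.array([1, 2, None], dtype="Int64"),
        "ints": pd.array([1, 2, 3], dtype="Int64"),
        "flags": pd.array([True, False, True], dtype="boolean"),
    }
)
converted = to_numpy_dtypes(example)
print(converted.dtypes)
```

The drawback is that the conversion has to be repeated after every query, which is why a switch in to_dataframe itself would be preferable.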