Estimators fit with dataframes cause UserWarnings on scikit-learn 1.0 #858
Comments
Thanks for the report. Since we don't really have much maintenance bandwidth here, I think we should drop support for scikit-learn<1.0, get the tests passing with scikit-learn 1.0, and get a release out. I should have a bit of time to work on this in ~2 weeks.
Hi Tom - I could probably spend some time working on this, but wanted to check first. Do you think the general approach of dropping array coercion is correct?
Yep. Anywhere scikit-learn preserves dataframes, we should too. We have a few places where we explicitly preserve dataframes where scikit-learn didn't (search
Done in #863. I'm going to make a release now. 2021.10.17 is up on PyPI now if you want to try.
Thanks Tom!
(Sent by email in reply to Tom Augspurger's message of Saturday, October 16, 2021, 12:49:59 PM: "Done in #863. I'm going to make a release now.")
What happened:
Test failures when fitting sklearn estimators with dataframes. As of scikit-learn=1.0, all estimators store feature_names_in_ when fitted on dataframes, and column name consistency checks issue a FutureWarning when column names are not consistent with the X columns used to fit. dask-ml's pytest configuration fails any test that emits an sklearn warning.
What you expected to happen:
Tests to pass with scikit-learn=1.0, and dask-ml should be updated to handle dataframes consistently with scikit-learn>=1.0.
Minimal Complete Verifiable Example:
From tests/test_partial.py, one of the failing tests.
Should result in the following
Anything else we need to know?:
In this case, the problem comes from the way dataframes are coerced to arrays in partial.predict. In scikit-learn=1.0, the model is expecting X to be a dataframe.
Environment: