-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REGR: Restore _constructor_from_mgr to pass manager object to constructor #54922
REGR: Restore _constructor_from_mgr to pass manager object to constructor #54922
Conversation
The motivation is that geopandas does something involving inference on object dtypes differently if it gets a Manager object? Can you elaborate on that? |
Bumping off the milestone, assuming the release happens tomorrow. |
CI is failing, bumping off the release. |
@lithomas1 if you bump something (at least for regressions), can you please bump it to the next bug-fix milestone (although I assume in general for PRs that's not a problem, as long as you bump issues to the next milestone instead of removing the milestone. So I should have opened an issue for this one ;)) |
@jbrockmendel yes, whenever we get a manager, we don't do any inference, while if we get object dtype array, we do some inference (and then potentially fallback to DataFrame instead of GeoDataFrame in case there are no geometries). See also my geopandas PR related to this: geopandas/geopandas#3046 But that's geopandas specific (and something we can handle by overriding the constructor from mgr), while this PR actually addresses a more general issue: with your change in #53871, creating a Series or DataFrame object with Essentially, this is the same as #55607, where I think, when passing (subclassed) DataFrame/Series objects to the public constructor ( I added a generic test that demonstrates this ( |
I am adding a bunch of filterwarnings here, while of course we should fix those cases properly. But I decided to go with the filtering here, because firstly those warnings don't yet exist on 2.1.x and so properly fixing them isn't need to backport. And secondly, we then also have a bit more time to figure out the best way we want to have subclasses deal with this (eg Brock's comment at https://github.com/geopandas/geopandas/pull/3046/files#r1371974580 might mean we want to tweak a bit the way that subclasses interact with this). I am also not sure whether we actually want to let all subclasses deal with this, instead of rather doing it as a breaking change in eg 3.0 (or let subclasses handle a mgr in their main constructor) |
To illustrate the issue with our own Series:
So our Series |
no objection to getting this in for 2.1.x, can figure out a longer-term plan later Longer-term, I think the "._from_mgr result is not fully-initialized" issue can be addressed by 1) pinning _name in Series._from_mgr, 2) telling subclasses not to use _from_mgr directly, and 3) replacing usages of self._from_mgr with hard-coded Series/DataFrame._from_mgr. |
Owee, I'm MrMeeseeks, Look at me. There seem to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon! Remember to remove the If these instructions are inaccurate, feel free to suggest an improvement. |
…pass manager object to constructor
Manual backport -> #55704 |
See #53871 (comment) for context. This reverts the refactoring changes of the
_constructor_.._from_mgr
methods of #53871 to the state of the original PR #52132 that added those new methods.This does keep the other changes of that PR (i.e. using
_constructor_from_mgr
in more places). @jbrockmendel mentioned he got some AttributeErrors related to_name
not yet being defined, but at the moment the tests seem to pass anyway.Closes #55120