-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: Standardize values coercion during Block initialization #19492
Comments
+1
If Side-note bc I don't want to pile on extraneous suggestions in #19268: the recent addition of |
We also need to take into account that for 'real' extension blocks (not ones we internally still adapt a little bit), can never do such "maybe coerce" logic, as they infer the It would probably be cleaner to make sure for our internal extension blocks we also only pass it correct data (so no "maybe coerce" is needed), but that might require a lot of cleaning up internally / many difficult to find cases.
I would then rather don't add dtype everywhere, and make sure we use |
Re extension blocks (internal or external), I'm finding it might be helpful to define an |
@jorisvandenbossche I don't want to de-rail the thread here, so we can discuss this in #19444 if necessary. The motivation is that to make Frame/Series/Index arithmetic/comparison ops consistent (#18824) they will need to share a single implementation. Focusing on the datetime/timedelta[/period] cases, the idea is to adapt the index implementations since they perform type/timezone checks most carefully/correctly. The sticking point is that they call Index/DatetimeIndex/TimedeltaIndex, which we will need to replace with something more generic if we're mixing the implementations into the blocks. The solution I've been kicking around is to define something like:
and replace e.g. occurrences of |
That what I wanted to response when reading up to there :-) |
@TomAugspurger closeable? if not, actionable? |
I think it can still be worthwhile to clean up coercing logic in the Block constructors (but don't know if something already happened on that front since the above discussion). |
Thats the point, I think this has been pushed about as far as it can go (until 2D EAs!) |
Can you explain what you mean? What has been changed since the discussion above? (link to PR?) |
I think this can be closed when all |
Splitting off from #19268 (which is starting to give me unicorns fairly often)
After that, all blocks eventually end up calling
Block.__init__
(aside fromScalarBlock
, which we'll ignore).Most of the subclasses init methods are
call super().__init__
It'd be nice to put 1 into favor of a
_maybe_coerce_values
method defined by each block, and remove all the subclasses init methods.The one sticking point will be
DatetimeTZBlock.__init__
, which accepts adtype
parameter that no other block does.pandas/pandas/core/internals.py
Lines 2582 to 2590 in 35812ea
We could maybe push that onto the callers, and then clean things up.
The text was updated successfully, but these errors were encountered: