Categorical Arithmetic/Comparison Rules #19513
A case derived from
|
there are some issues about this already. see if you can find them. |
Closest issues with categorical label are #18050, #8995, neither of which seem to differentiate between the behavior of @TomAugspurger any thoughts on the One True Implementation for these operations? |
No strong thoughts, but I suppose a decent heuristic is "what would the result be if NumPy had a categorical dtype?" In that case, let's just do the same as
Does that make sense? |
I should clarify. The question isn't what the return type should be, but which operations are allowed at all. The easy (i.e. pretty obviously wrong ATM) part is cases where
The less obvious part is what casting rules are allowed. ATM
The goal here is to have one implementation of each of these methods (ideally in Categorical) and then have the other classes wrap that instead of re-implementing and risking having the logic diverge. |
Categorical is probably a bit strict about these checks ATM. Categorical and CI should behave exactly the same. All of these (Series, Cat, CI) should respect ordered for comparison purposes (IOW the ordered flags have to be the same). |
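As a rough illustration of the two comments above (one shared implementation, with every wrapper respecting ordered), here is a minimal pure-Python sketch. ToyCategorical and ToyWrapper are hypothetical names, not pandas classes, and the specific rules encoded in _compare are assumptions made for illustration rather than a statement of what pandas does.

```python
import operator


class ToyCategorical:
    """Hypothetical stand-in for a categorical array (not the pandas class)."""

    def __init__(self, values, categories, ordered=False):
        self.values = list(values)
        self.categories = list(categories)
        self.ordered = ordered

    def _compare(self, other, op):
        # The single place where the comparison rules live (assumed rules).
        if not isinstance(other, ToyCategorical):
            return NotImplemented
        if self.categories != other.categories or self.ordered != other.ordered:
            raise TypeError("categories and ordered must match")
        if op not in (operator.eq, operator.ne) and not self.ordered:
            raise TypeError("unordered categoricals only support == and !=")
        # Compare by position in the category list, not by raw value.
        rank = {cat: i for i, cat in enumerate(self.categories)}
        return [op(rank[a], rank[b]) for a, b in zip(self.values, other.values)]

    def __eq__(self, other):
        return self._compare(other, operator.eq)

    def __ne__(self, other):
        return self._compare(other, operator.ne)

    def __lt__(self, other):
        return self._compare(other, operator.lt)

    def __gt__(self, other):
        return self._compare(other, operator.gt)


def _unwrap(obj):
    # Pull the underlying categorical out of a wrapper, if there is one.
    return obj.cat if isinstance(obj, ToyWrapper) else obj


class ToyWrapper:
    """Hypothetical Series/CategoricalIndex stand-in that only delegates."""

    def __init__(self, cat):
        self.cat = cat

    def __eq__(self, other):
        return self.cat == _unwrap(other)

    def __ne__(self, other):
        return self.cat != _unwrap(other)

    def __lt__(self, other):
        return self.cat < _unwrap(other)

    def __gt__(self, other):
        return self.cat > _unwrap(other)


a = ToyCategorical(["lo", "hi"], ["lo", "hi"], ordered=True)
b = ToyCategorical(["hi", "hi"], ["lo", "hi"], ordered=True)
unordered = ToyCategorical(["hi", "hi"], ["lo", "hi"], ordered=False)

direct = a < b
wrapped = ToyWrapper(a) < ToyWrapper(b)
```

Because the wrapper holds no comparison logic of its own, the direct and wrapped results cannot diverge, and a mismatch in ordered raises the same TypeError through either path.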
It looks like |
One more: What should @jschendel are you the appropriate person to ping on this? |
I don't think I have any type of definitive say on this; mostly worked on categorical vs. categorical comparisons. I agree that this probably shouldn't be supported. I suppose you could perform the operation on the underlying category values, but it seems like you'd run into ambiguous corner cases pretty quickly. For example, would you actually want to allow add/sub if the categories were ordered with a non-standard ordering, e.g. Also doesn't look like add/sub at the scalar level is implemented for |
The comparisons in the first comment all raise the same error now. This could use a test if there are none.
|
Closing as fixed. |
Series[not-categorical] > CategoricalIndex is inconsistent with the reversed operation. Which one is canonical?
I'm guessing the right thing to do is to a) have Series[categorical].__op__ wrap CategoricalIndex.__op__, and b) have Series[non-categorical] dispatch to the reversed op for is_categorical_dtype(other). I want to confirm before making a PR.
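Part b) of this proposal can be sketched with Python's reflected-comparison protocol: when the left operand's method returns NotImplemented, Python retries with the swapped method on the right operand. The classes below (CatLike, PlainSeries) are hypothetical stand-ins invented for this sketch, and the isinstance check plays the role of the is_categorical_dtype(other) test; this is not how pandas itself is implemented.

```python
class CatLike:
    """Hypothetical categorical-like container that owns the comparison logic."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        return [mine < theirs for mine, theirs in zip(self.values, other.values)]

    def __gt__(self, other):
        return [mine > theirs for mine, theirs in zip(self.values, other.values)]


class PlainSeries:
    """Hypothetical non-categorical container."""

    def __init__(self, values):
        self.values = list(values)

    def __gt__(self, other):
        if isinstance(other, CatLike):
            # Decline, so Python retries as other.__lt__(self).
            return NotImplemented
        return [mine > theirs for mine, theirs in zip(self.values, other.values)]

    def __lt__(self, other):
        if isinstance(other, CatLike):
            # Decline, so Python retries as other.__gt__(self).
            return NotImplemented
        return [mine < theirs for mine, theirs in zip(self.values, other.values)]


plain = PlainSeries([1, 5, 3])
cat = CatLike([2, 2, 2])

forward = plain > cat   # handled by CatLike.__lt__ via the reflected call
reverse = cat < plain   # handled by CatLike.__lt__ directly
```

With this arrangement the forward and reversed operations run the same code, so they cannot disagree, which is the inconsistency the issue describes.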