Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Consistent division by zero behavior for Index/Series #27321

Merged
merged 29 commits into from
Jul 11, 2019

Conversation

jbrockmendel
Copy link
Member

We have two very-similar functions for masking division by zero for Index vs Series. The Index version is the more correct of the two, so I'm trying to make the Series one behave more like the Index one, eventually just using the Index one for both.

The hard part here is that IntegerArray, SparseArray, and SparseSeries have their own arithmetic implementations, and we test that they behave the same as Series. Several attempts to patch these implementations have failed (and motivated #27302, which this is rebased on top of), so for now this PR patches the tests instead of the implementations.

I think the IntegerArray implementation should be relatively easy to patch. SparseArray I'll need some help (cc @TomAugspurger) with __setitem__ and/or putmask.

@jreback jreback added Compat pandas objects compatability with Numpy or Python functions Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jul 10, 2019
@jreback jreback added this to the 0.25.0 milestone Jul 10, 2019
pandas/core/sparse/frame.py Outdated Show resolved Hide resolved
@@ -424,6 +426,11 @@ def _compare_to_dense(a, b, da, db, op):
sparse_result = op(a, b)
dense_result = op(da, db)

if op in [operator.floordiv, ops.rfloordiv] or isinstance(op, LambdaType):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this not just callable(op)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't want e.g. operator.add

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k can you add that as a comment (future PR ok), as its not obvious

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@@ -332,6 +332,9 @@ def test_ser_divmod_zero(self, dtype1, any_real_dtype):
right = pd.Series([0, 2]).astype(dtype2)

expected = left // right, left % right
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in future PR can you give some comments on what you ar doing here (filling np.inf on the integer divide)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@jreback
Copy link
Contributor

jreback commented Jul 10, 2019

k looks fine (except for the comments which can be addressed as a followup), but we need a note for the changes in Series yes? (e.g. they are very edge casey but need a mention)

@jbrockmendel
Copy link
Member Author

will add note. do we want to try to fix the IntegerArray and SparseArray behavior here too? IntegerArray I think I can figure out but SparseArray I've beat my head against and got nowhere.

@jreback
Copy link
Contributor

jreback commented Jul 11, 2019

will add note. do we want to try to fix the IntegerArray and SparseArray behavior here too? IntegerArray I think I can figure out but SparseArray I've beat my head against and got nowhere.

no let's do as followups.

@jreback
Copy link
Contributor

jreback commented Jul 11, 2019

lgtm ping on green (note that I just merged a patch to make the numpy dev branch pass) so you may need to rebase again.

@jbrockmendel
Copy link
Member Author

ping

@jreback jreback merged commit 3885575 into pandas-dev:master Jul 11, 2019
@jbrockmendel jbrockmendel deleted the divzero3 branch July 11, 2019 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG/API Series/Index integer floordiv/mod inconsistency
2 participants