Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timedelta always returning False for equality tests for incompatible types #20829

Closed
Sup3rGeo opened this issue Apr 26, 2018 · 11 comments · Fixed by #21394
Closed

Timedelta always returning False for equality tests for incompatible types #20829

Sup3rGeo opened this issue Apr 26, 2018 · 11 comments · Fixed by #21394
Labels
Bug Timedelta Timedelta data type
Milestone

Comments

@Sup3rGeo
Copy link
Contributor

Sup3rGeo commented Apr 26, 2018

Code Sample, a copy-pastable example if possible

import datetime
import pandas

class CustomClass:

    def __init__(self):
        pass
       
    def __eq__(self, other):
        raise Exception("Custom Class eq")

        
custom = CustomClass()
var1 = datetime.timedelta(seconds=1)
var2 = pandas.Timedelta("1s")    

# Following code raises CustomClass exception
var1 == custom

# Following code returns False, does not call CustomClass.__eq__
var2 == custom

Problem description

I am trying to implement a version of pytest.approx for Timedeltas, which basically returns a class with a custom __eq__ implementation.

`Timedelta("5s") == AproxTimedelta(...)

However it does not work with pandas Timedelta because it always return False, so the __eq__ of AproxTimedelta object is never called.

This is not the behavior for python datetime timedelta, as can be seen in the example script.

Expected Output

Exception("Custom Class eq") raised for both cases

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.22.0
pytest: 3.5.0
pip: 10.0.0
setuptools: 38.5.1
Cython: 0.28.2
numpy: 1.14.1
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.3.1
sphinx: 1.7.1
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.1
openpyxl: 1.7.0
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Apr 26, 2018

Can you try on master?

custom = CustomClass()
var1 = datetime.timedelta(seconds=1)
var2 = pandas.Timedelta("1s")

# Following code raises CustomClass exception
var1 == custom

# Following code returns False, does not call CustomClass.__eq__
var2 == custom

## -- End pasted text --
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-1-82804cdda1a5> in <module>()
     16
     17 # Following code raises CustomClass exception
---> 18 var1 == custom
     19
     20 # Following code returns False, does not call CustomClass.__eq__

<ipython-input-1-82804cdda1a5> in __eq__(self, other)
      8
      9     def __eq__(self, other):
---> 10         raise Exception("Custom Class eq")
     11
     12

Exception: Custom Class eq

@Sup3rGeo
Copy link
Contributor Author

Same thing on master.

Note that in the example we want exceptions both in var1 == custom and var2 == custom.

@TomAugspurger
Copy link
Contributor

Ah sorry I didn't get to that because of the first exception.

cc @jbrockmendel whose been active here recently. I think this is may be non-trivial to support.

@TomAugspurger TomAugspurger added the Timedelta Timedelta data type label Apr 26, 2018
@jbrockmendel
Copy link
Member

Tom is right, this would be non-trivial to support. There is no branch of Timedelta.__eq__ (actually Timedelta.__richcmp__) that returns NotImplemented.

That said, if you were careful to always write custom == td instead of td == custom then it custom.__eq__ would get called first instead of td.__eq__.

@Sup3rGeo
Copy link
Contributor Author

@jbrockmendel I was taking a look at the code, shouldn't it be just a matter of changing this branch:

else:
if op == Py_EQ:
return False
elif op == Py_NE:
return True
raise TypeError('Cannot compare type {!r} with type ' \
'{!r}'.format(type(self).__name__,
type(other).__name__))

To return NotImplemented ?

 else: 
     return NotImplemented

@jbrockmendel
Copy link
Member

It might be reasonable to return NotImplemented instead of raising TypeError (though tracking down the affected tests and changing the expected error message would be a hassle), but the Py_EQ and Py_NE branches we wouldn't want to change because we need to accommodate non-custom classes.

Also if ApproxTimedelta happens to subclass timedelta, then the comparison wouldn't go through the linked code, but would instead go through lines 674 and 704.

Have you considered using mock to patch Timedelta.__eq__? That might be easier on your end.

@Sup3rGeo
Copy link
Contributor Author

but the Py_EQ and Py_NE branches we wouldn't want to change because we need to accommodate non-custom classes.

Could you elaborate a bit more on than? What would be an example?

Have you considered using mock to patch Timedelta.eq? That might be easier on your end.

Yep that's a good idea for time being!

Another question related to it: If I also happen to have a custom Timedelta subclass (for instance allowing to sum a timedelta with "1s" directly). Is there an easy way to make pandas use this custom when working with TimedeltaIndexes?

@jbrockmendel
Copy link
Member

Could you elaborate a bit more on than? What would be an example?

Timedelta(whatever) == 6 --> False
Timedelta(whatever) != "foo" --> True

Is there an easy way to make pandas use this custom when working with TimedeltaIndexes?

Try pd.Index([custom_timedelta], dtype='o')

@Sup3rGeo
Copy link
Contributor Author

I thought python would fallback automatically to identity comparison if both classes returned NotImplemented for a comparison.

Is it a different situation for pandas Timedelta?

@jbrockmendel
Copy link
Member

@Sup3rGeo Off the top of my head I can't think of any reason why Timedelta would behave differently, but the check-for-eq-then-ne-then-raise pattern is present in a handful of places in pandas.

Maybe try making the substitution you have in mind and see if it causes any test failures?

@Sup3rGeo
Copy link
Contributor Author

Sup3rGeo commented Jun 1, 2018

Will do it!

Just as quick snippet:

import pandas as pd

def CustomClass(object):
    def __eq__(self, other):
        print("Custom class __eq__")
        return NotImplemented

pd.Timedelta("1s") == 6 # False
pd.Timedelta("1s") == "foo" # False
pd.Timedelta("1s") == CustomClass() # False

def new_eq(self,other):
    return NotImplemented

pd.Timedelta.__eq__ = new_eq

pd.Timedelta("1s") == 6 # still False
pd.Timedelta("1s") == "foo" # still False
pd.Timedelta("1s") == CustomClass() # still False, but prints custom class eq

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants