Timestamp comparison inconsistency #22017

Closed
jbrockmendel opened this issue Jul 22, 2018 · 1 comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Bug Datetime Datetime data dtype

Comments

@jbrockmendel
Member

While trying to align the behavior of DataFrame and Series ops, a sticking point turns out to be Timestamp comparisons:

import pandas as pd

ts = pd.Timestamp.now()
df = pd.DataFrame({'A': [1.0, 2.0],
                   'B': [1, 2],
                   'C': ['foo', 'bar']})

>>> df < ts
      A     B     C
0  True  True  True
1  True  True  True
>>> ts < df
      A     B     C
0  True  True  True
1  True  True  True

So first off, this behavior is just nuts. But let's look at Series behavior:

>>> df['A'] < ts
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1167, in na_op
    result = method(y)
  File "pandas/_libs/tslibs/timestamps.pyx", line 170, in pandas._libs.tslibs.timestamps._Timestamp.__richcmp__
TypeError: Cannot compare type 'Timestamp' with type 'float'

>>> df['B'] == ts
0    False
1    False
Name: B, dtype: bool

>>> df['B'] > ts
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1167, in na_op
    result = method(y)
  File "pandas/_libs/tslibs/timestamps.pyx", line 170, in pandas._libs.tslibs.timestamps._Timestamp.__richcmp__
TypeError: Cannot compare type 'Timestamp' with type 'int'

>>> df['C']
0    foo
1    bar
Name: C, dtype: object
>>> df['C'] <= ts
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1143, in na_op
    result = _comp_method_OBJECT_ARRAY(op, x, y)
  File "/usr/local/lib/python3.7/site-packages/pandas/core/ops.py", line 1122, in _comp_method_OBJECT_ARRAY
    result = libops.scalar_compare(x, y, op)
  File "pandas/_libs/ops.pyx", line 98, in pandas._libs.ops.scalar_compare
  File "pandas/_libs/tslibs/timestamps.pyx", line 170, in pandas._libs.tslibs.timestamps._Timestamp.__richcmp__
TypeError: Cannot compare type 'Timestamp' with type 'str'

The Series behavior defers to the Timestamp implementation, which is Technically Correct: equality with a non-datetime value returns False, while ordering comparisons raise TypeError. @jreback Are there backward-compat reasons for the existing behavior of the DataFrame implementation, or can it go?
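
For reference, the Timestamp behavior matches the usual Python convention for mismatched types. The sketch below uses the standard library's `datetime.datetime` (which `pd.Timestamp` subclasses) as a stand-in, so it only approximates what pandas does; the helper name `try_compare` is hypothetical and exists purely for illustration.

```python
import operator
from datetime import datetime


def try_compare(op, left, right):
    """Return op(left, right), or a short description of the TypeError it raises."""
    try:
        return op(left, right)
    except TypeError as err:
        return f"TypeError: {err}"


now = datetime.now()  # stand-in for pd.Timestamp.now(); an assumption, not pandas itself

# Equality between unrelated types falls back to "not equal".
eq_result = try_compare(operator.eq, 1, now)

# Ordering between unrelated types has no sensible answer, so Python raises.
lt_result = try_compare(operator.lt, 1.0, now)
le_result = try_compare(operator.le, "foo", now)

print(eq_result)
print(lt_result)
print(le_result)
```

The first line printed is `False`; the other two are TypeError messages, mirroring the `==` versus `<`/`>`/`<=` split seen in the Series examples above.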

If we're OK with raising, then I'm close to being ready to dispatch DataFrame ops to their Series counterparts, at which point we can get rid of BlockManager.eval and Block.eval.
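
To make the idea of dispatching frame-level ops to per-column ops concrete, here is a rough sketch. It is not the pandas implementation: a plain dict of lists stands in for a DataFrame, `datetime.datetime` stands in for `pd.Timestamp`, and `compare_columnwise` is a hypothetical helper named only for this example.

```python
import operator
from datetime import datetime


def compare_columnwise(frame, other, op):
    """Apply op(value, other) to each column in turn.

    `frame` is a plain dict mapping column name -> list of values, used here
    as a toy stand-in for a DataFrame. Any TypeError raised by an element
    comparison propagates, just as the Series comparisons do.
    """
    return {
        name: [op(value, other) for value in column]
        for name, column in frame.items()
    }


frame = {"A": [1.0, 2.0], "B": [1, 2], "C": ["foo", "bar"]}
now = datetime.now()  # stand-in for pd.Timestamp.now()

# Equality is defined for every column and is False everywhere.
eq_frame = compare_columnwise(frame, now, operator.eq)

# Ordering raises as soon as the first column is compared.
try:
    compare_columnwise(frame, now, operator.lt)
    lt_raised = False
except TypeError:
    lt_raised = True

print(eq_frame)
print(lt_raised)
```

With this shape the frame-level result can never disagree with the column-level result, which is the consistency being asked for here.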

@gfyoung gfyoung added Datetime Datetime data dtype Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Bug labels Jul 23, 2018
@jbrockmendel
Member Author

Closed by #22163.
