Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin correct names to wrapped Index comparison methods #18397

Merged
merged 2 commits into from
Nov 22, 2017

Conversation

jbrockmendel
Copy link
Member

Currently:

>>> pd.Index.__eq__.__name__
'_evaluate_compare'

PR:

>>> pd.Index.__eq__.__name__
'__eq__'

Same for subclasses and other comparison methods.

On the side: use property decorator for DatetimeIndex.freq to avoid leaving _get_freq and _set_freq in the namespace.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@@ -3890,6 +3890,7 @@ def _evaluate_compare(self, other):
except TypeError:
return result

_evaluate_compare.__name__ = '__{name}__'.format(name=op.__name__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a function in core/generic for exactly this

(and sets properly in py2/3 and such)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closest I can think of is pd.compat.bind_method. Is that what you're referring to? Wrapped method names are set like this in e.g. indexes.datetimes._field_accessor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use pandas.compat.set_function_name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's going to get into a circularity problem. set_function_name requires cls, but in several cases this is being called before the class has been defined in the namespace.

Scroll up to line 50 in timedeltas for an example of the status quo usage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see how its done in generic.py, instead you should change these to a factor function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I'll do it, but under protest... those classmethods to _add_foo_methods are among my least favorite design patterns.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are already a well established pattern

and they handle the edge cases

eg qualname

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In indexes.base.Index the func.__name__ = name pattern is used in _add_logical_methods, _add_logical_methods_disabled, _add_numeric_methods_disabled, _add_numeric_methods_add_sub_disabled. In case you want to open an issue to change those.

index = indices
assert index.__eq__.__name__ == '__eq__'
assert index.__ne__.__name__ == '__ne__'
assert index.__le__.__name__ == '__le__'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parametrize

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I guess.

@jreback jreback added the Compat pandas objects compatability with Numpy or Python functions label Nov 21, 2017
@codecov
Copy link

codecov bot commented Nov 21, 2017

Codecov Report

Merging #18397 into master will decrease coverage by 0.02%.
The diff coverage is 94.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18397      +/-   ##
==========================================
- Coverage   91.36%   91.34%   -0.03%     
==========================================
  Files         164      164              
  Lines       49721    49736      +15     
==========================================
+ Hits        45429    45432       +3     
- Misses       4292     4304      +12
Flag Coverage Δ
#multiple 89.14% <94.44%> (-0.01%) ⬇️
#single 39.63% <88.88%> (-0.04%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/datetimes.py 95.49% <100%> (+0.01%) ⬆️
pandas/core/indexes/category.py 97.2% <100%> (ø) ⬆️
pandas/core/indexes/timedeltas.py 91.17% <100%> (+0.03%) ⬆️
pandas/core/indexes/period.py 92.94% <100%> (+0.04%) ⬆️
pandas/core/indexes/base.py 96.43% <33.33%> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/series.py 94.84% <0%> (-0.19%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/core/internals.py 94.47% <0%> (-0.08%) ⬇️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8d04daf...c36e870. Read the comment docs.

@jbrockmendel
Copy link
Member Author

I'm actually coming around to being more OK with the set_function_name pattern. Writing a test that iterates over all classes in the pd namespace and all methods in those class namespaces. Where would such a test belong?

@jreback
Copy link
Contributor

jreback commented Nov 21, 2017

for everything u can out in test_api

@jbrockmendel
Copy link
Member Author

OK, I'll do that separately since a bunch of those fail at the moment, will need to be patched in a bunch of places. This should be good-to-go on its own.

__gt__ = _td_index_cmp('__gt__')
__le__ = _td_index_cmp('__le__')
__ge__ = _td_index_cmp('__ge__')
@classmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we define the _add_comparison_method to take a function and live in indexes/base? to avoid this repeated code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to this. It isn't entirely trivial because it references DatetimeIndex, TimedeltaIndex, super... I'll take a look at how much datetimelike-specific logic is there; it may be more at home in indexes.datetimelike.

For now I'd advocate closing this fairly straightforward fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok pls add to list (opening an issue is ok too, e.g. if its a bigger thing, not likely to get to anytime soon, etc).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah there's definitely some logic shared between the timedeltas and datetimes version, but its a little bit deceptive in part because _to_m8 is defined differently in each module.

@jreback jreback added this to the 0.22.0 milestone Nov 22, 2017
@jreback jreback merged commit b6b9f3f into pandas-dev:master Nov 22, 2017
@jbrockmendel jbrockmendel deleted the set_freq branch November 22, 2017 05:52
@jbrockmendel
Copy link
Member Author

shouldn't we define the _add_comparison_method to take a function and live in indexes/base? to avoid this repeated code?

I've got an implementation of roughly this. Long story short: we're going to want to solve #18376 before trying to de-repeat this code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants