Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand groupby dispatch whitelist (GH5480) #5604

Merged
merged 1 commit into from
Dec 7, 2013

Conversation

gdraps
Copy link
Contributor

@gdraps gdraps commented Nov 28, 2013

(and create a groupby dispatch blacklist)

closes #5480.

@jtratner
Copy link
Contributor

looks fine to me - @jreback any objections? I think this needs to end up in 0.13

@jreback
Copy link
Contributor

jreback commented Nov 30, 2013

my problem with this is we have a whitelist; by definition anything not on it is blacklist. not a big fan of listing all kinds of methods (bad enough that have it on whitelist), but going to be a problem as there are easy way to maintain this

so why is there a blacklist? and not just disallow not on whitelist - I thought this pr was just going to add some methods to whitelist which is ok

@jreback
Copy link
Contributor

jreback commented Nov 30, 2013

I however do like the idea of testing all methods on the whitelist (and hard coding in the test suite) so that if something is changed the test will fail (also should compare the whitelist to the defined whitelist) and then also test that not on whitelist will fail (with a sample of methods)

@jreback
Copy link
Contributor

jreback commented Nov 30, 2013

remember that the worst case is something is left off of whitelist - you can still do it via apply not just automatic dispatch - as long as most common are on their it is fine

@gdraps
Copy link
Contributor Author

gdraps commented Dec 2, 2013

@jreback, a blacklist was created to block access to some methods while deprecating access to others (my interpretation of the comments in #5480). In summary, the changes in this PR are:

  1. add the following methods to the groupby whitelist:

    'dtypes', 'value_counts', 'mad', 'any', 'all', 'irow', 'take', 'shift', 'tshift', 'ffill', 'bfill', 'pct_change', 'skew', 'corr', 'corrwith', 'cov'
    
  2. create a blacklist to block access to:
    'eval', 'query', to_*

  3. change AttributeError to a DeprecationWarning when accessing methods not in either whitelist or blacklist

Re: testing, I added a new commit to this PR with improved whitelist testing (the full whitelist is hard-coded in the test suite now and compared to the defined whitelist). Since some methods in the whitelist are unique to Series or DataFrame, the whitelist needed to be refactored a bit, but that should clean up erroneous tab completion results (i.e., dtype on a DataFrameGroupBy object).

Let me know if I should back out changes 2 & 3, or if there are any other suggestions.

@jreback
Copy link
Contributor

jreback commented Dec 2, 2013

  1. is fine

I am -1 on blacklist; this is going to be impossible to maintain. users will see the error and see that they are using an unsupported access method. If a method which is not on the list and should be then it can be added (as you are doing in 1), otherwise apply can be used.

so I would back out 2 & 3 (but leave in tests for non-whitelist methods)

@gdraps
Copy link
Contributor Author

gdraps commented Dec 2, 2013

Ok, thanks @jreback. I've backed out 2 & 3, rebased, and squashed the commits.

@jreback
Copy link
Contributor

jreback commented Dec 2, 2013

looks ok to me...@jtratner?

@jreback
Copy link
Contributor

jreback commented Dec 3, 2013

can u add cumcount to list as well (since its a defined method it bypasses this anyhow, but for consistency we have these methods in the whitelist ), this is new method in 0.13

- Create separate whitelists for SeriesGroupBy and DataFrameGroupBy objects
- Improve groupby whitelist testing
@gdraps
Copy link
Contributor Author

gdraps commented Dec 3, 2013

np. updated the pr to include cumcount in the whitelist.

@jreback
Copy link
Contributor

jreback commented Dec 5, 2013

this looks ok..>@jtratner?

jreback added a commit that referenced this pull request Dec 7, 2013
Expand groupby dispatch whitelist (GH5480)
@jreback jreback merged commit 05c9a74 into pandas-dev:master Dec 7, 2013
@gdraps gdraps deleted the groupby-dispatch-part-2 branch January 4, 2014 01:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: value_counts / shift not in groupby dispatch whitelist (0.12-dev)
3 participants