Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Make dict iterators real iterators, provide "next()" in Python 2 #12299

Closed
wants to merge 2 commits into from

Conversation

toobaz
Copy link
Member

@toobaz toobaz commented Feb 11, 2016

Currently, compat.itervalues and friends are not real iterators (under Python 3): compare

In [3]: next(six.itervalues({1:2}))
Out[3]: 2

with

In [4]: next(pd.compat.itervalues({1:2}))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-177f5d3e9571> in <module>()
----> 1 next(pd.compat.itervalues({1:2}))

TypeError: 'dict_values' object is not an iterator

This PR fixes this (it drops support for **kwargs: although it is supported by six, I fail to see what's the utility, and it's anyway apparently not used in the pandas codebase - but if I'm missing something, it is trivial to reintroduce it) and provides next() in Python 2.

@jreback
Copy link
Contributor

jreback commented Feb 12, 2016

Why don't we just use the six code almost directly?

if PY3:
    def iterkeys(d, **kw):
        return iter(d.keys(**kw))

    def itervalues(d, **kw):
        return iter(d.values(**kw))

    def iteritems(d, **kw):
        return iter(d.items(**kw))

    def iterlists(d, **kw):
        return iter(d.lists(**kw))

    viewkeys = operator.methodcaller("keys")

    viewvalues = operator.methodcaller("values")

    viewitems = operator.methodcaller("items")
else:
    def iterkeys(d, **kw):
        return d.iterkeys(**kw)

    def itervalues(d, **kw):
        return d.itervalues(**kw)

    def iteritems(d, **kw):
        return d.iteritems(**kw)

    def iterlists(d, **kw):
        return d.iterlists(**kw)

    viewkeys = operator.methodcaller("viewkeys")

    viewvalues = operator.methodcaller("viewvalues")

    viewitems = operator.methodcaller("viewitems")

@toobaz
Copy link
Member Author

toobaz commented Feb 12, 2016

Makes sense, except that you could not use anymore compat.iteritems(NDFrame), given that NDFrame.iter is not callable (while the six Python 3 version expects it to be so, since dict.iter is).

But still the right thing to do is probably to (also) change each compat.iteritems(NDFrame) to iter(NDFrame.iteritems()). There aren't tons of them.

(then, I still fail to understand the **kw thing, but never mind)

@jreback
Copy link
Contributor

jreback commented Feb 12, 2016

yeh the compat was done a long time ago, its possible the actual six code was changed since then (to be better) and we have the original. ok with these kinds of changes (e.g. they do a slightly better check for next compat)

@kawochen
Copy link
Contributor

the keyword argument passing is needed to generate the correct error messages.

@toobaz
Copy link
Member Author

toobaz commented Feb 12, 2016

... I still don't understand: which error messages?

@kawochen
Copy link
Contributor

when you pass a keyword argument, the real function gets to tell you that it doesn't need any. the other error messages still are wrong tho. but these things are not super important

@toobaz
Copy link
Member Author

toobaz commented Feb 12, 2016

Got it!

@toobaz toobaz force-pushed the compat_iters branch 4 times, most recently from cfdf86b to 7643439 Compare February 14, 2016 14:09
@toobaz
Copy link
Member Author

toobaz commented Feb 14, 2016

Above, I clearly made a mistake writing NDFrame.iter rather than NDFrame.items, and anyway it is not NDFrame, but only Series and DataFrame.

That said, I ended up

  • making my compat code semantically equivalent to six - but tell me if you also want it more verbose
  • changing compat.iter*(obj) to obj.iter*() only when obj was an NDFrame with ndim (possibly) >2

What I didn't do but may be worth doing in the medium/long run:

  • changing compat.iter*(obj) to obj.iter*() whenever obj is an NDFrame (and in the short run, not using the former in new code)
  • removing Series.items() and DataFrame.items() (they are aliases for iteritems(), not even documented)

It is true that one driver of the decision would be to avoid problems with features (Panel, in which items is an axis) which will be probably deprecated... but I don't see any downside (apart from a minor one: we are sticking to Python 2-sounding method names).

@toobaz
Copy link
Member Author

toobaz commented Feb 21, 2016

Is anything expected from me here?

@jreback
Copy link
Contributor

jreback commented Feb 21, 2016

write out all of the iter methods there strongly a couple instead of evaluating them at run time

@toobaz
Copy link
Member Author

toobaz commented Mar 3, 2016

ping

@jreback jreback added this to the 0.18.0 milestone Mar 3, 2016
@jreback jreback closed this in 9313089 Mar 3, 2016
@jreback
Copy link
Contributor

jreback commented Mar 3, 2016

thanks @toobaz

@toobaz toobaz deleted the compat_iters branch March 3, 2016 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants