Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-30014: make poll-like selector's modify() method faster #1030

Merged
merged 10 commits into from
Jun 9, 2017

Conversation

giampaolo
Copy link
Contributor

@giampaolo giampaolo commented Apr 7, 2017

http://bugs.python.org/issue30014
This applies to poll(), epoll() and devpoll() selectors, and makes modify() method up to 11.4% faster under heavy loads.

…selector's modify() method instead of un/register()ing the fd
@mention-bot
Copy link

@giampaolo, thanks for your PR! By analyzing the history of the files in this pull request, we identified @cf-natali and @1st1 to be potential reviewers.

@vstinner vstinner changed the title #30014: make selectors.DefaultSelector.modify() faster bpo-30014: make selectors.DefaultSelector.modify() faster Apr 7, 2017
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall change LGTM, but I have some remarks, and I would prefer to see Charles-François Natali approve the change first, since he is the maintainer of the module.

@@ -169,6 +169,8 @@ Optimizations
using the :func:`os.scandir` function.
(Contributed by Serhiy Storchaka in :issue:`25996`.)

* :meth:`selectors.DefaultSelector.modify` on Linux, OSX and Solaris is around
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"DefaultSelector" depends on the platform, I would prefer to mention concrete classes: PollSelector, EpollSelector and DevpollSelector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Lib/selectors.py Outdated
try:
key = self._fd_to_key[self._fileobj_lookup(fileobj)]
except KeyError:
raise KeyError("{!r} is not registered".format(fileobj)) from None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: please add an empty line for readability.

Lib/selectors.py Outdated
try:
self._epoll.modify(key.fd, selector_events)
except BaseException:
super().unregister(fileobj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc says that modify() works as unregister+register, so basically the doc is still ok after your change.

Question: is it worth it to mention that modify() unregisters the FD on error, and maybe even write an unit test for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I added a test case for that.

@@ -169,6 +169,9 @@ Optimizations
using the :func:`os.scandir` function.
(Contributed by Serhiy Storchaka in :issue:`25996`.)

* :meth:`selectors.EpollSelector.modify`, :meth:`selectors.PollSelector.modify`
and :meth:`selectors.DevpollSelector.modify` are around 1.5x faster.
(Contributed by Giampaolo Rodola')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Contributed by Giampaolo Rodola' in :issue:`30014`.)

Lib/selectors.py Outdated
try:
key = self._fd_to_key[self._fileobj_lookup(fileobj)]
except KeyError:
raise KeyError("{!r} is not registered".format(fileobj)) from None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f"{fileobj!r} is not registered" would make the shorter and cleaner.

elif self.SELECTOR.__name__ == 'DevpollSelector':
patch = unittest.mock.patch('selectors.select.devpoll')
else:
return unittest.skip("")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unittest.skip() is a decorator. Using raise self.skipTest() or raise unittest.SkipTest() would be more appropriate for this use case.

Misc/NEWS Outdated
@@ -303,6 +303,10 @@ Extension Modules
Library
-------

- bpo-30014: selectors.EpollSelector.modify, selectors.PollSelector.modify
and selectors.DevpollSelector.modify are around 1.5x faster.
(Contributed by Giampaolo Rodola')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contributed by -> Patch by (but you may skip this and only attribute your name in the whatsnew document since you're already a core developer :))

@giampaolo giampaolo changed the title bpo-30014: make selectors.DefaultSelector.modify() faster bpo-30014: make poll-like selector's modify() method faster Jun 7, 2017
@giampaolo
Copy link
Contributor Author

OK, I addressed all comments in the PR.

@giampaolo
Copy link
Contributor Author

@cf-natali

if data != key.data:
changed = True

if changed:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How common is the case of unchanged events and data? Wouldn't be faster (in average) always change key and save a time for setting and checking changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how common it is but _replace() is pretty time consuming so I think it's better to avoid it if we can. Checking changed variable is definitively faster than doing this all the time:

            key = key._replace(data=data)
            self._fd_to_key[key.fd] = key

Lib/selectors.py Outdated
selector_events |= self._EVENT_WRITE
try:
self._selector.modify(key.fd, selector_events)
except BaseException:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why BaseException? _PollLikeSelector.register() uses Exception.

If BaseException is used intentionally, it can be omitted, the bare except has the same effect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed this should be Exception

@@ -234,6 +234,9 @@ Optimizations
expressions <re>`. Searching some patterns can now be up to 20 times faster.
(Contributed by Serhiy Storchaka in :issue:`30285`.)

* :meth:`selectors.EpollSelector.modify`, :meth:`selectors.PollSelector.modify`
and :meth:`selectors.DevpollSelector.modify` may be up to 11.4% faster under
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such precision is unbelievable.

Copy link
Contributor Author

@giampaolo giampaolo Jun 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This number comes from the benchmark script provided in http://bugs.python.org/issue30581. Perhaps it's better to rephrase this as "around 10% faster"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would look better to me.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the patch looks correct to me (but needs a Misc/NEWS entry). But I don't know whether is is worth to add additional coupling for 10% speedup. If the subclass of _PollLikeSelector implements register() or unregister(), it would need to reimplement also modify().

@@ -234,6 +234,9 @@ Optimizations
expressions <re>`. Searching some patterns can now be up to 20 times faster.
(Contributed by Serhiy Storchaka in :issue:`30285`.)

* :meth:`selectors.EpollSelector.modify`, :meth:`selectors.PollSelector.modify`
and :meth:`selectors.DevpollSelector.modify` may be up to 11.4% faster under
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would look better to me.

@berkerpeksag berkerpeksag dismissed their stale review June 7, 2017 21:33

All of my comments have been addressed.

@giampaolo
Copy link
Contributor Author

I think this is good to go. Any objections?

@vstinner
Copy link
Member

vstinner commented Jun 9, 2017

I think this is good to go. Any objections?

I asked why not replacing expensive "if data != key.data:" with "if data is not key.data:", but I didn't get an answer. I suggest to use cheap "is not" for speed but also avoid weird bugs on strange comparison functions (like math.nan != math.nan? :-p).

@serhiy-storchaka
Copy link
Member

I have no objections.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see my question answered ;-)

Lib/selectors.py Outdated
super().unregister(fileobj)
raise
changed = True
if data != key.data:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparison might be expensive. Would it ok to use "is not"?

It's ok to use "!=" if "is not" wouldn't work.

@giampaolo
Copy link
Contributor Author

I asked why not replacing expensive "if data != key.data:" with "if data is not key.data:", but I didn't get an answer.

Oh sorry, I missed your comment. Mmm... why would that be faster? I think I'm missing something.
FWIW also _BaseSelectorImpl.modify() method uses != instead of is not.

@vstinner
Copy link
Member

vstinner commented Jun 9, 2017

Oh sorry, I missed your comment. Mmm... why would that be faster? I think I'm missing something.

If data is a 1 GB string, data1 != data2 requires to compare all bytes. If the object implements custom eq() method, I don't think that it's a good idea to call it here.

IHMO we should only optimize the obvious case: when the same data object is passed again (but modify is called to modify tracked events for example).

FWIW also _BaseSelectorImpl.modify() method uses != instead of is not.

I suggest to also modify the base method in that case.

@giampaolo
Copy link
Contributor Author

is not is indeed a bit faster:

~/svn/cpython {bpo-30014-selectors-modify-speedup|MERGING}$ ./python -m timeit "a = '1'*1024; b = '1' * 1024 + '2'" "a != b"
1000000 loops, best of 5: 315 nsec per loop
~/svn/cpython {bpo-30014-selectors-modify-speedup|MERGING}$ ./python -m timeit "a = '1'*1024; b = '1' * 1024 + '2'" "a is not b"
1000000 loops, best of 5: 297 nsec per loop

OK, I added this change.

@gvanrossum
Copy link
Member

@serhiy-storchaka I no longer know this code very well -- do you really want my review in addition to all the other reviews you've already had? Is there a specific question you have for me?

@serhiy-storchaka
Copy link
Member

@gvanrossum, nothing specific. Just you were one of the early committers to this module (with Charles-François Natali which seems not available now) and once had committed a patch related to optimization of the modify() method (d0786a1). I though that you can know something missed by other reviewers.

Lib/selectors.py Outdated
try:
key = self._fd_to_key[self._fileobj_lookup(fileobj)]
except KeyError:
raise KeyError("{!r} is not registered".format(fileobj)) from None
if events != key.events:
self.unregister(fileobj)
key = self.register(fileobj, events, data)
elif data != key.data:
elif data is not key.data:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this change. I've found nothing in the docs constraining the __eq__ sematics of data. In the common cases, where data is either None or unchanged, __eq__ typically still takes a shortcut when comparing the object to itself. So the story about comparing two 1Gb strings seems unlikely -- either it's the same string, in which case it takes a shortcut, or it's two different strings, in which case the comparison bails at the first difference. In case the data is e.g. a list, and the new data is a different list containing the same values, this "optimization" would actually be a slow-down. In general it smells superstitious. I think the code should show the intended semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for chiming in. I changed this back.

selector_events |= self._EVENT_WRITE
try:
self._selector.modify(key.fd, selector_events)
except Exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it isn't time to switch this (and other except clauses in this module) to BaseException. Otherwise a KeyboardInterrupt (in particular) could leave us in a bad state. This has been a recurring problem in asyncio; we eventually are planning to address it in asyncio, and then having it not be a problem here would be a requirement. (Though I'd be okay with filing a separate issue for this instead of folding it into this PR.)

Lib/selectors.py Outdated
super().unregister(fileobj)
raise
changed = True
if data is not key.data:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I think != is not generally slower and closer to the promised semantics.

@gvanrossum
Copy link
Member

I'm happy now. The BaseException thing could happen another time.

@vstinner vstinner merged commit fbfaa6f into master Jun 9, 2017
@vstinner
Copy link
Member

vstinner commented Jun 9, 2017

It's always possible to find something to enhance, but IMHO the change is now good enough and so I merged it.

Ok for != vs is not, Guido opinion matters more than mine since he is the coauthor with Charles-François Natali if I recall correctly.

@vstinner vstinner deleted the bpo-30014-selectors-modify-speedup branch June 9, 2017 20:23
@vstinner
Copy link
Member

vstinner commented Jun 9, 2017

Thanks Giampaolo for this nice opyimization! I tried to convince one or two times a few months ago, but you was motivated than me and succeeded to implement a nice change to factorize the code ;-)

@vstinner
Copy link
Member

vstinner commented Jun 9, 2017

I'm happy now. The BaseException thing could happen another time.

Yeah please, it deserves its own change. When an except block ends with "raise", I prefer "except:" or "except BaseException:".

@giampaolo
Copy link
Contributor Author

I'm happy now. The BaseException thing could happen another time.

Yeah please, it deserves its own change. When an except block ends with "raise", I prefer "except:" or "except BaseException:".

http://bugs.python.org/issue30624
#2082

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants