-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-45697: Use PyObject_TypeCheck in type_call and binary_op1 #29392
bpo-45697: Use PyObject_TypeCheck in type_call and binary_op1 #29392
Conversation
err, had a typo in the commit message with the issue number. happy to amend and push, though the devguide asks to avoid force pushing, so let me know what's the preferred way to edit that :-) |
@@ -1121,7 +1121,7 @@ type_call(PyTypeObject *type, PyObject *args, PyObject *kwds) | |||
|
|||
/* If the returned object is not an instance of type, | |||
it won't be initialized. */ | |||
if (!PyType_IsSubtype(Py_TYPE(obj), type)) | |||
if (!PyObject_TypeCheck(obj, type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a new type is relatively slow. I do not think that the effect of this nanooptimization is measureable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that this code path is not for creating new types (which is slow as you say), but for creating new instances (which is quite fast).
I ran some synthetic benchmarks with this change and without it (on an opt build with PGO and LTO), and the gain is measurable:
build from main (commit acc89db which is the base commit for this PR):
>for _ in {1..10}; do ./main-opt/python -m timeit -n 10000 -s 'class Spam: pass' -- 'ten_k_spams = [Spam() for _ in range(10000)]'; done
10000 loops, best of 5: 896 usec per loop
10000 loops, best of 5: 887 usec per loop
10000 loops, best of 5: 857 usec per loop
10000 loops, best of 5: 838 usec per loop
10000 loops, best of 5: 847 usec per loop
10000 loops, best of 5: 863 usec per loop
10000 loops, best of 5: 845 usec per loop
10000 loops, best of 5: 902 usec per loop
10000 loops, best of 5: 890 usec per loop
10000 loops, best of 5: 875 usec per loop
build with this change applied (commit 2362bf6):
>for _ in {1..10}; do ./test-opt/python -m timeit -n 10000 -s 'class Spam: pass' -- 'ten_k_spams = [Spam() for _ in range(10000)]'; done
10000 loops, best of 5: 833 usec per loop
10000 loops, best of 5: 885 usec per loop
10000 loops, best of 5: 845 usec per loop
10000 loops, best of 5: 838 usec per loop
10000 loops, best of 5: 833 usec per loop
10000 loops, best of 5: 827 usec per loop
10000 loops, best of 5: 858 usec per loop
10000 loops, best of 5: 811 usec per loop
10000 loops, best of 5: 843 usec per loop
10000 loops, best of 5: 845 usec per loop
also worth noting that the previous approach (GH-29380) was implemented on the Instagram Server production workload in Meta/Facebook/Instagram (choose your favorite) and resulted measurable perf gain of around 0.2%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Thank you for correcting me. Yes, for creating instances it makes more sense.
@@ -1121,7 +1121,7 @@ type_call(PyTypeObject *type, PyObject *args, PyObject *kwds) | |||
|
|||
/* If the returned object is not an instance of type, | |||
it won't be initialized. */ | |||
if (!PyType_IsSubtype(Py_TYPE(obj), type)) | |||
if (!PyObject_TypeCheck(obj, type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Thank you for correcting me. Yes, for creating instances it makes more sense.
Summary: In D14351113, the equality check in IsSubtype was introduced as an optimization for the hot path of instance creation. When upstreaming it ([bpo-45697](https://bugs.python.org/issue45697)) it was pointed out that it's preferable to use the `PyObject_TypeCheck` in the hot callsite (which performs an inline equality check before calling to IsSubtype) instead of adding a test to all paths (including those that already use `PyObject_TypeCheck`, hence would hit the equality check twice). This diff ports the [merged upstream PR](python/cpython#29392) to cinder. Reviewed By: carljm Differential Revision: D32175230 fbshipit-source-id: 93080c5
Summary: Upstream issue: python/cpython#89860 Upstream PR: python/cpython#29392 This also ports D14351113 + D32175230 (7b5bc58) from 3.8 to 3.10 Reviewed By: alexmalyshev Differential Revision: D42229775 fbshipit-source-id: 0bd735a
bpo-45697: Use PyObject_TypeCheck in type_call and binary_op1
Based on real world profiling data we collected, a vast amount of
PyType_IsSubtype
calls are coming fromtype_call
, when it decides whether__init__
should run or not.In the common case, the arguments to this call are identical, but the implementation still walks the MRO.
Using
PyObject_TypeCheck
avoids callingPyType_IsSubtype
in the common case.This PR supersedes GH-29380 based on discussion on the issue https://bugs.python.org/issue45697
https://bugs.python.org/issue45697