Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-45697: Use PyObject_TypeCheck in type_call and binary_op1 #29392

Merged
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Objects/typeobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -1121,7 +1121,7 @@ type_call(PyTypeObject *type, PyObject *args, PyObject *kwds)

/* If the returned object is not an instance of type,
it won't be initialized. */
if (!PyType_IsSubtype(Py_TYPE(obj), type))
if (!PyObject_TypeCheck(obj, type))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a new type is relatively slow. I do not think that the effect of this nanooptimization is measureable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that this code path is not for creating new types (which is slow as you say), but for creating new instances (which is quite fast).

I ran some synthetic benchmarks with this change and without it (on an opt build with PGO and LTO), and the gain is measurable:

build from main (commit acc89db which is the base commit for this PR):

>for _ in {1..10}; do ./main-opt/python -m timeit -n 10000 -s 'class Spam: pass' -- 'ten_k_spams = [Spam() for _ in range(10000)]'; done
10000 loops, best of 5: 896 usec per loop
10000 loops, best of 5: 887 usec per loop
10000 loops, best of 5: 857 usec per loop
10000 loops, best of 5: 838 usec per loop
10000 loops, best of 5: 847 usec per loop
10000 loops, best of 5: 863 usec per loop
10000 loops, best of 5: 845 usec per loop
10000 loops, best of 5: 902 usec per loop
10000 loops, best of 5: 890 usec per loop
10000 loops, best of 5: 875 usec per loop

build with this change applied (commit 2362bf6):

>for _ in {1..10}; do ./test-opt/python -m timeit -n 10000 -s 'class Spam: pass' -- 'ten_k_spams = [Spam() for _ in range(10000)]'; done
10000 loops, best of 5: 833 usec per loop
10000 loops, best of 5: 885 usec per loop
10000 loops, best of 5: 845 usec per loop
10000 loops, best of 5: 838 usec per loop
10000 loops, best of 5: 833 usec per loop
10000 loops, best of 5: 827 usec per loop
10000 loops, best of 5: 858 usec per loop
10000 loops, best of 5: 811 usec per loop
10000 loops, best of 5: 843 usec per loop
10000 loops, best of 5: 845 usec per loop

also worth noting that the previous approach (GH-29380) was implemented on the Instagram Server production workload in Meta/Facebook/Instagram (choose your favorite) and resulted measurable perf gain of around 0.2%

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Thank you for correcting me. Yes, for creating instances it makes more sense.

return obj;

type = Py_TYPE(obj);
Expand Down