Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce WebSocket frame parser complexity #8682

Merged
merged 4 commits into from
Aug 17, 2024
Merged

Reduce WebSocket frame parser complexity #8682

merged 4 commits into from
Aug 17, 2024

Conversation

bdraco
Copy link
Member

@bdraco bdraco commented Aug 11, 2024

What do these changes do?

Small improvement to reduce the WebSocket frame parser complexity

  • Use identity checks instead of equality checks for the enums.
  • The length changes that trigger a break have been turned into guards to avoid all the indent to make a code a bit more readable.
  • The duplicate code in read payload length has been removed to improve maintainability

Are there changes in behavior for the user?

no

Is it a substantial burden for the maintainers to support this?

code is a bit easier to follow/maintain.

@bdraco bdraco added backport-3.10 backport-3.11 Trigger automatic backporting to the 3.11 release branch by Patchback robot labels Aug 11, 2024
Copy link

codecov bot commented Aug 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.96%. Comparing base (4f41d05) to head (2df2d96).
Report is 1071 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8682      +/-   ##
==========================================
- Coverage   97.96%   97.96%   -0.01%     
==========================================
  Files         107      107              
  Lines       33860    33858       -2     
  Branches     3970     3970              
==========================================
- Hits        33170    33168       -2     
  Misses        513      513              
  Partials      177      177              
Flag Coverage Δ
CI-GHA 97.86% <98.24%> (-0.01%) ⬇️
OS-Linux 97.52% <98.24%> (-0.01%) ⬇️
OS-Windows 95.84% <98.24%> (-0.06%) ⬇️
OS-macOS 97.19% <98.24%> (-0.01%) ⬇️
Py-3.10.11 97.33% <98.24%> (-0.01%) ⬇️
Py-3.10.14 97.26% <98.24%> (-0.01%) ⬇️
Py-3.11.9 97.49% <98.24%> (-0.01%) ⬇️
Py-3.12.4 97.61% <98.24%> (-0.01%) ⬇️
Py-3.8.10 95.57% <98.24%> (+0.03%) ⬆️
Py-3.8.18 97.08% <98.24%> (+0.02%) ⬆️
Py-3.9.13 96.90% <98.24%> (-0.30%) ⬇️
Py-3.9.19 97.17% <98.24%> (+0.02%) ⬆️
Py-pypy7.3.16 96.76% <98.24%> (+0.02%) ⬆️
VM-macos 97.19% <98.24%> (-0.01%) ⬇️
VM-ubuntu 97.52% <98.24%> (-0.01%) ⬇️
VM-windows 95.84% <98.24%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bdraco
Copy link
Member Author

bdraco commented Aug 11, 2024

running the profile shows the code is only 2-3% faster. Thats within the margin of error. The code is a lot easier to read now though.

I think we will have to cythonize like #2837 suggests to get more out of it

@bdraco
Copy link
Member Author

bdraco commented Aug 11, 2024

_feed_data looks like it has more opportunities to optimize and cleanup

@Dreamsorcerer
Copy link
Member

I think we will have to cythonize like #2837 suggests to get more out of it

I'd really like to try throwing such code through mypyc first and seeing how that performs.

@bdraco
Copy link
Member Author

bdraco commented Aug 11, 2024

Have you had good luck with mypyc in the past? Its been a while since I tried it and wasn't able to ever use it in production because of stability issues.

@Dreamsorcerer
Copy link
Member

Have you had good luck with mypyc in the past? Its been a while since I tried it and wasn't able to ever use it in production because of stability issues.

Think it's only been considered stable for a couple of years, and was missing AsyncIterator support until around then. mypy itself seems to be stable, and I've used it for a few scripts, and in https://github.com/ruundii/bthidhub without further issues.

@bdraco bdraco changed the title Improve performance of WebSocket frame parser Reduce WebSocket frame parser complexity Aug 12, 2024
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Aug 12, 2024
@bdraco
Copy link
Member Author

bdraco commented Aug 12, 2024

Since the performance improvement is within the margin of error, I recharacterized this to be a code complexity reduction.

@bdraco bdraco marked this pull request as ready for review August 12, 2024 14:01
@Dreamsorcerer
Copy link
Member

I'd probably skip the changelog myself, seems like no user-visible changes.

@bdraco bdraco added bot:chronographer:skip This PR does not need to include a change note and removed bot:chronographer:provided There is a change note present in this PR labels Aug 12, 2024
@Dreamsorcerer
Copy link
Member

Have you had good luck with mypyc in the past? Its been a while since I tried it and wasn't able to ever use it in production because of stability issues.

Think it's only been considered stable for a couple of years, and was missing AsyncIterator support until around then. mypy itself seems to be stable, and I've used it for a few scripts, and in https://github.com/ruundii/bthidhub without further issues.

In short, even a couple of years ago, I didn't experience any stability issues personally, but didn't try it out on larger projects. The only issues were related to specific features, which were predictable. Things like AsyncIterator being missing resulted in some files not being able to compile, and there was atleast one case where the compiled code wouldn't work for some reason, along the lines of an inspection of a function needed to return that it was a function, but the compiled version is not considered a function by Python's inspect module, so was incompatible with the library code I was using.

@bdraco
Copy link
Member Author

bdraco commented Aug 12, 2024

Thanks for the additional color. I'll also ask for some recent feedback from Home Assistant devs to see if anyone has used it in a larger project.

@bdraco
Copy link
Member Author

bdraco commented Aug 12, 2024

Marc pointed out that black is using mypyc https://ichard26.github.io/blog/2022/05/compiling-black-with-mypyc-part-1/

There are probably some gotchas we might hit based on that post, but its still probably easier than re-writing it in cython, although cython might be a bit faster, and aiohttp already uses cython.

I'm on the fence on which direction to go with this.

@Dreamsorcerer
Copy link
Member

Yeah, I'd prefer to try mypyc first, as it's easier to maintain a Python file, rather than Cython + Python.
It's not obvious to me that Cython would be faster though. At some point, I'd like to try rewriting one of our Cython modules and compare performance with mypyc, though I imagine mypyc would only have a chance if it optimises out ctypes calls to llhttp into direct calls (not sure if it does or not).

@Dreamsorcerer
Copy link
Member

Also, I think the main gotchas are to do with interacting with the object, e.g. mocking, inspecting etc. The code for parsers should be fairly private, so we shouldn't expect much user code to be interacting with it.

@bdraco
Copy link
Member Author

bdraco commented Aug 12, 2024

It looks like it doesn't like the enums

% mypyc aiohttp/http_websocket.py
aiohttp/http_websocket.py:142: warning: Treating generator comprehension as list
aiohttp/http_websocket.py:163: warning: Treating generator comprehension as list
running build_ext
building 'aiohttp.http_websocket__mypyc' extension
clang -fno-strict-overflow -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk -I/Users/bdraco/aiohttp/venv/lib/python3.12/site-packages/mypyc/lib-rt -Ibuild -I/Users/bdraco/aiohttp/venv/include -I/opt/homebrew/opt/python@3.12/Frameworks/Python.framework/Versions/3.12/include/python3.12 -c build/aiohttp/__native_http_websocket.c -o build/temp.macosx-14.0-arm64-cpython-312/build/aiohttp/__native_http_websocket.o -O3 -g1 -Werror -Wno-unused-function -Wno-unused-label -Wno-unreachable-code -Wno-unused-variable -Wno-unused-command-line-argument -Wno-unknown-warning-option -Wno-unused-but-set-variable -Wno-ignored-optimization-argument -Wno-cpp
build/aiohttp/__native_http_websocket.c:5182:16: error: use of undeclared identifier 'CPyStatic_WSParserState___READ_HEADER'
    cpy_r_r5 = CPyStatic_WSParserState___READ_HEADER;
               ^
build/aiohttp/__native_http_websocket.c:6131:17: error: use of undeclared identifier 'CPyStatic_WSMsgType___CLOSE'
    cpy_r_r31 = CPyStatic_WSMsgType___CLOSE;
                ^
build/aiohttp/__native_http_websocket.c:6274:17: error: use of undeclared identifier 'CPyStatic_WSCloseCode___PROTOCOL_ERROR'
    cpy_r_r68 = CPyStatic_WSCloseCode___PROTOCOL_ERROR;
                ^
build/aiohttp/__native_http_websocket.c:6371:17: error: use of undeclared identifier 'CPyStatic_WSCloseCode___INVALID_TEXT'
    cpy_r_r95 = CPyStatic_WSCloseCode___INVALID_TEXT;
                ^
build/aiohttp/__native_http_websocket.c:6424:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___CLOSE'
    cpy_r_r104 = CPyStatic_WSMsgType___CLOSE;
                 ^
build/aiohttp/__native_http_websocket.c:6476:18: error: use of undeclared identifier 'CPyStatic_WSCloseCode___PROTOCOL_ERROR'
    cpy_r_r116 = CPyStatic_WSCloseCode___PROTOCOL_ERROR;
                 ^
build/aiohttp/__native_http_websocket.c:6601:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___CLOSE'
    cpy_r_r159 = CPyStatic_WSMsgType___CLOSE;
                 ^
build/aiohttp/__native_http_websocket.c:6656:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___PING'
    cpy_r_r175 = CPyStatic_WSMsgType___PING;
                 ^
build/aiohttp/__native_http_websocket.c:6691:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___PING'
    cpy_r_r180 = CPyStatic_WSMsgType___PING;
                 ^
build/aiohttp/__native_http_websocket.c:6742:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___PONG'
    cpy_r_r194 = CPyStatic_WSMsgType___PONG;
                 ^
build/aiohttp/__native_http_websocket.c:6777:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___PONG'
    cpy_r_r199 = CPyStatic_WSMsgType___PONG;
                 ^
build/aiohttp/__native_http_websocket.c:6828:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___TEXT'
    cpy_r_r213 = CPyStatic_WSMsgType___TEXT;
                 ^
build/aiohttp/__native_http_websocket.c:6862:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___BINARY'
    cpy_r_r217 = CPyStatic_WSMsgType___BINARY;
                 ^
build/aiohttp/__native_http_websocket.c:6900:18: error: use of undeclared identifier 'CPyStatic_WSCloseCode___PROTOCOL_ERROR'
    cpy_r_r224 = CPyStatic_WSCloseCode___PROTOCOL_ERROR;
                 ^
build/aiohttp/__native_http_websocket.c:6978:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___CONTINUATION'
    cpy_r_r245 = CPyStatic_WSMsgType___CONTINUATION;
                 ^
build/aiohttp/__native_http_websocket.c:7059:18: error: use of undeclared identifier 'CPyStatic_WSCloseCode___MESSAGE_TOO_BIG'
    cpy_r_r268 = CPyStatic_WSCloseCode___MESSAGE_TOO_BIG;
                 ^
build/aiohttp/__native_http_websocket.c:7134:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___CONTINUATION'
    cpy_r_r287 = CPyStatic_WSMsgType___CONTINUATION;
                 ^
build/aiohttp/__native_http_websocket.c:7167:18: error: use of undeclared identifier 'CPyStatic_WSCloseCode___PROTOCOL_ERROR'
    cpy_r_r291 = CPyStatic_WSCloseCode___PROTOCOL_ERROR;
                 ^
build/aiohttp/__native_http_websocket.c:7221:18: error: use of undeclared identifier 'CPyStatic_WSMsgType___CONTINUATION'
    cpy_r_r304 = CPyStatic_WSMsgType___CONTINUATION;
                 ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
error: command '/usr/bin/clang' failed with exit code 1

@Dreamsorcerer
Copy link
Member

Well, that's a disappointing start. If you can isolate a single part that causes the error, then I'd file a bug at https://github.com/mypyc/mypyc/issues

Maybe for immediate testing to get an idea of performance, we can just work around it (e.g. replace enum with ints).

@bdraco
Copy link
Member Author

bdraco commented Aug 12, 2024

It looks like there is already an issue for the problem with IntEnum mypyc/mypyc#1022

@bdraco
Copy link
Member Author

bdraco commented Aug 12, 2024

Found one more back in 2020 mypyc/mypyc#721

It doesn't look like it will get solved any time soon.

@bdraco
Copy link
Member Author

bdraco commented Aug 12, 2024

mypyc is starting to feel like a dead end. I'm not sure its worth pursing. Maybe its best to move this forward as-is, and than I can look into optimizing _feed_data a bit since without resorting to mypyc or cython.

@Dreamsorcerer
Copy link
Member

Yeah, mypyc/cython is a tangent, not related to this PR.

@bdraco bdraco merged commit 490fca6 into master Aug 17, 2024
37 of 38 checks passed
@bdraco bdraco deleted the websocket_parser branch August 17, 2024 13:52
Copy link
Contributor

patchback bot commented Aug 17, 2024

Backport to 3.10: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.10/490fca61815680a0da99b92b72484c4a14779747/pr-8682

Backported as #8727

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Copy link
Contributor

patchback bot commented Aug 17, 2024

Backport to 3.11: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.11/490fca61815680a0da99b92b72484c4a14779747/pr-8682

Backported as #8728

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot pushed a commit that referenced this pull request Aug 17, 2024
bdraco added a commit that referenced this pull request Aug 17, 2024
…lexity (#8727)

Co-authored-by: J. Nick Koston <nick@koston.org>
bdraco added a commit that referenced this pull request Aug 17, 2024
…lexity (#8728)

Co-authored-by: J. Nick Koston <nick@koston.org>
@Dreamsorcerer
Copy link
Member

It looks like there is already an issue for the problem with IntEnum mypyc/mypyc#1022

This one was just closed, so might be worth another try when the next mypy release is out..

@bdraco
Copy link
Member Author

bdraco commented Nov 30, 2024

I'm relatively happy where we landed with this one as we don't have to maintain a separate pyx file for the WebSocketReader as the pxd turned out to be enough. Would be nice to give mypyc another shot for the next project that comes up that could benefit from it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-3.11 Trigger automatic backporting to the 3.11 release branch by Patchback robot bot:chronographer:skip This PR does not need to include a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants