Fix HttpPayloadParser dealing with chunked response (#4630) #4801
The version requirement `= 0` advertised in early development stages was an unfortunate choice, as this is going to be the opt-out once the new build config validation feature will be rolled out further. Please use `~> 1.0` instead. The conditions version `v1` is now the default, so this key can be removed.
Typo guaranty changed to correct spelling
I am going to do some extra work that I found in the checklist, and then I will add those changes here.
Codecov Report
@@           Coverage Diff           @@
##           master    #4801   +/-   ##
=======================================
  Coverage   97.60%   97.60%
=======================================
  Files          43       43
  Lines        8932     8938    +6
  Branches     1406     1408    +2
=======================================
+ Hits         8718     8724    +6
  Misses         95       95
  Partials      119      119
Hello @asvetlov, can you review my PR? Also, I am wondering why CI fails at test_close in test_web_protocol.py.
aiohttp/http_parser.py (outdated)
    # end of stream
    self.payload.feed_eof()
    return True, chunk[2:]
    if len(chunk) >= 2:
Much simpler:
    head = chunk[:2]
    if not head:
        return False, b''
    if head == SEP[:1]:
        self._chunk_tail = head
        return False, b''
    if head == SEP:
        # end of stream
        self.payload.feed_eof()
        return True, chunk[2:]
    self._chunk = ChunkState.PARSE_TRAILERS
Hello @socketpair
I agree with you, so I simplified the code as you suggested.
But I moved the `if head == SEP` check to the top because it is the more likely case.
Thanks,
If the last CRLF, or only its final LF, arrives in a separate TCP segment, HttpPayloadParser wrongly concludes that trailers follow the 0\r\n in the chunked response body. It starts waiting for trailers, but the only data still to be received is the CRLF. HttpPayloadParser therefore waits for trailers indefinitely, which surfaces as a TimeoutError in user code. The problem does not reproduce when keep-alive is disabled, because the server explicitly shuts down the connection after sending all data. When the connection is closed, .feed_eof is called, which lets HttpPayloadParser stop waiting.
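The decision described above can be modelled in a few lines. This is only a minimal sketch, not aiohttp's HttpPayloadParser: the class `ChunkedTail` and its attributes are hypothetical names, and it covers only the bytes that follow the terminating 0\r\n line. The point it illustrates is that fewer than two bytes are not enough to decide between "end of body" and "trailers follow", so the parser has to hold them back until the next segment arrives.

```python
SEP = b"\r\n"


class ChunkedTail:
    """Toy model of what follows the terminating ``0\\r\\n`` chunk-size line.

    Hypothetical helper for illustration; not part of aiohttp.
    """

    def __init__(self) -> None:
        self._buf = b""           # bytes received so far after ``0\r\n``
        self.in_trailers = False  # True once a trailer line has started
        self.done = False         # True once the body is complete

    def feed(self, data: bytes) -> bool:
        """Feed one TCP segment; return True when the body is complete."""
        if self.done:
            return True
        self._buf += data
        if not self.in_trailers:
            head = self._buf[:2]
            if head == SEP:
                # Bare CRLF: no trailers, the body ends here.
                self.done = True
                return True
            if head in (b"", SEP[:1]):
                # Nothing yet, or only CR: too early to decide, keep waiting.
                return False
            # Any other bytes start a trailer section.
            self.in_trailers = True
        # A trailer section ends with an empty line.
        if SEP + SEP in self._buf:
            self.done = True
        return self.done


# The final CRLF split across two segments still ends the body.
parser = ChunkedTail()
print(parser.feed(b"\r"), parser.feed(b"\n"))
```

A parser that switches to its trailer state as soon as it sees anything other than a complete CRLF would treat the lone CR, or an empty segment, as the start of trailers and keep waiting, which matches the hang described above.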
Hello @webknjaz,
Hello @asvetlov, @webknjaz and @socketpair, I think this PR has been forgotten. Please review this PR positively. Thanks,
@rhdxmr please rebase from
RESEND PR: #4846
HttpPayloadParser waits for trailers indefinitely even if there are no trailers
in the response. This happens when only the last CRLF or the last LF is sent
in a separate TCP segment.
When the connection is keep-alive and this bug occurs, users experience a
response timeout. The problem is not exposed when keep-alive is disabled
because .feed_eof is called. (In that case ClientPayloadError is raised instead of TimeoutError.)
What do these changes do?
Fix a bug where HttpPayloadParser waits indefinitely for data that will never come.
The bug makes a caller of 'await response.read()' wait forever, or until the timeout expires.
There are a few conditions which need to be met in order to reproduce this bug.
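One of those conditions is how the response bytes are split across TCP segments. The sketch below only builds the byte sequences involved; it does not open a socket or call aiohttp. The helpers `encode_chunked` and `split_tail` are hypothetical names introduced for illustration, and the chunk size of 4 is an arbitrary choice.

```python
from typing import Tuple


def encode_chunked(body: bytes, chunk_size: int = 4) -> bytes:
    """Encode ``body`` with HTTP/1.1 chunked transfer coding, without trailers."""
    out = b""
    for start in range(0, len(body), chunk_size):
        piece = body[start:start + chunk_size]
        # Each chunk: hex size, CRLF, data, CRLF.
        out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    # Terminating zero-size chunk followed by the final CRLF.
    return out + b"0\r\n\r\n"


def split_tail(wire: bytes, tail_len: int) -> Tuple[bytes, bytes]:
    """Split ``wire`` so that the last ``tail_len`` bytes form their own segment."""
    return wire[:-tail_len], wire[-tail_len:]


wire = encode_chunked(b"hello")
# Final CRLF in its own segment, then only the final LF in its own segment.
print(split_tail(wire, 2))
print(split_tail(wire, 1))
```

Writing the two parts to a keep-alive connection as separate segments, with a short pause between them, is one way a test server could approximate the condition; whether the parts actually reach the client as separate reads depends on the network stack.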
Are there changes in behavior for the user?
Improves the experience of users who are suffering from mysterious response timeouts. In these cases the server access log shows no slow response, but the client logs a response timeout.
Related issue number
#4630
I had a problem with intermittent response timeouts, so I did a lot of debugging.
I managed to locate the cause of the problem and fixed it on my own.
Then I wanted to add test code proving that my modification really fixes it, so I learned how to run the tests in this project and ran them for the first time.
Surprisingly, issue 4630 popped up labeled XPASS. At that point I realized that the issue had already been reported 3 months ago. Sigh. If the issue had been fixed at that time, I would not have spent my time debugging this problem!
I did not mean to provide the fix on behalf of the person who reported the issue before me. I simply did not know the issue existed before I started fixing the bug.
Checklist
CONTRIBUTORS.txt
CHANGES folder
<issue_id>.<type>, for example (588.bugfix)
issue_id: change it to the pr id after creating the pr
.feature: Signifying a new feature.
.bugfix: Signifying a bug fix.
.doc: Signifying a documentation improvement.
.removal: Signifying a deprecation or removal of public API.
.misc: A ticket has been closed, but it is not of interest to users.