Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http2, tls: check content-length, fix RST and GOAWAY logic #53160

Open
wants to merge 52 commits into
base: main
Choose a base branch
from

Conversation

szmarczak
Copy link
Member

@szmarczak szmarczak commented May 26, 2024

Fixes #35209
Closes #35378

Time spent: 90h

Caution

These bug fixes are potentially semver-major!

  1. Fixed stream.close(NGHTTP2_CANCEL) closing with 0 aka NGHTTP2_NO_ERROR.
  2. serverStream.destroy() closed with 0, now NGHTTP2_INTERNAL_ERROR.
  3. clientStream.destroy() closed with 0, now NGHTTP2_CANCEL.
  4. Mismatched content-length now throws NGHTTP2_PROTOCOL_ERROR as according to the spec.
  5. Fixed GOAWAY (server -> client) being greeted with GOAWAY (client -> server) as it's against the spec.
  6. Fixed streams being always closed with NGHTTP2_CANCEL on socket close, now respects goawayCode and destroyCode. The default is NGHTTP2_INTERNAL_ERROR for premature TCP FIN and TCP RST.
  7. Fixed stream.rstCode being 0 while active - docs say it should be undefined.
  8. Fixed preferring sessionCode over rstCode when destroying a stream with an error.
  9. Fixed streams being closed with NO_ERROR if session received GOAWAY with NO_ERROR and remote closed connection.
  10. Fixed streams being closed with NO_ERROR if session sent GOAWAY with NO_ERROR and destroyed.
  11. Fixed GOAWAY preventing RST_STREAM from being sent before GOAWAY.
    nghttp2 correctly assumes that it should prevent RST_STREAM from being sent but incorretly applies it to all frames in a packet instead of frames defined after GOAWAY. This is not the only thing that nghttp2 does wrong:
  12. Fixed benchmark/http2/headers.js calling client.destroy() (resulting in dropped requests).
    Now calls client.close() which closes gracefully.
  13. connectStreamSocket.destroy() now closes with NGHTTP2_CONNECT_ERROR.
  14. Fixed double GOAWAY on session.close().
  15. Fixed queued RST_STREAM and GOAWAY being dropped if Http2Session::Close() and previous writes weren't finished yet.
  16. Fixed stream frame errors closing the entire session instead of just the stream.
  17. Fixed stream frame errors sending RST_STREAM on idle streams (to reproduce 16. needs to be fixed).
  18. Fixed resetAndDestroy can not be called on tlssocket? #49484
  19. Fixed server close not being emitted if TLS socket RSTs
  20. Fixed deadlock when sending GOAWAY before receiving server's GOAWAY. It won't gracefully close because the server may never ACK our GOAWAY. Now it sends TCP RST after terminating GOAWAY.

Sorry so many bugs are fixed in a single PR but it's impossible to fix one without fixing them all!

Best if nghttp2/nghttp2#1508 got merged before this, but it has been open for years 😢
Hence the patch for nghttp2_session.c


/cc @jasnell

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/http
  • @nodejs/http2
  • @nodejs/net
  • @nodejs/security-wg

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels May 26, 2024
Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow. congrats! It would really be better if the patch was landed, yes.

@mcollina
Copy link
Member

The linter and commit validator are failing, could you check?

@szmarczak

This comment was marked as resolved.

@mcollina mcollina added request-ci Add this label to start a Jenkins CI on a PR. baking-for-lts PRs that need to wait before landing in a LTS release. dont-land-on-v18.x PRs that should not land on the v18.x-staging branch and should not be released in v18.x. labels May 29, 2024
@mcollina mcollina requested a review from jasnell May 29, 2024 12:45
@mcollina
Copy link
Member

@jasnell could you reach out to the nghttp2 maintainers and ask about the status of that PR? I would sincerely hope we wouldn't have to maintain a floating patch.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Therefore the best change - although breaking - is to make it emit an error by default.

The reason why .destroy() does not emit an error by default is that it's a Stream API. .destroy() without error is supposed to be "safe" to call.

@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label May 29, 2024
@nodejs-github-bot
Copy link
Collaborator

Comment on lines 59 to 64
req.on('error', common.expectsError({
code: 'ERR_HTTP2_STREAM_ERROR',
name: 'Error',
message: 'Stream closed with error code NGHTTP2_PROTOCOL_ERROR'
}));

Copy link
Member Author

@szmarczak szmarczak May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be removed

Copy link
Member Author

@szmarczak szmarczak May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcollina This is a result of the fix for making .destroy() not throw when sending RST. It throws only when receiving RST. This means that stream.close(1) won't emit error. Are you okay with this change? By looking at the original code for close() it looks like the author didn't mean to emit error on close().

Otherwise I would have to make stream.close() set stream[kState].rstCode and then call this.destroy(error).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually my understanding was wrong. It shouldn't be removed indeed.

It should error via onStreamClose (so pipeline fails etc.) but fails to do so because

this.destroy();

gets called first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fluent in stream internals but setImmediate seems to work here (in afterShutdown), so I went with that.

Looks like the main cause is that writable shutdown is delayed by single tick in order to attach the END_STREAM flag. It's a pity nghttp2 has no API to amend the last queued DATA frame.

@szmarczak

This comment was marked as resolved.

@szmarczak

This comment was marked as outdated.

@mcollina
Copy link
Member

Have you tried pinging nghttp2?

@szmarczak

This comment was marked as resolved.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mcollina mcollina added the request-ci Add this label to start a Jenkins CI on a PR. label May 30, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label May 30, 2024
@nodejs-github-bot
Copy link
Collaborator

@szmarczak

This comment was marked as resolved.

@szmarczak

This comment was marked as off-topic.

@mcollina mcollina added the request-ci Add this label to start a Jenkins CI on a PR. label May 31, 2024
@szmarczak szmarczak changed the title fix(http2): check content-length, fix sending RST http2, tls: check content-length, fix RST and GOAWAY logic Oct 10, 2024
@szmarczak
Copy link
Member Author

@mcollina Can you trigger CI again? Thanks 🙏

The lint errors and test failures on GitHub Actions seem to be unrelated to this PR.

@mcollina mcollina added the request-ci Add this label to start a Jenkins CI on a PR. label Oct 12, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 12, 2024
@nodejs-github-bot
Copy link
Collaborator

@@ -3326,7 +3324,7 @@ function socketOnClose() {

state.streams.forEach(closeStream);
state.pendingStreams.forEach(closeStream);
session.close();
Copy link
Member Author

@szmarczak szmarczak Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

session.close() (with this PR) calls session.destroy() if state.streams and state.pendingStreams are empty. Therefore it could obstruct session[kMaybeDestroy](err). Now it just marks the session as closed, hopefully fixing Windows: test-http2-server-sessionerror.js and test-http2-too-many-settings.js that do not create any streams.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed it fixed that!

@szmarczak
Copy link
Member Author

@mcollina Can you trigger CI again? Hopefully Windows passes this time 🙏🏼

@mcollina mcollina added the request-ci Add this label to start a Jenkins CI on a PR. label Oct 17, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 17, 2024
@nodejs-github-bot
Copy link
Collaborator

@szmarczak
Copy link
Member Author

Windows passes now.

test-http2-large-file is flaky and I couldn't reproduce:

image

test-http-server-request-timeouts-mixed is also flaky.

test-http2-session-unref fails on aix. Why do we even support aix? I can't even work on this because it's so exclusive.

IMO this PR is good to go. I'd prefer if this landed first in v23, then backport to v22 if necessary.

@mcollina
Copy link
Member

test-http2-session-unref fails on aix. Why do we even support aix? I can't even work on this because it's so exclusive.

@mhdawson can you help with this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
baking-for-lts PRs that need to wait before landing in a LTS release. c++ Issues and PRs that require attention from people who are familiar with C++. dont-land-on-v18.x PRs that should not land on the v18.x-staging branch and should not be released in v18.x. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run.
Projects
None yet
4 participants