-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HTTP/3] Fix NullReferenceException on cancellation #54334
Conversation
Tagging subscribers to this area: @dotnet/ncl Issue DetailsAdd abort on Content stream disposal, fix NRE, fix abort direction, add test. Fixes #48624
|
Tagging subscribers to this area: @dotnet/ncl Issue DetailsAdd abort on Content stream disposal, fix NRE, fix abort direction, add test. Fixes #48624
|
var stream = Interlocked.Exchange(ref _stream, null!); | ||
stream.Dispose(); | ||
DisposeSyncHelper(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var stream = Interlocked.Exchange(ref _stream, null!); | |
stream.Dispose(); | |
DisposeSyncHelper(stream); | |
QuicStream stream = Interlocked.Exchange(ref _stream, null!); | |
if (stream is not null) | |
{ | |
stream.Dispose(); | |
DisposeSyncHelper(stream); | |
} |
@@ -81,8 +81,9 @@ public void Dispose() | |||
if (!_disposed) | |||
{ | |||
_disposed = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need _disposed, or does _stream being null now indicate disposal?
var stream = Interlocked.Exchange(ref _stream, null!); | ||
await stream.DisposeAsync().ConfigureAwait(false); | ||
DisposeSyncHelper(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var stream = Interlocked.Exchange(ref _stream, null!); | |
await stream.DisposeAsync().ConfigureAwait(false); | |
DisposeSyncHelper(stream); | |
QuicStream stream = Interlocked.Exchange(ref _stream, null!); | |
if (stream is not null) | |
{ | |
await stream.DisposeAsync().ConfigureAwait(false); | |
DisposeSyncHelper(stream); | |
} |
_connection.RemoveStream(_stream); | ||
_connection = null!; | ||
_stream = null!; | ||
Interlocked.Exchange(ref _connection, null!).RemoveStream(stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we've got sole ownership of the stream now due to the Interlocked in the dispose methods above, do we still need this interlocked? Or is there another caller that could get here outside of Dispose{Async}?
Also, if it is still needed, then presumably there's a race condition around nulling this out, in which case the .RemoveStream
should be ?.RemoveStream
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mostly done that to be sure the change is visible to null checks in Http3RequestStream.HandleReadResponseContentException
https://github.com/dotnet/runtime/pull/54334/files#diff-b79affd636899ac98b816c1398f9dca19bc4850a939a3815c09c5a9931094a4cR1113-R1120 In the issue I'm trying to solve, disposal and this exception handling are happening concurrently. Here #52800 (comment) @scalablecory says the change might not be visible otherwise. On the other hand, @wfurt offline told me that we usually don't worry about this for example for setting and checking _disposed
flag. I am not sure what should be the best practice here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest simply not nulling these fields out (i.e. _stream
and _connection
).
I'm not quite sure why you null them out today. Is it because you want to avoid doing stuff like Abort on the connection or AbortRead on the QuicStream if we are disposed? Because as the code stands, this is all racy and you aren't avoiding these calls consistently.
You need to either:
(a) Implement full locking here to prevent the races
(b) Avoid the races entirely by not modifying these. (There's still a race here, but it's in msquic code and they are presumably handling this today.)
@@ -1110,18 +1110,26 @@ private void HandleReadResponseContentException(Exception ex, CancellationToken | |||
throw new IOException(SR.net_http_client_execution_error, new HttpRequestException(SR.net_http_client_execution_error, abortException)); | |||
case Http3ConnectionException _: | |||
// A connection-level protocol error has occurred on our stream. | |||
_connection.Abort(ex); | |||
_connection?.Abort(ex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When can this happen? It seems weird that errors on a single HTTP stream would cause the whole connection to be killed.
throw new IOException(SR.net_http_client_execution_error, new HttpRequestException(SR.net_http_client_execution_error, ex)); | ||
} | ||
} | ||
|
||
private void CancelResponseContentRead() | ||
{ | ||
if (Volatile.Read(ref _responseDataPayloadRemaining) != -1) // -1 indicates EOS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an inherent race between setting this and checking it in ReadNextDataFrameAsync.
That means you need to handle the case where the AbortRead happens after we check this in ReadNextDataFrameAsync, but before we try to actually read the frame.
As such, I would suggest not even trying to set _responseDataPayloadRemaining
here. Just do AbortRead and let the next attempt to read from the underlying stream fail, and handle it as appropriate.
JFYI - I'll return to this PR after I'm back from vacation. Discussed with @karelz that this fix can wait. |
Closing in favor of #55724 |
Add abort on Content stream disposal, fix NRE, fix abort direction, add test.
Add volatile for EOS checking, update logging to include msquic pointers.
Fixes #48624