Do not reuse gRPC connections that fail with native Netty errors #23150
Conversation
Such connections are usually not in a recoverable state and should not be used for retries, which would otherwise likely fail in the same way.
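A minimal sketch of the policy this describes, with every name below hypothetical (this is not Bazel's actual remote-execution code): a connection that fails with a fatal native Netty error is evicted, so the retry runs on a fresh connection rather than the broken one.

```java
import java.io.IOException;
import java.util.function.Supplier;

// Illustrative only: every name below is an assumption made for this sketch.
final class RetrySketch {

  /** Minimal stand-in for a (possibly cached) gRPC connection. */
  interface Connection {
    /** Closes the connection and drops it from whatever cache holds it. */
    void evict();
  }

  @FunctionalInterface
  interface Rpc<T> {
    T execute(Connection connection) throws IOException;
  }

  static <T> T callWithRetries(Supplier<Connection> connections, Rpc<T> rpc, int maxAttempts)
      throws IOException {
    IOException lastError = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      Connection connection = connections.get();
      try {
        return rpc.execute(connection);
      } catch (IOException e) {
        lastError = e;
        if (e instanceof io.netty.channel.unix.Errors.NativeIoException) {
          // The underlying channel is fundamentally broken; evict it so the
          // next attempt gets a fresh connection instead of failing the same
          // way on the same socket.
          connection.evict();
        }
      }
    }
    throw lastError; // assumes maxAttempts >= 1
  }
}
```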
```java
var call = mock(ClientCall.class);
doAnswer(
    invocationOnMock -> {
      ((ClientCall.Listener) invocationOnMock.getArgument(0))
```
This is more of a whitebox test than I would have liked, but I didn't find another way to simulate a connection failure. Open to suggestions here to make this more realistic.
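For context, a self-contained version of that kind of stubbing could look roughly like this (Mockito-based sketch; the helper name, the UNAVAILABLE status, the errno value, and the exact `NativeIoException` constructor arguments are assumptions, not the PR's actual test code):

```java
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doAnswer;
import static org.mockito.Mockito.mock;

import io.grpc.ClientCall;
import io.grpc.Metadata;
import io.grpc.Status;
import io.netty.channel.unix.Errors;

final class FailingCallStub {

  // Hypothetical helper: returns a ClientCall that, once started, immediately
  // reports a low-level native I/O failure to its listener, simulating a
  // connection that broke at the transport level.
  @SuppressWarnings("unchecked")
  static ClientCall<Object, Object> nativeFailureCall() {
    ClientCall<Object, Object> call = mock(ClientCall.class);
    doAnswer(
            invocation -> {
              ClientCall.Listener<Object> listener = invocation.getArgument(0);
              listener.onClose(
                  Status.UNAVAILABLE.withCause(
                      new Errors.NativeIoException("connect", /* expectedErr= */ -111)),
                  new Metadata());
              return null;
            })
        .when(call)
        .start(any(), any());
    return call;
  }
}
```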
```java
private static boolean isFatalError(@Nullable Throwable t) {
  // A low-level netty error indicates that the connection is fundamentally broken
  // and should not be reused for retries.
  return t instanceof Errors.NativeIoException;
}
```
We could extend this to cover certain status codes, but I wanted to start with something limited enough to not introduce performance regressions.
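For reference, one way such an extension might look (purely a sketch of the idea, not part of this PR; treating INTERNAL as an additional fatal code is an illustrative assumption):

```java
private static boolean isFatalError(@Nullable Throwable t) {
  if (t == null) {
    return false;
  }
  if (t instanceof Errors.NativeIoException) {
    // A low-level Netty error indicates that the connection is fundamentally
    // broken and should not be reused for retries.
    return true;
  }
  // Sketch of the suggested extension: also treat selected gRPC status codes
  // as fatal. INTERNAL is just an example choice here.
  return Status.fromThrowable(t).getCode() == Status.Code.INTERNAL;
}
```

Since `Status.fromThrowable` walks the cause chain, this would also catch codes wrapped in a `StatusRuntimeException`.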
This may also fix #10159 and some other related SSL issues.
#18764 was another example. You're probably right that the detection here wouldn't see through the UNAVAILABLE status to the handshake timeout. What I found was that the SSLException was sticky, and I couldn't manipulate the channel persistence management levels enough to get it reliably fixed or reliably reproducing before I started diagnosing the underlying network issues.
Thanks for the PR! LGTM.
@bazel-io fork 7.4.0
FYI:
Can we cherry-pick this into a release that's coming sooner than October? Maybe a 7.3.2?
Such connections are usually not in a recoverable state and should not be used for retries, which would otherwise likely fail in the same way.

Fixes bazelbuild#20868
Closes bazelbuild#23150.

PiperOrigin-RevId: 662091153
Change-Id: Iaf160b11a13af013b9969c7fdaa966bca8ab6be2
I don't know when the 7.4.0 RCs will be; that would be fine with me too, but currently we can't use HEAD commits because of bazelbuild/continuous-integration#1402.
Do not reuse gRPC connections that fail with native Netty errors (#23343)

Such connections are usually not in a recoverable state and should not be used for retries, which would otherwise likely fail in the same way.

Fixes #20868
Closes #23150.

PiperOrigin-RevId: 662091153
Change-Id: Iaf160b11a13af013b9969c7fdaa966bca8ab6be2

Commit 06691b3
Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>