Bazel throw io exception on outputs download connection reset #20868
Comments
I just realized that |
Yeah I wonder if there is something about
that is persistent between retries. The output that was being downloaded was the
It seems like this will be a hard-to-reproduce case.
@coeuvre What do you think about patching it so that we don't dump the entire Java stack trace on the end user, but instead show a clear error message explaining what the issue is and how to investigate and fix it?
I don't think Bazel even attempts to download zero-byte files. code.
Sounds good. PR is welcome! |
@coeuvre we have a user who is facing this class of issue again. The stack trace looks like this
The client-side remote grpc log shows that requests were retried multiple times, but they all failed with the same error:
Note that these are multiple retries of the same request, trying to download the same object from our cache. Luckily, this invocation ran for several minutes. We double-checked, and our cache backend was handling plenty of other invocations during the same period without seeing a similar issue. That narrows the root cause down to the client side.
@coeuvre do you have a suggestion on how we could experiment to verify whether that's the case? The issue does not happen consistently, so it has been hard for us to investigate.
Thanks for the analysis!
It sounds plausible because the connection is reference counted, which implies we might reuse the same
I would install an interceptor in the
I will be OOO for the next two weeks and can work on a patch afterwards. Or, if it's urgent (or to catch the 7.3.0 release), please send us a PR.
Such connections are usually not in a recoverable state and should not be used for retries, which would otherwise likely fail in the same way. Fixes bazelbuild#20868 Closes bazelbuild#23150. PiperOrigin-RevId: 662091153 Change-Id: Iaf160b11a13af013b9969c7fdaa966bca8ab6be2
…ors (#23343) Such connections are usually not in a recoverable state and should not be used for retries, which would otherwise likely fail in the same way. Fixes #20868 Closes #23150. PiperOrigin-RevId: 662091153 Change-Id: Iaf160b11a13af013b9969c7fdaa966bca8ab6be2 Commit 06691b3 Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>
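To illustrate the reasoning in the commit message above, here is a minimal toy model in Python. It is not Bazel's code and does not use gRPC; `FakeConnection`, `ConnectionPool`, and `download_with_retry` are hypothetical names made up for this sketch. It only shows why retrying on a shared, already-reset connection keeps failing the same way, while discarding the connection before the retry lets the next attempt succeed.

```python
import itertools


class FakeConnection:
    """Hypothetical stand-in for a pooled connection.

    `broken=True` simulates a socket that has already been reset.
    """

    _ids = itertools.count(1)

    def __init__(self, broken=False):
        self.id = next(FakeConnection._ids)
        self.broken = broken

    def download(self, digest):
        if self.broken:
            raise ConnectionResetError("connection reset by peer")
        return f"blob:{digest}"


class ConnectionPool:
    """Hands out one shared connection until it is discarded."""

    def __init__(self, first_connection):
        self._current = first_connection

    def acquire(self):
        if self._current is None:
            # Open a fresh (healthy, in this toy model) connection.
            self._current = FakeConnection()
        return self._current

    def discard(self, conn):
        # Drop the connection so the next acquire() opens a new one.
        if self._current is conn:
            self._current = None


def download_with_retry(pool, digest, attempts=3, discard_on_error=True):
    """Retry a download, optionally discarding the failed connection."""
    last_error = None
    for _ in range(attempts):
        conn = pool.acquire()
        try:
            return conn.download(digest)
        except ConnectionResetError as err:
            last_error = err
            if discard_on_error:
                pool.discard(conn)
    raise last_error
```

With `discard_on_error=False`, every attempt reuses the same broken connection and the call fails after all retries, which matches the symptom reported in this thread. With the default, the second attempt runs on a new connection.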
A fix for this issue has been included in Bazel 7.4.0 RC1. Please test out the release candidate and report any issues as soon as possible. |
Description of the bug:
See #18337 (comment) for context
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
On 6.3.2
Our customers also reported a similar issue on 6.4.0, although no stack trace is available because --verbose_failures was not set.
Which operating system are you running Bazel on?
Linux
What is the output of bazel info release?
N/A
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
N/A
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
No response
Have you found anything relevant by searching the web?
See #18337 (comment)
Any other information, logs, or outputs that you want to share?
cc: @coeuvre