Build hangs in com.google.devtools.build.lib.remote.RemoteExecutionCache ensureInputsPresent #16445
Comments
Hello @dpoluyanov, could you please provide minimal steps to reproduce the issue, along with a sample code repo? Thanks!
It's probably not as easy as it seems: the issue is sporadic. I'll try to reproduce it with a minimal subset of steps similar to ours, but I'm not sure I'll succeed.
Probably related to #16423.
Fixes bazelbuild#16422. Closes bazelbuild#16423. Closes bazelbuild#16445. Closes bazelbuild#16464. PiperOrigin-RevId: 480896881 Change-Id: I33019dbe8a088410280759465100a512a0f61bc1
Description of the bug:
We've discovered some builds hanging indefinitely on "Compiling Java headers" under remote execution. By sending `kill -3` to Bazel in our runner, I found that there are two skyframe-evaluator threads sitting in `blockingAwait()` in `com.google.devtools.build.lib.remote.RemoteExecutionCache.ensureInputsPresent(RemoteExecutionCache.java:115)` (I've attached a full thread dump below).
It is always "Compiling Java headers" that is stuck, and always two skyframe-evaluator threads sitting on this line (e.g. `skyframe-evaluator 357` and `skyframe-evaluator 490` in the attached thread dump).
I am still unsure whether it is caused by an infrastructure failure, by a problem with the remote executor (tested on Buildbarn), or by one of the many flags we use in our build configuration. I've tried removing those flags, and enabling and disabling the build profile, still with no luck.
I can't say for certain that it is a race condition in `RemoteExecutionCache`, but it looks like one.
We are using `6.0.0-pre.20220922.1` and, for now, cannot downgrade to test whether this behaviour is present in earlier versions.
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
No response
Which operating system are you running Bazel on?
linux
What is the output of `bazel info release`?
release 6.0.0-pre.20220922.1
If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.
No response
What's the output of `git remote get-url origin; git rev-parse master; git rev-parse HEAD`?
No response
Have you found anything relevant by searching the web?
There are no similar issues on these resources
Any other information, logs, or outputs that you want to share?
bazel-server.jvm.threaddump.txt
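For anyone who wants to check their own thread dump for the same pattern, here is a rough sketch of a script that lists the threads whose stack contains a given frame. It assumes the usual HotSpot dump layout (a quoted thread name on the header line, then indented `at ...` frames, with a blank line between threads). The function names and the embedded sample are hypothetical and only shaped after the description above; they are not copied from the attached file.

```python
import re
from typing import Dict, List

# Header line of a thread entry, e.g.: "skyframe-evaluator 357" #412 daemon ...
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')


def parse_threads(dump_text: str) -> Dict[str, List[str]]:
    """Map each thread name to its stack frames (the text after 'at ')."""
    threads: Dict[str, List[str]] = {}
    current = None
    for line in dump_text.splitlines():
        stripped = line.strip()
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            threads[current] = []
        elif current is not None and stripped.startswith("at "):
            threads[current].append(stripped[3:])
        elif not stripped:
            # A blank line ends the current thread entry.
            current = None
    return threads


def threads_stuck_in(dump_text: str, frame_marker: str) -> List[str]:
    """Return sorted names of threads whose stack mentions frame_marker."""
    return sorted(
        name
        for name, frames in parse_threads(dump_text).items()
        if any(frame_marker in frame for frame in frames)
    )


# Synthetic sample (an assumption, NOT the attached dump): only the thread
# names and the ensureInputsPresent frame come from the report above.
SAMPLE_DUMP = '''\
"skyframe-evaluator 357" #1 daemon
   java.lang.Thread.State: WAITING (parking)
    at example.Hypothetical.blockingAwait(Hypothetical.java:1)
    at com.google.devtools.build.lib.remote.RemoteExecutionCache.ensureInputsPresent(RemoteExecutionCache.java:115)

"skyframe-evaluator 490" #2 daemon
   java.lang.Thread.State: WAITING (parking)
    at example.Hypothetical.blockingAwait(Hypothetical.java:1)
    at com.google.devtools.build.lib.remote.RemoteExecutionCache.ensureInputsPresent(RemoteExecutionCache.java:115)

"some-other-thread" #3 daemon
   java.lang.Thread.State: RUNNABLE
    at example.Hypothetical.idle(Hypothetical.java:2)
'''

MARKER = "RemoteExecutionCache.ensureInputsPresent"

if __name__ == "__main__":
    for name in threads_stuck_in(SAMPLE_DUMP, MARKER):
        print(name)
```

To run it against a real dump, read the file into a string and pass it to `threads_stuck_in` in place of `SAMPLE_DUMP`.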