bazel_bootstrap_distfile_test failing in postsubmit on Windows #12578
Comments
It looks like before #14743, the job succeeded after retrying, but not after that.
The test is only failing in postsubmit, not in presubmit or the downstream pipeline, and I cannot reproduce it locally or on a Windows VM. The error message complains that //:default_host_platform doesn't exist, although it should be defined in ./BUILD. Let's print out the content of ./BUILD to find out what's happening. Related: #12578 RELNOTES: None PiperOrigin-RevId: 344809001
I printed out the content of ./BUILD, and it had somehow been overwritten with the content of https://github.com/bazelbuild/bazel/blob/master/tools/jdk/BUILD.java_tools. This is super weird. I suspect it is caused by another test running in parallel, which could also explain why it's not failing in presubmit (because of sharding).
Related: #12578 RELNOTES: None PiperOrigin-RevId: 345191170
After hours of debugging, I finally figured this out! The culprit is d10013d, which introduced a genrule:
And this is what's happening in the postsubmit pipeline on CI:
This isn't happening on Linux or macOS because of the sandbox: there, the genrule cannot overwrite any file in the execroot. To reproduce, simply run the following on Windows:
Lesson learned: we have to use genrule very carefully on Windows, because without a sandbox the entire main repo source tree under the execroot is exposed to the genrule command. /cc @comius
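To make the lesson concrete, here is a minimal sketch of the general failure mode. It is not the genrule from d10013d; the target names and the file BUILD.example are hypothetical and chosen only for illustration. The assumption is that the command runs with the execroot as its working directory, so a relative path that is not a declared output can land on top of a source file when no sandbox is in place.

```starlark
# Hypothetical BUILD fragment for illustration only. The target and file
# names are made up; this is not the genrule introduced in d10013d.

# Risky without a sandbox: the command writes to a path relative to the
# working directory (the execroot) instead of a declared output, so it can
# overwrite a source file such as ./BUILD.
genrule(
    name = "copy_build_file_unsafe",
    srcs = ["BUILD.example"],
    outs = ["unsafe.marker"],
    cmd = "cp $(location BUILD.example) BUILD && touch $@",
)

# Safer: write only to the declared output, which $@ expands to.
genrule(
    name = "copy_build_file",
    srcs = ["BUILD.example"],
    outs = ["BUILD.generated"],
    cmd = "cp $(location BUILD.example) $@",
)
```

Restricting the command to `$@` (or `$(OUTS)` when there are several outputs) keeps the action's writes inside its declared outputs, so it behaves the same with or without a sandbox.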
OMG, Yun! Thank you so much for all the debugging and figuring this out. 🤯 Of course, now that I read your explanation, it all makes sense. Just from looking at the
Thanks Yun for your effort. I didn't know I don't have a sandbox. Let me fix this issue properly, without putting some more perfume over a pig.
Thank you, glad I can help! 😉
Related #12578 RELNOTES: None PiperOrigin-RevId: 346080243
The test //src/test/shell/bazel:bazel_bootstrap_distfile_test has started to fail in our postsubmit pipeline on Windows only. Here's a list of broken jobs:
I have not seen it happen before job 14736, so this might be the culprit. 🤔
I'm not aware of any CI infrastructure or Windows image changes in the last few days.
Here's an example log: https://storage.googleapis.com/bazel-untrusted-buildkite-artifacts/745cad19-516a-479b-8084-9154bf14c893/src%5Ctest%5Cshell%5Cbazel%5Cbazel_bootstrap_distfile_test%5Cattempt_1.log
The relevant error message seems to be:
When the test fails during a job, it fails consistently in all attempts, always with the same error message. It is thus not a typical flaky test that fails at random.