Bazel sometimes execs file downloaded from remote cache before executable bits have been set #12137

Closed
Tracked by #12665
pcjanzen opened this issue Sep 18, 2020 · 2 comments
Labels: P2, team-Remote-Exec, type: bug

Comments

@pcjanzen
Contributor

Description of the problem / feature request:

With remote execution or remote caching and --remote_download_minimal enabled, Bazel occasionally attempts to execute a binary that it has just downloaded from the remote to the local machine before it has set the file's executable bits.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I haven't been able to generate a simplified repeatable example because it is sensitive to other things going on in the same build. But my in-house test case looks like:

genrule(
    name = "some_data",
    outs = ["some_data.txt"],
    cmd = ":> $@",
)

cc_test(
    name = "hello_world",
    srcs = ["hello_world.c"],
    data = [":some_data"],
)

sh_test(
    name = "chmod_bug",
    srcs = ["chmod_bug.sh"],
    data = [":hello_world"],
    tags = ["no-remote-exec"],
    args = ["$(location :hello_world)"],
)

That is, a binary is built and cached by remote execution, and another rule tagged "no-remote-exec" requires that binary to be downloaded to the local host. With --remote_download_minimal, the binary is deleted after the local execution occurs, so the download/chmod/exec cycle happens every time the binary is required.

The shell script then just does:

stat -L "$1"
date --rfc-3339=ns -u
"$1" || (sleep 1; stat -L "$1"; exit 1)

which occasionally leads to:

$ bazel test --remote_download_toplevel -t- //pcj:chmod_bug
==================== Test output for //pcj:chmod_bug:
  File: pcj/hello_world
  Size: 8600      	Blocks: 24         IO Block: 4096   regular file
Device: 802h/2050d	Inode: 25690430    Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1007/pauljanzen)   Gid: ( 1007/pauljanzen)
Access: 2020-09-18 22:24:05.989466229 +0000
Modify: 2020-09-18 22:24:05.989466229 +0000
Change: 2020-09-18 22:24:05.989466229 +0000
 Birth: -
2020-09-18 22:24:06.026526515+00:00
line 5: pcj/hello_world: Permission denied
  File: pcj/hello_world
  Size: 8600      	Blocks: 24         IO Block: 4096   regular file
Device: 802h/2050d	Inode: 25690430    Links: 1
Access: (0755/-rwxr-xr-x)  Uid: ( 1007/pauljanzen)   Gid: ( 1007/pauljanzen)
Access: 2020-09-18 22:24:05.989466229 +0000
Modify: 2020-09-18 22:24:05.989466229 +0000
Change: 2020-09-18 22:24:06.093465133 +0000
 Birth: -
================================================================================
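In the output above, the first stat reports mode 0644, the exec attempt at 22:24:06.026 fails, and the second stat shows mode 0755 with a Change time of 22:24:06.093, which is after the failed exec. The ordering can be illustrated outside Bazel with a minimal sketch. It assumes a hypothetical stand-in script in a temporary directory and does not reproduce Bazel's download code or the race itself; it only shows that an exec issued between the write and the chmod fails the same way.

```shell
#!/bin/sh
# Hypothetical stand-in for the downloaded binary; names and paths are
# illustrative and are not taken from Bazel.
workdir="$(mktemp -d)"
bin="$workdir/hello_world"

# Step 1: the "download" writes the file contents with a non-executable mode.
printf '#!/bin/sh\necho hello\n' > "$bin"
chmod 0644 "$bin"

# Step 2 (too early): executing before the chmod fails with "Permission denied".
if "$bin" 2>/dev/null; then early="ok"; else early="denied"; fi

# Step 3: the executable bits are set, after which the same exec succeeds.
chmod 0755 "$bin"
if out="$("$bin")"; then late="ok"; else late="denied"; fi

echo "before chmod: $early"
echo "after chmod: $late ($out)"
rm -rf "$workdir"
```

On a typical Linux or macOS host where the temporary directory is not mounted noexec, the first attempt should be reported as denied and the second as ok.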

I have to build some other moderately-sized, unrelated target at the same time in order to trigger the bug. It seems like it always succeeds the first time after starting the Bazel server or after any change that causes the analysis cache to be discarded, but will reliably fail within two or three iterations after that.
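Because the failure only shows up after a few iterations, a small loop helps when trying to reproduce it. The helper below is a sketch; `run_until_failure` is a hypothetical name, and the command it wraps is the invocation shown earlier in this report. Building the unrelated target at the same time still has to be arranged separately.

```shell
#!/bin/sh
# run_until_failure MAX CMD [ARGS...]
# Runs CMD up to MAX times and stops at the first non-zero exit status.
run_until_failure() {
  max="$1"
  shift
  i=1
  while [ "$i" -le "$max" ]; do
    if ! "$@"; then
      echo "failed on iteration $i"
      return 1
    fi
    i=$((i + 1))
  done
  echo "no failure in $max iterations"
}

# Example usage (requires the workspace from this report):
# run_until_failure 10 bazel test --remote_download_toplevel -t- //pcj:chmod_bug
```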

What operating system are you running Bazel on?

Ubuntu 18.04

What's the output of bazel info release?

release 3.4.1 (but have also verified that the problem is still present in release-3.6.0rc2 aa0d97c)

Have you found anything relevant by searching the web?

No.

@oquenchil added the team-Remote-Exec, type: bug, and untriaged labels on Oct 5, 2020
@pcjanzen
Contributor Author

We're also seeing this on macOS hosts.

@coeuvre added the P2 label and removed the untriaged label on Dec 9, 2020
@coeuvre self-assigned this on Dec 9, 2020
@coeuvre
Member

coeuvre commented Nov 3, 2022

I believe this is fixed by recent developments on BwoB (Build without the Bytes). Closing. Feel free to reopen if it still doesn't work for you.
