Maintain multiple analyses in analysis cache #14179

Open
irfansharif opened this issue Oct 27, 2021 · 12 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Configurability platforms, toolchains, cquery, select(), config transitions type: feature request

Comments


irfansharif commented Oct 27, 2021

Description of the problem / feature request:

Bazel seems to maintain only one analysis in the analysis cache. Whenever a relevant flag changes, it purges the cache in its entirety. This adds to build time even with a 100% cache hit rate for the build artifacts (see repro below). Workarounds I've seen so far suggest using a different output_base for each set of flags, which is pretty cumbersome and uses a ton of cache space. Envoy has gotten around this by sharing the same set of flags for both build and test, but that's not always possible, and it's a stopgap until Bazel preserves earlier analyses in the analysis cache for possible future reuse.

Feature requests: what underlying problem are you trying to solve with this feature?

It's a common workflow to switch between building and testing a bazel project. For us that entails using different --define and --test_env flags, which discards the analysis cache entirely. If bazel's maintaining only one copy of the analysis cache (from the last bazel invocation), we end up doing a lot of extra work to re-analyze the build despite 100% of the build artifacts being present in the remote cache. Maintaining multiple analyses in the analysis cache, each tagged with whatever compiler options they're safe to use with, would help us cut the iteration time down substantially. This also holds true for CI, where even if all the necessary artifacts are present, thrashing the analysis cache results in us doing a lot of unnecessary work.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

With https://github.com/cockroachdb/cockroach:

$ bazel build //pkg/cmd/cockroach-short:cockroach-short
INFO: Build options --define and --test_env have changed, discarding analysis cache.
INFO: Analyzed target //pkg/cmd/cockroach-short:cockroach-short (0 packages loaded, 18793 targets configured).
INFO: Found 1 target...
Target //pkg/cmd/cockroach-short:cockroach-short up-to-date:
  _bazel/bin/pkg/cmd/cockroach-short/cockroach-short_/cockroach-short
INFO: Elapsed time: 25.610s, Critical Path: 20.89s
INFO: 1995 processes: 1994 disk cache hit, 1 internal.
INFO: Build completed successfully, 1995 total actions
Successfully built binary for target //pkg/cmd/cockroach-short:cockroach-short at cockroach-short

$ bazel test //pkg/spanconfig/spanconfigkvsubscriber:spanconfigkvsubscriber_test --test_filter=TestDataDriven/basic --test_output errors
INFO: Build options --define and --test_env have changed, discarding analysis cache.
INFO: Analyzed target //pkg/spanconfig/spanconfigkvsubscriber:spanconfigkvsubscriber_test (0 packages loaded, 17487 targets configured).
INFO: Found 1 test target...
Target //pkg/spanconfig/spanconfigkvsubscriber:spanconfigkvsubscriber_test up-to-date:
  _bazel/bin/pkg/spanconfig/spanconfigkvsubscriber/spanconfigkvsubscriber_test_/spanconfigkvsubscriber_test
INFO: Elapsed time: 26.116s, Critical Path: 24.73s
INFO: 1995 processes: 1994 disk cache hit, 1 internal.
INFO: Build completed successfully, 1995 total actions
//pkg/spanconfig/spanconfigkvsubscriber:spanconfigkvsubscriber_test (cached) PASSED in 1.8s

Executed 0 out of 1 test: 1 test passes.
INFO: Build completed successfully, 1995 total actions

$ bazel build //pkg/cmd/cockroach-short:cockroach-short
WARNING: Ignoring JAVA_HOME, because it must point to a JDK, not a JRE.
INFO: Invocation ID: 7822bf1c-b690-4138-bdf9-45a4b1eed074
INFO: Build options --define and --test_env have changed, discarding analysis cache.
INFO: Analyzed target //pkg/cmd/cockroach-short:cockroach-short (0 packages loaded, 18793 targets configured).
INFO: Found 1 target...
Target //pkg/cmd/cockroach-short:cockroach-short up-to-date:
  _bazel/bin/pkg/cmd/cockroach-short/cockroach-short_/cockroach-short
INFO: Elapsed time: 25.610s, Critical Path: 20.89s
INFO: 1995 processes: 1994 disk cache hit, 1 internal.
INFO: Build completed successfully, 1995 total actions
Successfully built binary for target //pkg/cmd/cockroach-short:cockroach-short at cockroach-short

Observe that when switching between bazel build and bazel test, despite having a 100% disk cache hit rate, it takes ~25s for the execution to complete because there is no earlier run's analysis cache data to consult. Compare that to when it is available:

$ bazel build //pkg/cmd/cockroach-short:cockroach-short --profile=build.gz
INFO: Analyzed target //pkg/cmd/cockroach-short:cockroach-short (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //pkg/cmd/cockroach-short:cockroach-short up-to-date:
  _bazel/bin/pkg/cmd/cockroach-short/cockroach-short_/cockroach-short
INFO: Elapsed time: 1.259s, Critical Path: 0.83s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
Successfully built binary for target //pkg/cmd/cockroach-short:cockroach-short at cockroach-short

What operating system are you running Bazel on?

macOS Big Sur.

What's the output of bazel info release?

release 6.0.0-pre.20211019.1, though the same problem exists with 4.0.0 onwards.

Have you found anything relevant by searching the web?

Some other threads on the mailing list: https://groups.google.com/g/bazel-discuss/c/EdBvMEPrH5A and https://groups.google.com/g/bazel-discuss/c/vgn9uyyIrIM/m/4trNkRS0AQAJ.
GitHub: #11194, #12113 (comment).
Stack Overflow: https://stackoverflow.com/questions/53012722/why-does-bazel-do-a-full-rebuild-whenever-switching-between-intellij-and-command

Contributor

moroten commented Oct 27, 2021

@irfansharif Unfortunately, I don't have any code left from my experiments of not deleting the analysis cache. I threw it away as it didn't work.

@oquenchil oquenchil added team-Local-Exec Issues and PRs for the Execution (Local) team type: feature request untriaged labels Nov 2, 2021
@meisterT meisterT added team-Configurability platforms, toolchains, cquery, select(), config transitions and removed team-Local-Exec Issues and PRs for the Execution (Local) team labels Nov 9, 2021
@gregestren
Contributor

I want to signal boost https://groups.google.com/g/bazel-discuss/c/vgn9uyyIrIM/m/4trNkRS0AQAJ, @moroten's discussion listed above.

The biggest challenge is addressing correctness concerns.

Partial approaches could be viable, with trim_test_configuration as precedent.

@gregestren gregestren added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed untriaged labels Nov 16, 2021
@github-actions

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label May 25, 2023

github-actions bot commented Jun 8, 2023

This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please reach out to the triage team (@bazelbuild/triage). Thanks!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 8, 2023
Contributor

matts1 commented Jun 8, 2023

FYI, a decent workaround is to use the --output_base flag. It's not ideal, but if you're switching between configs, you can also switch between output bases and have a whole separate cache.

Contributor

pauldraper commented Oct 24, 2024

To clarify @matts1's workaround: changing --output_base starts a new instance of Bazel. So you have one Bazel server for each analysis configuration you want to keep.
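
A minimal sketch of that workaround, assuming a hypothetical wrapper function `bazel_with_base` and a hypothetical directory layout under `~/.cache/bazel-bases` (neither comes from this thread). `--output_base` is a Bazel startup option, so it has to appear before the command name. Setting `DRY_RUN` only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical location for the per-configuration output bases (an assumption).
BAZEL_BASES_DIR="${BAZEL_BASES_DIR:-$HOME/.cache/bazel-bases}"

# bazel_with_base NAME COMMAND [ARGS...]
# Runs bazel with a dedicated output base for NAME, so each named flag set
# gets its own Bazel server and therefore its own analysis cache.
bazel_with_base() {
  cfg_name="$1"
  shift
  # Startup options such as --output_base must precede the bazel command.
  set -- bazel "--output_base=${BAZEL_BASES_DIR}/${cfg_name}" "$@"
  if [ -n "${DRY_RUN:-}" ]; then
    # Print the command line instead of executing it.
    echo "$*"
  else
    "$@"
  fi
}
```

With this in place, `bazel_with_base build build //pkg/cmd/cockroach-short:cockroach-short` and `bazel_with_base test test //pkg/spanconfig/spanconfigkvsubscriber:spanconfigkvsubscriber_test` would each talk to their own server. The trade-off noted earlier in the thread still applies: every output base holds its own copy of the outputs and keeps its own server running.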

Contributor

matts1 commented Oct 24, 2024

That's correct, yeah

@gregestren
Contributor

We can see two improvements to this:

  • Depending on which flags you change, we think we can easily avoid redoing the exec-configured part of the cache, which can be half the cache for many builds. --define would probably fit this. I think --test_env too although Blaze has some special injection logic for that. @katre and @susinmotion are looking at that.

  • @jin is working on cross-build analysis caching that would make swapping in/out the analysis cache much faster, even if invalidation still occurs.

Contributor

tpudlik commented Dec 2, 2024

Could we reopen this issue, since it remains relevant and is being actively discussed?

@brentleyjones brentleyjones reopened this Dec 2, 2024
@github-actions github-actions bot removed the stale Issues or PRs that are stale (no activity for 30 days) label Dec 3, 2024
Member

jin commented Dec 4, 2024

> @jin is working on cross-build analysis caching that would make swapping in/out the analysis cache much faster, even if invalidation still occurs.

Yes: https://www.youtube.com/watch?v=op4gIYxucjE is a BazelCon 2024 talk about cross-build analysis caching.

We are still working on the internal reference implementation of this, and will prioritize Bazel support after that.

@gregestren
Contributor

Since this issue is specifically about --define and --test_env, two more things:

@gregestren
Contributor

Just to be clear, who still cares about this besides @pauldraper, @tpudlik, and @matts1? These comments span a few years.


11 participants