Add the experimental_use_remote_cache_for_cache_unaware_spawns flag #18944

jmmv · 2023-07-14T17:45:16Z

This new flag is similar in spirit to --remote_accept_cached but allows being more selective about what's accepted and what's not.

The specific problem I face is the following: we have a setup where we want to use dynamic execution for performance reasons. However, we know some actions in our build (those run by rules_foreign_cc) are not deterministic. To mitigate this, we force those known non-deterministic actions to run remotely, without dynamic execution. As long as their outputs stay cached, the non-determinism is not exposed, which buys us time until we can fix the underlying problems.

However, we still observe non-deterministic actions in the build and we need to diagnose what those are. To do this, I need to run two builds and compare their execlogs. And I need these builds to continue to reuse the non-deterministic artifacts we already know about from the cache, but to rerun other local actions from scratch.
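The execlog comparison described above could be sketched roughly as follows. This is only an illustration, not part of this PR. It assumes both builds wrote their logs with `--execution_log_json_file`, that each log is a stream of concatenated JSON objects, and that each entry carries `actualOutputs` (with `path` and `digest.hash`), `mnemonic`, and `runner` fields. Those field names are assumptions about the JSON rendering of the spawn log and may need adjusting; the function names are hypothetical.

```python
import json


def parse_execlog(text):
    """Parse a stream of concatenated JSON objects into a list of dicts."""
    decoder = json.JSONDecoder()
    entries, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        entries.append(obj)
    return entries


def output_digests(entries):
    """Map each output path to (digest hash, mnemonic, runner).

    The field names used here are assumed, not confirmed by the PR.
    """
    digests = {}
    for entry in entries:
        for out in entry.get("actualOutputs", []):
            digest = out.get("digest", {}).get("hash")
            digests[out["path"]] = (digest, entry.get("mnemonic"), entry.get("runner"))
    return digests


def diff_execlogs(text_a, text_b):
    """Return (path, mnemonic, hash_a, hash_b) for outputs whose digests differ."""
    a = output_digests(parse_execlog(text_a))
    b = output_digests(parse_execlog(text_b))
    return sorted(
        (path, a[path][1], a[path][0], b[path][0])
        for path in a.keys() & b.keys()
        if a[path][0] != b[path][0]
    )
```

To use it, read the two log files into strings and pass them to `diff_execlogs`. Any path it reports was produced with different contents in the two builds, which makes the owning action a candidate for non-determinism.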

Unfortunately, the fact that "remote-cache" is not a strategy (see #18245) makes this very difficult to do because, even if I configure certain actions to run locally unconditionally, the spawn strategy insists on checking the remote cache for them.

With this new flag, I can run a build where the remote actions remain remote but where I disable the dynamic scheduler and force the remaining actions to re-run locally. I'm marking the flag as experimental because this feels like a huge kludge to paper over the fact that the remote cache should really be a strategy, but isn't. In other words: this flag should go away with a better rearchitecting of the remote caching interface.

Upstream PR: bazelbuild#18944
Author: Julio Merino <julio.merino+oss@snowflake.com>
Date: Fri Jul 14 10:32:41 2023 -0700

jmmv force-pushed the allow-remote-cache branch 3 times, most recently from 5cb25b0 to 678f97c Compare July 14, 2023 18:00

jmmv force-pushed the allow-remote-cache branch from 678f97c to d4a461b Compare July 14, 2023 19:07

jmmv marked this pull request as ready for review July 14, 2023 21:43

jmmv requested a review from a team as a code owner July 14, 2023 21:43

github-actions bot added the awaiting-review, team-Local-Exec, and team-Remote-Exec labels Jul 14, 2023
