Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(oci_python): demonstrate fine grained layering #256

Merged
merged 2 commits into from
Sep 26, 2023

Conversation

thesayyn
Copy link
Member

@thesayyn thesayyn commented Aug 3, 2023

Type of change

  • New feature or functionality (change which adds functionality)

Test plan

  • Covered by existing test cases

@thesayyn thesayyn requested review from alexeagle and jbedard August 3, 2023 20:38
@alexeagle alexeagle changed the title feat: demonstrate fine grained layering feat(oci_python): demonstrate fine grained layering Aug 3, 2023
oci_python_image/py_image_layer.bzl Outdated Show resolved Hide resolved
oci_python_image/py_image_layer.bzl Outdated Show resolved Hide resolved
oci_python_image/py_image_layer.bzl Outdated Show resolved Hide resolved
@wskinner-dd
Copy link

Thanks for this example. I'm working on implementing something like this, and it has been a helpful inspiration. I noticed something in my own implementation, which I then realized is also present here: A change to any runfile of the py_binary will invalidate all layers and cause them to be rebuilt. This defeats much of the purpose of multi-layer images because those layer builds can be very slow.

Do you have any suggestions for how to fix this issue? I am new to Bazel and don't have a great idea of how to filter the runfiles in a way that avoids this total cache invalidation.

(venv) ddlabs@will-skinner:~/Projects/bazel-examples/oci_python_image$ bazel build //hello_world:hello_world_layer --explain=explain.txt --verbose_explanations && cat explain.txt 
INFO: Analyzed target //hello_world:hello_world_layer (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
INFO: Writing explanation of rebuilds to 'explain.txt'
Target //hello_world:hello_world_layer up-to-date:
  bazel-bin/hello_world/hello_world_layer/app.tar
  bazel-bin/hello_world/hello_world_layer/python.tar
INFO: Elapsed time: 0.184s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
Build options: --enable_bzlmod --explain=explain.txt --verbose_explanations
Executing action 'BazelWorkspaceStatusAction stable-status.txt': unconditional execution is requested.

Here, I changed __main__.py.

INFO: Analyzed target //hello_world:hello_world_layer (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
INFO: Writing explanation of rebuilds to 'explain.txt'
Target //hello_world:hello_world_layer up-to-date:
  bazel-bin/hello_world/hello_world_layer/app.tar
  bazel-bin/hello_world/hello_world_layer/python.tar
INFO: Elapsed time: 0.533s, Critical Path: 0.48s
INFO: 3 processes: 1 internal, 2 linux-sandbox.
INFO: Build completed successfully, 3 total actions
Build options: --enable_bzlmod --explain=explain.txt --verbose_explanations
Executing action 'BazelWorkspaceStatusAction stable-status.txt': unconditional execution is requested.
Executing action 'Writing: bazel-out/k8-fastbuild/bin/hello_world/hello_world_layer/app.tar': One of the files has changed.
Executing action 'Writing: bazel-out/k8-fastbuild/bin/hello_world/hello_world_layer/python.tar': One of the files has changed.

I verified that only the app layer changed by running the following after each build.

(venv) ddlabs@will-skinner:~/Projects/bazel-examples/oci_python_image$ find bazel-out/k8-fastbuild/bin/hello_world/hello_world_layer -type f -exec shasum {} \;

@wskinner-dd
Copy link

wskinner-dd commented Sep 16, 2023

Following up on my earlier comment - I found a solution. I don't fully understand why it works and would love for someone who understands Bazel better to explain it. My idea was that there must be some state being propagated through runfiles every time any file changes. So I added a new rule as a layer in between runfiles and pkg_tar.

FilesProvider = provider(
    fields = {
        "dest_src_map": "",
        "symlinks": "",
        "label": "",
    },
)

I modified _runfiles_impl to return that instead of PackageFilegroupInfo:

    return [
        FilesProvider(
            dest_src_map = file_map,
            label = ctx.label,
            symlinks = symlinks,
        ),
        DefaultInfo(
            files = files,
        ),
    ]

Define the new make_package_info rule:

def _make_package_info_impl(ctx):
    default = ctx.attr.runfiles[DefaultInfo]
    fp = ctx.attr.runfiles[FilesProvider]

    return [
        PackageFilegroupInfo(
            pkg_dirs = [],
            pkg_files = [
                [PackageFilesInfo(
                    dest_src_map = fp.dest_src_map,
                    attributes = {},
                ), fp.label],
            ],
            pkg_symlinks = fp.symlinks,
        ),
        DefaultInfo(files = default.files),
    ]

make_package_info = rule(
    implementation = _make_package_info_impl,
    attrs = {
        "runfiles": attr.label(
            mandatory = True,
            providers = [FilesProvider],
        ),
    },
)

And use it like this:

    runfiles(
        name = "%s/solib/runfiles" % name,
        include = "**/_solib_local/**/*",
        exclude = "**/*libtorch*/*",
        **runfiles_kwargs
    )

    make_package_info(
        name = "%s/solib/package_info" % name,
        runfiles = "%s/solib/runfiles" % name,
    )

    pkg_tar(
        name = "%s/solib" % name,
        srcs = ["%s/solib/package_info" % name],
        **pkg_tar_kwargs
    )

@thesayyn thesayyn force-pushed the create_separate_layer branch from f7edfad to 86b9bdb Compare September 25, 2023 21:21
@thesayyn
Copy link
Member Author

@wskinner-dd see the latest commit for fine-grained layering.

@thesayyn thesayyn force-pushed the create_separate_layer branch from 6dcf804 to 16b2ef1 Compare September 25, 2023 23:41
oci_python_image/MODULE.bazel Outdated Show resolved Hide resolved
@thesayyn thesayyn merged commit 474217a into main Sep 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants