Could the default mount cache id include target architecture? #2598

couling · 2022-02-04T15:41:30Z

Problem

There's a "gotcha" when working with cache mounts and building for multiple architectures. The stated use case for cache mounts is:

This mount type allows the build container to cache directories for compilers and package managers.

However, both compilers and package managers are often architecture dependent. The default id for the cache is just the value of target, so the cache is, by default, shared between architectures. This can be damaging.

At best, the cache gets flushed and is useless every build with a different architecture.

At worst, the code using the cache can't detect the incorrect architecture and gets confused by the content.

At worst worst, the code using the cache can't detect the incorrect architecture and builds a corrupt image as a result.

It's fine to expect programs using the cache to detect the cache is stale. But it's extremely uncommon for such programs to detect the wrong architecture's cache has been swapped in.

Example Error

If I have a Dockerfile:

FROM alpine:latest AS base
RUN --mount=type=cache,sharing=locked,target=/var/cache/apk \
    apk add python3 py3-pip py3-wheel

And then I build twice (with qemu installed):

docker build -t my_image:latest_arm64 --platform linux/arm64 .
docker build -t my_image:latest_x86_64 --platform linux/x86_64 .

I'll end up with errors caused by the cache:

WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.15/main: UNTRUSTED signature
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.15/community: UNTRUSTED signature
ERROR: unable to select packages:
py3-pip (no such package):
   required by: world[py3-pip]
py3-wheel (no such package):
   required by: world[py3-wheel]
python3 (no such package):
   required by: world[python3]
ERROR: executor failed running [/bin/sh -c apk add python3 py3-pip py3-wheel]: exit code: 3

Workaround

As a workaround you can include ${TARGETARCH} in the id. For example:

FROM alpine:latest AS base
ARG TARGETARCH
RUN --mount=type=cache,sharing=locked,id=${TARGETARCH}/var/cache/apk,target=/var/cache/apk \
    apk add python3 py3-pip py3-wheel

Since TARGETARCH is set by default, the workaround only needs a change to the Dockerfile; the build commands above then work unchanged.
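A variation on the workaround, offered as a sketch rather than a tested recipe: on 32-bit Arm the architecture alone does not distinguish, say, arm/v6 from arm/v7, so the id below also appends TARGETVARIANT. The apk- prefix is an arbitrary name chosen for illustration. TARGETARCH and TARGETVARIANT are automatic platform args, and each has to be declared with ARG inside the stage before it is visible there.

```dockerfile
FROM alpine:latest AS base
# Automatic platform args are only visible in a stage after being declared.
ARG TARGETARCH
ARG TARGETVARIANT
# "apk-" is an illustrative prefix; the resulting id is e.g. apk-arm64 or apk-armv7.
RUN --mount=type=cache,sharing=locked,id=apk-${TARGETARCH}${TARGETVARIANT},target=/var/cache/apk \
    apk add python3 py3-pip py3-wheel
```

TARGETVARIANT is empty for most platforms, so on those the id reduces to the architecture-only form.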

Desired enhancement

Ideally this "workaround" should be the default behaviour: include the value from TARGETARCH in the default id. If developers want to share a cache between multiple architectures, the current behaviour would still be available by setting an id manually. But it means that by default the cache would "just work".

The worst case of not knowing about this behaviour would be slower builds and increased network usage on multi-arch builds for platform independent caches (java, javascript...).

tonistiigi · 2022-02-04T20:43:11Z

I think the behavior depends on the use case. In a lot of cases the same cache is desired for all platforms, e.g. when downloading package source code that does not contain binaries, it is usually identical. Also, for a general build cache, tools for other languages such as Go understand the cases when the cache is specific to a platform. If your tool does not, then I think the ability to separate it via id is a good approach.

Regarding apk, I don't think this is how you would do it, and none of your packages are cached with this method.

I would do:

FROM alpine:latest AS base
RUN --mount=type=cache,sharing=locked,target=/etc/apk/cache \
   ls -l /etc/apk/cache && apk add --no-cache python3 py3-pip py3-wheel && ls -l /etc/apk/cache

That actually caches the packages that have been installed before and doesn't seem to require TARGETARCH in the id either. The ls calls are just for debugging, so you can see what is in the cache before and after.

couling · 2022-02-04T21:05:53Z

As I say, this is really about the safety of the defaults. I realise there are two use cases:

platform independent code - worst case poorer caching
platform dependent code - worst case failed builds or corrupted images

My reason for raising this request is that on balance I prefer default safety over default performance.

Alternatively a note about this in the documentation wouldn't go amiss. It took me an unfortunate amount of time to figure out what was going wrong.

Ultimately it's your call so I won't labor the point.

"Regarding apk, I don't think this is how you would do it, and none of your packages are cached with this method."

The example I give is an SSCCE of what can go wrong. It's not a suggested way to cache PIP packages. It caches the package index and saves some performance loss from --no-cache. Its use case is a little bit lost in the given example.

tonistiigi · 2022-02-05T02:37:15Z

"platform dependent code - worst case failed builds or corrupted images"

A failed build isn't necessarily the worst case in the dev phase; it is a hint to the user that they forgot to set id. Not realizing that your build is inefficient, although you think you did everything correctly, might hurt more in the long run. In a lot of cases TARGETARCH is even completely wrong, e.g. all our internal Dockerfiles cross-compile, where separating the cache by target doesn't make any sense.
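To illustrate the cross-compiling case, here is a minimal sketch, assuming a Go module with a main package at the root of the build context. The image tag, paths, and output name are placeholders and are not taken from any Dockerfile mentioned in this thread. The compiler always runs on the build platform, so a single cache mount serves every target, and adding TARGETARCH to its id would only split a cache that is safe to share.

```dockerfile
# Hypothetical project layout: go.mod and main.go in the build context root.
FROM --platform=$BUILDPLATFORM golang:1.22-alpine AS build
WORKDIR /src
COPY . .
ARG TARGETOS
ARG TARGETARCH
# The Go build cache keys its entries by build inputs, including GOOS/GOARCH,
# so one shared cache mount is used for all target platforms.
RUN --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /out/app .

# The final stage is built per target platform and only copies the binary.
FROM alpine:latest
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```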

"Alternatively a note about this in the documentation wouldn't go amiss."

PR welcome.

"It caches the package index and saves some performance loss from --no-cache"

IIUC, it caches only the index, meaning that if you change the command it will still download all packages again, but they will always be the old versions. And I guess if the index gets old, it will just fail to download packages because they don't exist anymore? A more useful pattern is to cache the packages (it's a bit confusing that I use --no-cache, but it still does that), so if the command changes you always get the latest packages, but packages that were already downloaded once are not downloaded again.

ciaranmcnulty · 2022-05-02T20:07:35Z

Just as a side-note to this, I had a similar problem and tried to resolve it with id=apk-${TARGETPLATFORM} but it didn't look like it was being expanded - would allowing arg usage here help with similar issues?
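One possible explanation, which is an assumption and not something confirmed in this thread: automatic platform args such as TARGETPLATFORM are only defined inside a stage after an ARG line declares them there. The workaround in the original post declares ARG TARGETARCH before using it in the id. A sketch of the same pattern with TARGETPLATFORM, where the apk- prefix is just an illustrative name:

```dockerfile
FROM alpine:latest AS base
# Assumption: without this declaration, ${TARGETPLATFORM} is not set in this stage.
ARG TARGETPLATFORM
# TARGETPLATFORM looks like linux/arm64, so the id becomes e.g. apk-linux/arm64.
RUN --mount=type=cache,sharing=locked,id=apk-${TARGETPLATFORM},target=/etc/apk/cache \
    apk add --no-cache python3 py3-pip py3-wheel
```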

potiuk mentioned this issue Oct 23, 2023

feat(dockerfile): Add pip caching for faster build apache/airflow#35026

Merged

clemlesne added a commit to clemlesne/blue-agent that referenced this issue Oct 25, 2024

perf: Add shared build cache

608d7ca

See: moby/buildkit#2598
