'debug' and 'debug-arm64' versions of distroless/base images fail on arm64 #657

Closed
docularxu opened this issue Dec 18, 2020 · 8 comments

@docularxu

docularxu commented Dec 18, 2020

The 'debug' images fail to start on arm64 machines. This affects not only the distroless/base 'debug' images but also the distroless/cc 'debug' images. All print the same error:

$ sudo docker run gcr.io/distroless/base:debug -c "echo hello"
standard_init_linux.go:211: exec user process caused "exec format error"

Because the image is multi-arch, the same command succeeds on amd64 machines. Expected behavior (as on amd64):
$ sudo docker run gcr.io/distroless/base:debug -c "echo hello"
hello

@docularxu
Author

This error also happens with the 'debug' images of distroless/base-debian10 and distroless/base-debian9 when running on arm64 machines. All seem to have the same root cause, and the error messages are the same.

@chanseokoh
Member

chanseokoh commented Dec 21, 2020

We are running Travis builds on arm64, and one of the tests executes busybox and checks the output. The test is passing on arm64 Travis:

//base:debug_arm64_debian10_test                                         PASSED in 7.2s
//base:debug_arm64_debian9_test                                          PASSED in 10.0s

Can you pull base-debian10:debug-arm64 (or gcr.io/distroless/base-debian10@sha256:cd6d10eb4ec54b362a14ea0c15387ee9a964fcc82a06de5afbcd80f8c400cf20) again and test?

Or, could it be that you are on armv7l instead of armv8l? We have :debug-arm64 and :debug-arm.
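
As a rough illustration of the tag question above, the following Python sketch maps a `uname -m` style machine name to the two tag suffixes mentioned (`:debug-arm64` and `:debug-arm`). The function name `debug_tag_for` and the mapping table are assumptions made for illustration; distroless does not ship anything like this.

```python
import platform

# Hypothetical lookup table: machine names as reported by `uname -m`,
# mapped to the two ARM tag suffixes named in this thread.
_TAG_BY_MACHINE = {
    "aarch64": "debug-arm64",  # 64-bit ARM Linux
    "arm64": "debug-arm64",    # same architecture, as named on macOS
    "armv7l": "debug-arm",     # 32-bit ARM
    "armv8l": "debug-arm",     # 32-bit personality on ARMv8 hardware
}


def debug_tag_for(machine: str) -> str:
    """Return the assumed debug tag suffix for an ARM machine name."""
    try:
        return _TAG_BY_MACHINE[machine.lower()]
    except KeyError:
        raise ValueError(f"no ARM debug tag known for machine {machine!r}")
```

Calling `debug_tag_for(platform.machine())` on the host gives the suffix to try. Note that `armv8l` lands on the 32-bit side of the table, which matters later in this thread.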

@docularxu
Author

Hi @chanseokoh, we found the problem. The 'busybox' in your base:debug_arm64 image is a 32-bit ARM binary. That is why it fails on our arm64 machines.

A Google search turns up these explanations of running 32-bit ARM applications on arm64 hardware [1] [2]. Generally speaking, it requires specific kernel configuration. So, to make your image more compatible, I think it would be better to build everything, including busybox, for arm64.

For your Travis arm64 test, if you can run it, could you confirm this with 'file /busybox/busybox'? The expected result is "ARM aarch64".

@chanseokoh
Member

chanseokoh commented Jan 5, 2021

Distroless downloads the pre-compiled armv8l version of BusyBox from their download page.

distroless/WORKSPACE

Lines 415 to 416 in 13f7c56

sha256 = "141adb1b625a6f44c4b114f76b4387b4ea4f7ab802b88eb40e0d2f6adcccb1c3",
urls = ["https://busybox.net/downloads/binaries/1.31.0-defconfig-multiarch-musl/busybox-armv8l"],

I verified that the BusyBox binary on base-debian10:debug-arm64 matches the intended SHA. Then locally, I downloaded the same binary and ran file. It shows 32-bit.

$ curl -O https://busybox.net/downloads/binaries/1.31.0-defconfig-multiarch-musl/busybox-armv8l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1121k  100 1121k    0     0   172k      0  0:00:06  0:00:06 --:--:--  235k
$ sha256sum busybox-armv8l 
141adb1b625a6f44c4b114f76b4387b4ea4f7ab802b88eb40e0d2f6adcccb1c3  busybox-armv8l
$ file busybox-armv8l 
busybox-armv8l: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, stripped
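
The digest comparison above can also be scripted. This is a minimal Python sketch, assuming the binary has already been downloaded to a local path; `sha256_of` and `matches_pinned_digest` are hypothetical helper names, and the expected digest is the one quoted from distroless/WORKSPACE.

```python
import hashlib

# Digest pinned in distroless/WORKSPACE for busybox-armv8l (quoted above).
EXPECTED_SHA256 = "141adb1b625a6f44c4b114f76b4387b4ea4f7ab802b88eb40e0d2f6adcccb1c3"


def sha256_of(path: str, chunk_size: int = 1 << 16) -> str:
    """Hash a file in chunks so large binaries are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA-256 equals the pinned digest."""
    return sha256_of(path) == expected.lower()
```

A matching digest only shows that the image contains the intended file. It says nothing about whether that file is the right architecture, which is what the `file` output shows.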

However, I remember that in an unrelated case (amd, not arm), the BusyBox developer said that the binary does not need to be 64-bit:

"Wrong. The binary does not need to be 64-bit. Code carefully uses off_t, not ints, for file sizes everywhere."

(But maybe, unlike on amd, a 32-bit arm binary can be very different from a 64-bit arm binary?)

If you think this is an issue in the pre-compiled BusyBox binary, I suggest filing a bug with them. I'd also appreciate an update on the progress.

@dcanadillas

I am having the same issue here. In my Tekton pipeline, I noticed that the image gcr.io/distroless/base@sha256:cfdc553400d41b47fd231b028403469811fcdbc0e69d66ea8030c5a0b5fbac2b, which is needed to place scripts, was giving me an exec format error on my Linux aarch64 machine. This is my machine version:

ubuntu@minikube:~$ uname -ar
Linux minikube 5.4.0-96-generic #109-Ubuntu SMP Wed Jan 12 18:07:25 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

So I tried switching to the latest debug version, gcr.io/distroless/base:debug-arm64, and hit the same issue. Testing from a terminal with a simple docker run gives the same error:

ubuntu@minikube:~$ docker run --rm --name distroless-test gcr.io/distroless/base:debug-arm64 -c "echo Hello"
standard_init_linux.go:228: exec user process caused: exec format error

Then I also realized that the busybox binary referenced in

urls = ["https://busybox.net/downloads/binaries/1.31.0-defconfig-multiarch-musl/busybox-armv8l"],
does not work on my Ubuntu ARM64 machine:

ubuntu@minikube:~$ curl -LO https://busybox.net/downloads/binaries/1.31.0-defconfig-multiarch-musl/busybox-armv8l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1053k  100 1053k    0     0   592k      0  0:00:01  0:00:01 --:--:--  592k

ubuntu@minikube:~$ chmod 755 busybox-armv8l

ubuntu@minikube:~$ ./busybox-armv8l echo Hello
-bash: ./busybox-armv8l: cannot execute binary file: Exec format error

ubuntu@minikube:~$ readelf -h busybox-armv8l
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x1d600
  Start of program headers:          52 (bytes into file)
  Start of section headers:          1077944 (bytes into file)
  Flags:                             0x5000400, Version5 EABI, hard-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         4
  Size of section headers:           40 (bytes)
  Number of section headers:         14
  Section header string table index: 13

That is a 32-bit binary, which is not the right version for aarch64. This seems to be a problem with the BusyBox pre-compiled binaries.
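
The two fields that matter in the readelf output (Class and Machine) can be read straight from the first 20 bytes of the file. The sketch below is a minimal Python version of that check; `describe_elf` and `describe_elf_file` are hypothetical helper names, and the machine table covers only the few architectures relevant to this thread.

```python
import struct
from typing import Tuple

ELF_MAGIC = b"\x7fELF"
EI_CLASS = {1: "ELF32", 2: "ELF64"}  # e_ident[4]
EI_DATA = {1: "<", 2: ">"}           # e_ident[5]: little / big endian
# Small subset of e_machine values; anything else is reported as unknown.
E_MACHINE = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes) -> Tuple[str, str]:
    """Return (class, machine) from at least the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    elf_class = EI_CLASS.get(header[4])
    byte_order = EI_DATA.get(header[5])
    if elf_class is None or byte_order is None:
        raise ValueError("unrecognised ELF class or data encoding")
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return elf_class, E_MACHINE.get(machine, f"unknown ({machine})")


def describe_elf_file(path: str) -> Tuple[str, str]:
    """Convenience wrapper that reads the header from a file on disk."""
    with open(path, "rb") as fh:
        return describe_elf(fh.read(20))
```

For a header like the one printed above, `describe_elf` returns `("ELF32", "ARM")`, while a native arm64 binary would give `("ELF64", "AArch64")`.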

@Sebmaster
Contributor

Maybe the solution here would be to just use Debian's busybox package directly instead of upstream? Looks like it's available for essentially all architectures.

@loosebazooka
Member

This should be fixed and available now, thanks to a community contribution from @MrMYHuang

@MrMYHuang
Contributor

I confirm the latest distroless/base:debug works well on Ubuntu 20 arm64 on Raspberry Pi 4 and Ubuntu 21 arm64 on Parallels Desktop on Apple M1. This is the command I tested.

docker run -it gcr.io/distroless/base:debug

MrMYHuang added a commit to MrMYHuang/pipeline that referenced this issue Feb 16, 2022
…ss/base:debug used an arm32 busybox binary in its arm64 image. Which doesn't work on some arm64 machines, e.g., Ubuntu 21 arm64 on Parallel Desktop on Apple Silicon M1. It caused this error:

"
$ docker run -it gcr.io/distroless/base@sha256:cfdc553400d41b47fd231b028403469811fcdbc0e69d66ea8030c5a0b5fbac2b
standard_init_linux.go:228: exec user process caused: exec format error
"

This PR GoogleContainerTools/distroless#960 fixes this bug. Hence, update the distroless/base:debug used by Tekton Pipeline in this commit.