aarch64: build fails with TestOverlay128LayerRead #27384

Closed
vielmetti opened this issue Oct 14, 2016 · 13 comments · Fixed by #27520
Labels
area/storage/overlay kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. platform/arm version/1.12

@vielmetti

Description

I'm trying to build master on ARMv8 (aarch64). The overlay2 test fails with "TestOverlay128LayerRead".

Steps to reproduce the issue:

  1. install Docker on Ubuntu for aarch64 with "apt-get install docker.io"
  2. download docker source and build with "make all"

Describe the results you received:

-- FAIL: TestOverlay128LayerRead (4.71s)
FAIL
FAIL    github.com/docker/docker/daemon/graphdriver/overlay2    5.506s

Describe the results you expected:

Expect all tests to pass.

Additional information you deem important (e.g. issue happens only occasionally):

Repeatable on successive builds.

Output of docker version:

Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   23cf638
 Built:        Tue, 27 Sep 2016 12:25:38 +1300
 OS/Arch:      linux/arm64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   23cf638
 Built:        Tue, 27 Sep 2016 12:25:38 +1300
 OS/Arch:      linux/arm64

Output of docker info:

Containers: 7
 Running: 0
 Paused: 0
 Stopped: 7
Images: 117
Server Version: 1.12.1
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge overlay null host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 4.4.0-38-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: aarch64
CPUs: 96
Total Memory: 125.8 GiB
Name: armv8hello.local.lan
ID: QB72:5PAI:O62E:QEPU:4DKN:3M2K:33FC:AIJU:PXMC:YMGX:Y376:IAPK
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):

Hosted ARMv8 96 core server

@thaJeztah
Member

/cc @dmcgowan

@dmcgowan
Member

@vielmetti thanks for the report. Until I can get an aarch64 host to test on, would you mind running this test suite on the host: https://github.com/dmcgowan/dsdbench? The tests take a little longer to run than the other unit tests, but they would show precisely which layer the failure occurs at.

@vielmetti
Author

Happy to, @dmcgowan - I'll give it a look.

@vielmetti
Author

@dmcgowan - noted a build issue here dmcgowan/dsdbench#1

@vielmetti
Author

dsdbench output is here, @dmcgowan

https://gist.github.com/423b469d21da240608e4ff13bef047f0

one line in particular reads

    layer_test.go:170: check layer 61 failed: error creating overlay mount to /tmp/layer-test-136891895/overlay2/22f492a8ed4861be6e42833282f56d132cc0a8b79f6aa4e8dc8ea215f4a69c61/merged: no such file or directory

@vielmetti
Author

dsdbench also includes a benchmark suite, whose output is at

https://gist.github.com/e7dafbd07faf4a20a8f829254810de2d

BenchmarkGet100BaseMount-96     --- FAIL: BenchmarkGet100BaseMount-96
        bench_test.go:110: Failed to create layer chain: error creating overlay mount to /tmp/layer-test-149274786/overlay2/5e2c43486c8f17fd3b62738ed78d14a27bee1cd9392d8e89ea4c89e4709b4bea/merged: no such file or directory
                failed to mount
                dsdbench.CreateLayer
                        /root/src/dsdbench/test_util.go:38
                dsdbench.CreateLayerChain
                        /root/src/dsdbench/test_util.go:72
                dsdbench.benchmarkGetBaseMountWithDepth
                        /root/src/dsdbench/bench_test.go:108
                dsdbench.BenchmarkGet100BaseMount
                        /root/src/dsdbench/bench_test.go:84
                testing.(*B).runN
                        /usr/lib/go-1.6/src/testing/benchmark.go:135
                testing.(*B).launch
                        /usr/lib/go-1.6/src/testing/benchmark.go:210
                runtime.goexit
                        /usr/lib/go-1.6/src/runtime/asm_arm64.s:975
                failed to create layer 62
                dsdbench.CreateLayerChain
                        /root/src/dsdbench/test_util.go:75
                dsdbench.benchmarkGetBaseMountWithDepth
                        /root/src/dsdbench/bench_test.go:108
                dsdbench.BenchmarkGet100BaseMount
                        /root/src/dsdbench/bench_test.go:84
                testing.(*B).runN
                        /usr/lib/go-1.6/src/testing/benchmark.go:135
                testing.(*B).launch
                        /usr/lib/go-1.6/src/testing/benchmark.go:210
                runtime.goexit

@dmcgowan
Member

dmcgowan commented Oct 18, 2016

@vielmetti thanks for the help accessing the machine. It looks like there is indeed a bug, though I'm not yet sure why it seems to only affect aarch64. Since it also appears to be a regression since 1.12, I will add it to the 1.13 milestone.

My assessment...
Looking at dmesg it is probably a regression related to #25824.

[432914.711438] overlayfs: failed to resolve '/tmp/docker-graphtest-178900551/overlay2/f920d70d1eed1d9a556e715635a4c0d657d8d6d574981d411495be23b2b9ed82/wo': -2
[433060.496376] overlayfs: failed to resolve '/tmp/layer-test-062886040/overlay2/47d4de331993ae5a225d3fcfb': -2
[433746.589323] overlayfs: failed to resolve '/tmp/layer-test-687104329/overlay2/71cdd2bb45cd5e6b9de9eb7cf': -2
[433980.309895] overlayfs: failed to resolve '/tmp/layer-test-112160183/overlay2/3ee23d5a32180d4dbd69c6b52': -2
[434602.964220] overlayfs: failed to resolve '/tmp/layer-test-933667525/overlay2/dd4f155520cd1f4ba13171bc2': -2
[435246.963722] overlayfs: failed to resolve '/tmp/layer-test-827908427/overlay2/875209e0298cb554aa1861935': -2

Paths are getting truncated which I attribute to hitting the page boundary. I tested with 1.12.0 and did not hit this issue.

@dmcgowan dmcgowan added this to the 1.13.0 milestone Oct 18, 2016
@dmcgowan
Member

dmcgowan commented Oct 18, 2016

This might be a bug in the aarch64 implementation of Go.

syscall.Getpagesize() is returning 65536, while getconf PAGESIZE returns 4096. The call fails at layer 61 in the test because at that point the mount arguments hit a length of 4139, which explains the truncation down to 4096 and the subsequent failure.

root@armv8hello:~/derek-test# cat main.go
package main

import (
    "fmt"
    "syscall"
)

func main() {
    fmt.Printf("Page size: %d\n", syscall.Getpagesize())
}
root@armv8hello:~/derek-test# go run main.go
Page size: 65536
root@armv8hello:~/derek-test# getconf PAGESIZE
4096
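
The mismatch can be checked with simple arithmetic, using the three numbers quoted in this comment. This is only a sketch: the helper `fitsInPage` is hypothetical, and the exact comparison the driver performs is an assumption.

```go
package main

import "fmt"

// fitsInPage reports whether mount options of the given length fit
// within a page of the given size. Hypothetical helper for illustration.
func fitsInPage(optsLen, pageSize int) bool {
	return optsLen <= pageSize
}

func main() {
	const (
		optsLen        = 4139  // mount argument length at layer 61, from the comment
		goReportedPage = 65536 // syscall.Getpagesize() on this host
		actualPage     = 4096  // getconf PAGESIZE on this host
	)

	// Judged against Go's reported page size, the options appear to fit,
	// so nothing signals that shorter paths are needed.
	fmt.Println("fits in Go-reported page:", fitsInPage(optsLen, goReportedPage))

	// Judged against the actual page size, the options are too long.
	fmt.Println("fits in actual page:", fitsInPage(optsLen, actualPage))
}
```

This prints `true` for the Go-reported page size and `false` for the actual one, which is the gap that lets the oversized options through.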

@dmcgowan
Member

golang/go#13191

dmcgowan added a commit to dmcgowan/docker that referenced this issue Oct 19, 2016
Go can falsely report a larger page size than supported,
causing overlay2 mount arguments to be truncated. When overlay2
detects the mount arguments have hit the page limit, it will
switch to using relative paths. If this limit is smaller than
the actual page size there are no behavioral problems, but if it
is larger mounts can fail for images with many layers.

Closes moby#27384

Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
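
The behavior the commit message describes can be sketched roughly as follows. This is not the actual patch: the 4096-byte cap and the names `maxMountDataSize`, `effectivePageSize`, and `useRelativePaths` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"syscall"
)

// maxMountDataSize is an assumed upper bound on the mount data length.
// Using a limit that is too small is harmless, per the commit message;
// using one that is too large is what causes mounts to fail.
const maxMountDataSize = 4096

// effectivePageSize caps the page size reported by Go so that an
// over-reported value cannot raise the limit.
func effectivePageSize(reported int) int {
	if reported > maxMountDataSize {
		return maxMountDataSize
	}
	return reported
}

// useRelativePaths reports whether mount data of the given length
// exceeds the limit and should be rebuilt with shorter relative paths.
func useRelativePaths(mountDataLen, pageSize int) bool {
	return mountDataLen > pageSize
}

func main() {
	limit := effectivePageSize(syscall.Getpagesize())
	fmt.Println("effective limit:", limit)
	fmt.Println("4139-byte mount data needs relative paths:", useRelativePaths(4139, limit))
}
```

With the cap in place, a reported page size of 65536 no longer hides the fact that 4139 bytes of mount data is too long, so the relative-path fallback is taken.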
@thaJeztah thaJeztah added the kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. label Oct 19, 2016
@aclements

Would it be possible to test with Go tip? At tip, syscall.Getpagesize() should return the same thing as getconf PAGESIZE.

@dmcgowan
Member

@aclements are there any tip builds available for arm64?

@aclements

> @aclements are there any tip builds available for arm64?

If you mean binary packages, then, no. We only package releases. But it's pretty easy to build your own. On any machine that has a Go install (doesn't have to be arm64), run:

git clone https://go.googlesource.com/go
cd go/src
export GOROOT_BOOTSTRAP=$(go env GOROOT)
GOOS=linux GOARCH=arm64 ./bootstrap.bash

This will create a ../../go-linux-arm64-bootstrap.tbz that you can just unpack on the target machine and use. You may have to set GOROOT on the target to where you unpacked the tarball.

liusdu pushed a commit to liusdu/moby that referenced this issue Oct 30, 2017
Go can falsely report a larger page size than supported,
causing overlay2 mount arguments to be truncated. When overlay2
detects the mount arguments have hit the page limit, it will
switch to using relative paths. If this limit is smaller than
the actual page size there are no behavioral problems, but if it
is larger mounts can fail for images with many layers.

Closes moby#27384

cherry-pick from: moby#27520

Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Signed-off-by: Lei Jitang <leijitang@huawei.com>
(cherry picked from commit 520034e)
@vielmetti
Author

If for whatever reason you need a minimal test case to replicate this, this Dockerfile

https://gist.github.com/anonymous/bdafb8e961f55b2533fee8fa5221d186

will create 100 layers, and the build would fail if you were using an unpatched or old version of Docker on arm64. The error to expect is something like

Step 41 : RUN mkdir 40
error creating aufs mount to /var/lib/docker/aufs/mnt/787c80e88d99c4ed74305f16ce30395e3346cfa4629c3d078f9fab3c6e4e52f0: invalid argument


6 participants