IOError: [Errno 12] Cannot allocate memory #520

Closed
Globegitter opened this issue Sep 13, 2018 · 8 comments
Labels
Can Close? Will close in 30 days unless there is a comment indicating why not

Comments

@Globegitter
Contributor

Just saw this error in our CI:

(09:08:49) ERROR: /home/circleci/project/indexpage/BUILD.bazel:114:1: JoinLayers indexpage/intermediate_bundle_structure_test.tar failed (Exit 1): join_layers failed: error executing command 
  (cd /home/circleci/.cache/bazel/_bazel_circleci/f16e36219ef33c22efc2ad20f3e3775c/execroot/__main__ && \
  exec env - \
  bazel-out/host/bin/external/io_bazel_rules_docker/container/join_layers '--output=bazel-out/k8-fastbuild/bin/indexpage/intermediate_bundle_structure_test.tar' '--tags=indexpagenuxt-image:intermediate=@bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.0.config' '--layer=@bazel-out/k8-fastbuild/bin/external/nodejs_image_base/image/000.tar.gz.nogz.sha256=@bazel-out/k8-fastbuild/bin/external/nodejs_image_base/image/000.tar.gz.sha256=bazel-out/k8-fastbuild/bin/external/nodejs_image_base/image/000.tar.gz.nogz=external/nodejs_image_base/image/000.tar.gz' '--layer=@bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.0-layer.tar.sha256=@bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.0-layer.tar.gz.sha256=bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.0-layer.tar=bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.0-layer.tar.gz' '--layer=@bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.1-layer.tar.sha256=@bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.1-layer.tar.gz.sha256=bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.1-layer.tar=bazel-out/k8-fastbuild/bin/indexpage/nuxt-image.1-layer.tar.gz' '--layer=@bazel-out/k8-fastbuild/bin/indexpage/nuxt-image-layer.tar.sha256=@bazel-out/k8-fastbuild/bin/indexpage/nuxt-image-layer.tar.gz.sha256=bazel-out/k8-fastbuild/bin/indexpage/nuxt-image-layer.tar=bazel-out/k8-fastbuild/bin/indexpage/nuxt-image-layer.tar.gz')

Use --sandbox_debug to see verbose messages from the sandbox

Traceback (most recent call last):
  File "/home/circleci/.cache/bazel/_bazel_circleci/f16e36219ef33c22efc2ad20f3e3775c/sandbox/processwrapper-sandbox/1/execroot/__main__/bazel-out/host/bin/external/io_bazel_rules_docker/container/join_layers.runfiles/io_bazel_rules_docker/container/join_layers.py", line 222, in <module>
    main()
  File "/home/circleci/.cache/bazel/_bazel_circleci/f16e36219ef33c22efc2ad20f3e3775c/sandbox/processwrapper-sandbox/1/execroot/__main__/bazel-out/host/bin/external/io_bazel_rules_docker/container/join_layers.runfiles/io_bazel_rules_docker/container/join_layers.py", line 218, in main
    blobsum_to_unzipped, blobsum_to_zipped, blobsum_to_legacy)
  File "/home/circleci/.cache/bazel/_bazel_circleci/f16e36219ef33c22efc2ad20f3e3775c/sandbox/processwrapper-sandbox/1/execroot/__main__/bazel-out/host/bin/external/io_bazel_rules_docker/container/join_layers.runfiles/io_bazel_rules_docker/container/join_layers.py", line 154, in create_bundle
    v2_2_save.multi_image_tarball(tag_to_image, tar)
  File "/home/circleci/.cache/bazel/_bazel_circleci/f16e36219ef33c22efc2ad20f3e3775c/sandbox/processwrapper-sandbox/1/execroot/__main__/bazel-out/host/bin/external/io_bazel_rules_docker/container/join_layers.runfiles/containerregistry/client/v2_2/save_.py", line 107, in multi_image_tarball
    v1_save.multi_image_tarball(tag_to_v1_image, tar)
  File "/home/circleci/.cache/bazel/_bazel_circleci/f16e36219ef33c22efc2ad20f3e3775c/sandbox/processwrapper-sandbox/1/execroot/__main__/bazel-out/host/bin/external/io_bazel_rules_docker/container/join_layers.runfiles/containerregistry/client/v1/save_.py", line 73, in multi_image_tarball
    add_file(layer_id + '/layer.tar', content)
  File "/home/circleci/.cache/bazel/_bazel_circleci/f16e36219ef33c22efc2ad20f3e3775c/sandbox/processwrapper-sandbox/1/execroot/__main__/bazel-out/host/bin/external/io_bazel_rules_docker/container/join_layers.runfiles/containerregistry/client/v1/save_.py", line 45, in add_file
    tar.addfile(tarinfo=info, fileobj=io.BytesIO(contents))
  File "/usr/lib/python2.7/tarfile.py", line 2054, in addfile
    copyfileobj(fileobj, self.fileobj, tarinfo.size)
  File "/usr/lib/python2.7/tarfile.py", line 275, in copyfileobj
    dst.write(buf)
IOError: [Errno 12] Cannot allocate memory

This is the first time we are seeing this error after having used this setup for a few weeks now without any issues.

About our setup:
We are running in CircleCI within a Docker container. I wonder if it means that we ran out of memory within the build container? In the past, Bazel got killed for using more memory than the build container's limit, and setting startup --host_jvm_args=-Xmx3G as well as build:ci --local_resources=3072,2.0,1.0 resolved that problem. I am not sure whether this is related in any way, or whether anything can be done about it in these rules, but I thought it was worth posting, even if it only helps other people who run into this issue.
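For illustration only: the last frames of the traceback show the layer contents being wrapped in io.BytesIO before tar.addfile is called, which suggests the whole layer is held in memory at that point. The sketch below (Python 3, standard library only) contrasts that pattern with streaming the same data from a file on disk. The helper names add_buffered, add_streamed and demo are hypothetical, the 1 MiB payload is a stand-in for a real layer, and none of this is the actual rules_docker or containerregistry code.

```python
import io
import os
import tarfile
import tempfile


def add_buffered(tar, name, contents):
    """Add bytes that are already fully in memory (the pattern in the traceback)."""
    info = tarfile.TarInfo(name)
    info.size = len(contents)
    tar.addfile(tarinfo=info, fileobj=io.BytesIO(contents))


def add_streamed(tar, name, path):
    """Add a file from disk; tarfile copies it chunk by chunk."""
    info = tarfile.TarInfo(name)
    info.size = os.path.getsize(path)
    with open(path, "rb") as src:
        tar.addfile(tarinfo=info, fileobj=src)


def demo():
    """Build one tarball each way and return the stored member contents."""
    payload = b"x" * (1024 * 1024)  # assumption: small stand-in for a layer
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        layer_path = os.path.join(tmp, "layer.tar")
        with open(layer_path, "wb") as f:
            f.write(payload)
        for label in ("buffered", "streamed"):
            bundle = os.path.join(tmp, label + ".tar")
            with tarfile.open(bundle, "w") as tar:
                if label == "buffered":
                    add_buffered(tar, "abc/layer.tar", payload)
                else:
                    add_streamed(tar, "abc/layer.tar", layer_path)
            with tarfile.open(bundle) as tar:
                results[label] = tar.extractfile("abc/layer.tar").read()
    return results
```

Both helpers produce the same archive member; the difference is only in how much of the layer has to sit in the process's memory while it is written. Whether that is what triggered Errno 12 here is not established by the traceback.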

@nlopezgi
Contributor

Thanks for reporting this issue!
I've never seen that kind of error when running rules_docker (but I don't often run the rules inside a container directly). Our CI does run some tests inside a container and we have not seen this problem (but they are not doing anything too 'big'). However, I think I have seen this error before in a different context: when I was doing some tests with Bazel's local Docker sandboxing feature (https://docs.bazel.build/versions/master/remote-execution-sandbox.html#troubleshooting-in-a-docker-container). In those cases the error came from a container that was indeed trying to do far more than it should (creating too many large files, launching too many processes, or exhausting some other resource; I was not able to fully debug which happened first).

I do think that rules_docker could probably be pushed to the point where it exhausts resources when run inside a container, but I'm not sure what you would need to do to get there (maybe you could describe the type of image you are trying to build inside the container?).
In any case, if you have any suggestions as to how we can reproduce this, please share them. That would let us see whether it's something that occurs commonly and should be addressed soon, or whether it's a corner case that we can help users get around somehow (e.g., by adjusting the flags used to launch the container from which the rules are executed).

@enriched

I am running into errors in a similar vein in our CI:

ERROR: /root/services-core/services/device-ops/server/BUILD.bazel:15:1: ImageLayer services/device-ops/server/image-layer.tar failed (Exit 1): build_tar failed: error executing command 
  (cd /workspace/.bazel/execroot/services_core && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/bin \
  bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar '--flagfile=bazel-out/k8-fastbuild/bin/services/device-ops/server/image-layer.args')

Use --sandbox_debug to see verbose messages from the sandbox
Traceback (most recent call last):
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/io_bazel_rules_docker/container/build_tar.py", line 423, in <module>
    main(FLAGS(sys.argv))
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/io_bazel_rules_docker/container/build_tar.py", line 406, in main
    output.add_file(inf, tof, **file_attributes(tof))
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/io_bazel_rules_docker/container/build_tar.py", line 159, in add_file
    gname=names[1])
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/bazel_source/tools/build_defs/pkg/archive.py", line 242, in add_file
    self.add_dir(name, file_content, uid, gid, uname, gname, mtime, mode)
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/bazel_source/tools/build_defs/pkg/archive.py", line 188, in add_dir
    depth - 1)
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/bazel_source/tools/build_defs/pkg/archive.py", line 188, in add_dir
    depth - 1)
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/bazel_source/tools/build_defs/pkg/archive.py", line 188, in add_dir
    depth - 1)
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/bazel_source/tools/build_defs/pkg/archive.py", line 188, in add_dir
    depth - 1)
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/bazel_source/tools/build_defs/pkg/archive.py", line 198, in add_dir
    mode=mode)
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/bazel_source/tools/build_defs/pkg/archive.py", line 283, in add_file
    self._addfile(tarinfo, f)
  File "/workspace/.bazel/sandbox/processwrapper-sandbox/76/execroot/services_core/bazel-out/host/bin/external/io_bazel_rules_docker/container/build_tar.runfiles/bazel_source/tools/build_defs/pkg/archive.py", line 206, in _addfile
    self.tar.addfile(info, fileobj)
  File "/usr/lib/python2.7/tarfile.py", line 2052, in addfile
    copyfileobj(fileobj, self.fileobj, tarinfo.size)
  File "/usr/lib/python2.7/tarfile.py", line 275, in copyfileobj
    dst.write(buf)
IOError: [Errno 12] Cannot allocate memory

I have also had it error with: Server terminated abruptly (error code: 14, error message: '', log file: '/workspace/.bazel/server/jvm.out')

We do have a bundle of 12 images that are being built in a Docker container. Is there a recommended strategy for limiting the resource usage of rules_docker? Maybe turning down the parallelism so that it doesn't end up running out of memory?

@enriched

As far as I have found in the bazel docs, the flags for controlling resource usage are:

I'm going to try fiddling around with them and see if I can get the build to work.

@enriched

Running with bazel build --jobs=1 fixed the issue on CircleCI running within a Docker executor.

@Globegitter
Contributor Author

@enriched I guess the only problem with --jobs=1 is that you really pay for it in performance. I can also recommend playing with startup --host_jvm_args=-Xmx3G (or whatever value makes sense for your container).

@enriched

@Globegitter I ended up bumping the resource_class to large for the executor on CircleCI, and that fixed my issue without needing to specify the number of jobs. I guess these rules just need more than 4GB of RAM. Although I was able to get it to work with 3 jobs on a medium-sized executor, I think that fiddling with --ram_utilization_factor might be the way to go.

I don't understand how the Bazel scheduler estimates the amount of resources an action is going to use, but it seems like it is underestimating some of the Docker rules.
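As a rough illustration of the trade-off discussed above, here is a back-of-the-envelope sketch for picking a --jobs value from the container's memory limit. It is not how Bazel schedules actions. The function name suggest_jobs is hypothetical, and the per-action memory figure is an assumption you would have to measure for your own build.

```python
def suggest_jobs(container_mb, jvm_heap_mb, per_action_mb, cpu_count):
    """Return a conservative --jobs value for a memory-limited container.

    container_mb:  memory limit of the build container
    jvm_heap_mb:   heap reserved for the Bazel server (e.g. -Xmx3G -> 3072)
    per_action_mb: assumed peak memory of one action (measure this yourself)
    cpu_count:     number of CPUs available to the container
    """
    if per_action_mb <= 0:
        raise ValueError("per_action_mb must be positive")
    # Memory left for actions once the Bazel server's heap is set aside.
    headroom_mb = container_mb - jvm_heap_mb
    by_memory = headroom_mb // per_action_mb
    # Never go below one job, and never above the CPU count.
    return max(1, min(cpu_count, by_memory))
```

For example, with a 4096 MB container, a 3072 MB JVM heap and an assumed 512 MB per action on 2 CPUs, suggest_jobs(4096, 3072, 512, 2) gives 2. If a single action is assumed to need 2048 MB, the result falls back to 1.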

@github-actions

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!

@github-actions github-actions bot added the Can Close? Will close in 30 days unless there is a comment indicating why not label Mar 18, 2021
@github-actions

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"


3 participants