2.4.0.0 com.docker.hyperkit very high CPU with low load in containers #4981

Closed
2 tasks done
zwass opened this issue Oct 8, 2020 · 19 comments

Comments

@zwass

zwass commented Oct 8, 2020

  • I have tried with the latest version of my channel (Stable or Edge)
  • I have uploaded Diagnostics
  • Diagnostics ID: FBD1077A-2DEC-4C37-A7E5-7E917A732473/20201008014709

Expected behavior

Docker CPU usage is low when container usage is low.

Actual behavior

Containers are using very little CPU (~5% between all containers as reported by docker stats). com.docker.hyperkit is using 300% CPU (as reported by Activity Monitor).
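To put a number on the container side of this comparison, here is a minimal sketch that sums what `docker stats` reports. It assumes `docker stats --no-stream --format "{{.CPUPerc}}"` prints one percentage per running container; the function names are hypothetical helpers, not part of any Docker tooling.

```python
import subprocess


def total_cpu_percent(stats_output: str) -> float:
    """Sum per-container CPU percentages given one value per line, e.g. '0.52%'."""
    total = 0.0
    for line in stats_output.splitlines():
        value = line.strip().rstrip("%")
        try:
            total += float(value)
        except ValueError:
            # Skip blank lines and placeholders that are not numbers.
            continue
    return total


def container_cpu_total() -> float:
    """Take a single snapshot from `docker stats` and return the summed CPU %."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.CPUPerc}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return total_cpu_percent(result.stdout)
```

Calling `container_cpu_total()` while the problem is occurring gives a figure to set against the com.docker.hyperkit value shown in Activity Monitor.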

[Screenshot attached: Screen Shot 2020-10-07 at 6 46 59 PM]

Information

  • macOS Version: 10.14.6

I think this problem started with Docker 2.4.0.0.

Diagnostic logs

I don't see this option in the menu. Are you looking for the diagnostics ID asked for above? FBD1077A-2DEC-4C37-A7E5-7E917A732473/20201008014709

Docker for Mac: 2.4.0.0

Steps to reproduce the behavior

I am running this docker-compose file: https://github.com/kolide/fleet/blob/master/docker-compose.yml

It does not reproduce immediately, but I have seen the issue come up at least 3 or 4 times in the last week.

@stephen-turner
Contributor

Thanks, @zwass, we'll have a look

@zwass
Author

zwass commented Oct 8, 2020

I reverted to 2.3.0.5 and have not seen the issue crop up after a few hours of running that compose stack. It was usually happening regularly and quickly with 2.4.0.0 so I suspect there are new changes that introduced the issue.

I will update with further observations.

@ant-

ant- commented Oct 9, 2020

It is a well-known problem.

@stephen-turner
Contributor

Actually, this one is new. We have now reproduced it and will have a fix soon.

@zwass
Author

zwass commented Oct 9, 2020

Glad to hear it and thank you @stephen-turner.

@djs55
Contributor

djs55 commented Oct 14, 2020

@zwass, thanks for your report! I notice there's a container which bind-mounts /:

    {
      "ContainerID": "8ac93e23ff4df2a03be6b3c83a88ba9487de626ec93b339023f17ea7d20d2d8f",
      "Volumes": [
        {
          "HostPath": "/",
          "MountPath": "/rootfs"
        }
      ]
    },

Using the old file-sharing implementation ("osxfs") this was slightly buggy:

  • I can run docker run -v /:/rootfs alpine sh even though / is not listed in the UI (whale menu -> Preferences -> Resources -> File Sharing).
  • It does not watch the host filesystem to inject inotify file events. This may or may not be a problem, depending on the use-case.

Because of this inotify file events bug (osxfs never watches the host filesystem for this mount), CPU usage is quite low.

Using the new file-sharing implementation ("gRPC FUSE"):

  • Again I can run docker run -v /:/rootfs alpine sh even though / is not listed in the UI.
  • It does watch the host filesystem to inject inotify events (and more importantly cache invalidation messages)

Unfortunately, now that the inotify file events bug is fixed, watching everything under / makes CPU usage very high.

I'm not sure yet what the best way to fix this is. Could you describe your use-case in more detail? What is the /rootfs bind-mount used for? Could it be decomposed into smaller bind-mounts like /var/lib, /var/run etc?
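To check whether other containers are affected, here is a minimal sketch that scans `docker inspect` output for bind mounts whose source is the host root. It assumes the `Mounts` entries carry `Type`, `Source`, and `Destination` fields and that the host path is reported as given on the command line; `root_bind_mounts` is a hypothetical helper name.

```python
import json
from typing import List, Tuple


def root_bind_mounts(inspect_json: str) -> List[Tuple[str, str]]:
    """Return (container name, destination) for each bind mount of '/'.

    `inspect_json` is the JSON array printed by `docker inspect <ids...>`.
    """
    hits = []
    for container in json.loads(inspect_json):
        for mount in container.get("Mounts", []):
            # Assumption: a bind mount of the host root is reported with Source "/".
            if mount.get("Type") == "bind" and mount.get("Source") == "/":
                hits.append((container.get("Name", ""), mount.get("Destination", "")))
    return hits
```

Feeding it the output of `docker inspect $(docker ps -q)` would list candidates for decomposition into narrower mounts such as /var/lib or /var/run.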

@zwass
Author

zwass commented Oct 14, 2020

@djs55 that rootfs mount is just copied out of the cAdvisor documentation into the docker-compose file linked above.

Presumably this issue is affecting anyone who is running cAdvisor (11.3k stars) with the documented instructions.

@djs55
Contributor

djs55 commented Oct 14, 2020

@zwass thanks for the explanation, very helpful!

@simonferquel

The issue has been fixed in Edge release 2.4.2.0, thank you for reporting it.

@zwass
Author

zwass commented Oct 19, 2020

Thank you! Does that mean it should make the next stable release?

@stephen-turner
Contributor

Yes, @zwass.

@ebriney
Member

ebriney commented Oct 23, 2020

Hi @zwass. We fixed more issues related to gRPC FUSE. Could you try this build and give us some feedback, please?
https://desktop-stage.docker.com/mac/edge/49130/Docker.dmg

We noticed a performance issue when creating files from a container, and we have started working on optimizing it.
Thanks for your help.

@Nuru

Nuru commented Oct 24, 2020

I had a bunch of containers idling and CPU usage was low. Then while doing something in the Opera browser on the host Mac, com.docker.hyperkit usage went up to 300% or so for quite a while. I killed a container and that did not stop it. I started to collect diagnostics, which took over a minute, and by the end of the diagnostics collection, hyperkit usage was back down to normal (5-10% CPU).

So, not sure if the diagnostics caught the problem, and not sure if this is the same problem as the OP, but seems close enough to add to this issue rather than open a new one.

Uploaded Diagnostics ID: 7DDDF979-417F-442C-B930-719FC131249A/20201024212048

Update: happened again, this time while using Firefox: 7DDDF979-417F-442C-B930-719FC131249A/20201026181559
and again: 7DDDF979-417F-442C-B930-719FC131249A/20201027024100

@Nuru

Nuru commented Oct 29, 2020

Here is another Diagnostic showing com.docker.hyperkit using 300% CPU while containers are basically idle, but this time the high usage continued past the end of the diagnostic collection. Usually hyperkit settles down before the end of the collection. Diagnostic ID 7DDDF979-417F-442C-B930-719FC131249A/20201029104525

And another: 7DDDF979-417F-442C-B930-719FC131249A/20201101200821

@stephen-turner
Contributor

The original bug is fixed in 2.5.0.0. If you have another CPU issue that can't be explained by the CPU load inside the containers, please open a new ticket. Thanks.

@exoer

exoer commented Nov 25, 2020

On macOS, set the cached option on bind mounts, as there are some filesystem issues.

    volumes:
      - ./api/:/opt/app:cached
      - /opt/app/node_modules

Avoid bind-mounting things that should be volumes.

Do this:

    version: '3.8'

    services:
      jenkins:
        image: jenkins-ansible
        build:
          context: jenkins-ansible
          network: host
        ports:
          - "8081:8080"
          - "50000:50000"
        volumes:
          - ./config:/var/config:delegated
          - jenkins_home:/var/jenkins_home
          - /var/run/docker.sock:/var/run/docker.sock

    volumes:
      jenkins_home:

not this:

    volumes:
      - ./config:/var/config:delegated
      - ./jenkins_home:/var/jenkins_home
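
To audit a compose file for this pattern, here is a minimal sketch that classifies short-syntax volume entries like the ones above. It assumes the `[SOURCE:]TARGET[:MODE]` short form, treats a source starting with `/`, `.`, or `~` as a host path, and does not handle Windows drive-letter paths; `classify_volume` is a hypothetical helper name.

```python
def classify_volume(entry: str) -> str:
    """Classify a Compose short-syntax volume entry.

    Returns 'bind mount', 'named volume', or 'anonymous volume'.
    """
    parts = entry.split(":")
    if len(parts) == 1:
        # Only a container path was given, so the engine creates an anonymous volume.
        return "anonymous volume"
    source = parts[0]
    if source.startswith(("/", ".", "~")):
        # A host path as the source means the entry is a bind mount.
        return "bind mount"
    return "named volume"
```

With the entries above, `./jenkins_home:/var/jenkins_home` is classified as a bind mount and `jenkins_home:/var/jenkins_home` as a named volume, which is the distinction the advice rests on.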

@krebbi

krebbi commented Dec 1, 2020

I disabled Turbo Boost and got the temperature down to 65°C

Not a fix but a way to protect the hardware.

kumarunster mentioned this issue Dec 3, 2020

@vedmant

vedmant commented Dec 18, 2020

I have the same issue: my MacBook's battery is dying in just a few hours (normally it lasts 6 hours).

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Jan 17, 2021