cAdvisor is confused by /system.slice/var-lib-docker-containers-...-shm.mount cgroups and may report zero-valued stats for Docker containers #1572
Comments
@ncdc @pmorie @smarterclayton -- I think this will impact us. Need to investigate further.
I sent a patch to ignore any .mount cgroup in docker (as it should only care about the .scope cgroups).
Thanks for getting this fixed promptly! Is there going to be a cAdvisor 0.24.x or other release containing this fix in the near future?
Is kubernetes-retired/heapster#1438 the same issue?
Is there a workaround for this (from the user's perspective)?
The workaround that we used was to stop using the |
When using cAdvisor to monitor Docker container stats, cAdvisor seems to get confused by any cgroups that happen to contain an existing Docker container ID in their basename. This includes the systemd mount unit for the /var/lib/docker/containers/*/shm mountpoint, which seems to randomly result in cAdvisor returning incorrect (all-zero) stats from the /api/v1.2/docker/... endpoint.
Symptoms
The /api/v1.2/docker endpoint returns a mixture of {"/docker/*": ..., "/system.slice/var-lib-docker-containers-*-shm.mount": ...} entries. The /api/v1.2/docker/* endpoint may occasionally return zero-valued CPU and memory stats.
Details
With CoreOS 1235.6.0 + systemd 231 + Docker 1.12.3 + Linux 4.7.3-coreos-r2, the cAdvisor Docker driver seems to pick up two cgroups for each running Docker container:
In this case, /docker/acf76f8a0cf47638f7ab7c7e033872672479017f7447f90d2d6d6d5c39bf7536 is the correct cgroup, which contains the processes within the Docker container. The /system.slice/var-lib-docker-containers-acf76f8a0cf47638f7ab7c7e033872672479017f7447f90d2d6d6d5c39bf7536-shm.mount cgroup is empty, and is associated with the /var/lib/docker/containers/*/shm mountpoint.
Both of these container entries will associate themselves with the information returned by the Docker API for that container ID, and will thus have an identical set of cAdvisor aliases. Both container entries will add identical namespacedContainerName{"docker", "acf76f8a0cf47638f7ab7c7e033872672479017f7447f90d2d6d6d5c39bf7536"} entries in the github.com/google/cadvisor/manager:manager.containers map, which override each other depending on the order in which the cgroups are listed, and this may change as events come in?
The /api/v1.2/docker/ID endpoint will then semi-randomly return one of these two cgroup entries:
The /docker/... one has useful CPU/memory stats, but the /system.slice/var-lib-docker-containers-...-shm.mount one reports zero for all CPU and memory usage figures. This is probably because the /system.slice/....mount cgroup is empty, and does not contain any processes.
It seems like cAdvisor has a systemd factory which is intended to ignore the systemd mount cgroups: https://github.com/google/cadvisor/blob/v0.24.1/container/systemd/factory.go#L42
However, the manager seems to register this systemd factory after the Docker factory, and thus the Docker factory will pick up the cgroup before the systemd factory has a chance to filter it out.
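The override described above can be illustrated with a small, self-contained sketch. This is not cAdvisor's code: the namespacedName type, the register function, and the skipMounts flag are hypothetical names made up for illustration, and matching on "path contains the container ID" is only an assumption about how both cgroups end up claiming the same alias. The container ID and the two cgroup paths are the ones quoted in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// namespacedName is a hypothetical stand-in for the namespacedContainerName
// key described in the issue; it is not copied from cAdvisor.
type namespacedName struct {
	Namespace string
	Name      string
}

// exampleID and exampleCgroups are the container ID and cgroup paths quoted
// in the issue text.
const exampleID = "acf76f8a0cf47638f7ab7c7e033872672479017f7447f90d2d6d6d5c39bf7536"

var exampleCgroups = []string{
	"/docker/" + exampleID,
	"/system.slice/var-lib-docker-containers-" + exampleID + "-shm.mount",
}

// isMountUnit reports whether a cgroup path names a systemd mount unit.
func isMountUnit(cgroupPath string) bool {
	return strings.HasSuffix(cgroupPath, ".mount")
}

// register maps the Docker alias for id to a cgroup path, visiting cgroups in
// the given order. Every cgroup whose path contains id claims the same key, so
// the last one visited wins unless skipMounts filters mount units out first.
func register(cgroups []string, id string, skipMounts bool) map[namespacedName]string {
	containers := make(map[namespacedName]string)
	key := namespacedName{Namespace: "docker", Name: id}
	for _, cg := range cgroups {
		if !strings.Contains(cg, id) {
			continue
		}
		if skipMounts && isMountUnit(cg) {
			continue
		}
		containers[key] = cg
	}
	return containers
}

func main() {
	key := namespacedName{Namespace: "docker", Name: exampleID}
	reversed := []string{exampleCgroups[1], exampleCgroups[0]}

	// Without filtering, the winner depends on the listing order.
	fmt.Println(register(exampleCgroups, exampleID, false)[key])
	fmt.Println(register(reversed, exampleID, false)[key])

	// With mount units skipped, the /docker/... cgroup wins in either order.
	fmt.Println(register(exampleCgroups, exampleID, true)[key])
	fmt.Println(register(reversed, exampleID, true)[key])
}
```

Skipping .mount cgroups before they reach the shared key is the same idea as the patch mentioned in the comments, though the actual fix may be implemented differently.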
Workaround
The /api/v1.2/containers/docker/ID API will always return the actual container cgroup stats, assuming that Docker places its container cgroups directly under /docker/ID. This is not always the case.
Related issues