
Some users still missing container metrics. #1635

Closed
dashpole opened this issue Apr 10, 2017 · 7 comments

@dashpole (Collaborator) commented Apr 10, 2017

@stensonb, @dzavalkinolx, and @andrewsykim have all reported that in cAdvisor v0.25.0 (or Kubernetes v1.5.6 or v1.6.0) they still see disappearing metrics.

To help us get to the bottom of this, please provide your OS, OS version, and cAdvisor/Kubernetes version. I suspect this is a variant of #1572, which is caused by incorrectly adding the container "/system.slice/var-lib-docker-containers-acf76f8a0cf47638f7ab7c7e033872672479017f7447f90d2d6d6d5c39bf7536-shm.mount". See #1572 for more details. Please check your logs for other containers that may have been added incorrectly.
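One quick way to check for this is to filter cAdvisor's Prometheus output for stray `*-shm.mount` container entries like the one quoted above. A hedged sketch (the port, metric name, and container hash in the sample line are assumptions, not taken from any reporter's setup):

```shell
# Filter metrics lines whose id label points at a *-shm.mount "container",
# the kind of incorrectly added entry described in #1572.
find_shm_mounts() {
  grep -oE 'id="[^"]*shm\.mount"' | sort -u
}

# In practice, feed it live output, e.g.:
#   curl -s localhost:8080/metrics | find_shm_mounts
# Demo on a sample line with a shortened, made-up container hash:
printf '%s\n' \
  'container_memory_usage_bytes{id="/system.slice/var-lib-docker-containers-abc123-shm.mount"} 0' \
  | find_shm_mounts
```

Any output from this filter suggests the #1572 variant; no output means the problem likely lies elsewhere.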

@stensonb commented:

FYI - I observed this with 1.5.6. I've subsequently upgraded to 1.6.1, and will report the error here if I see it again. So far, after 24 hours, nothing.

@dashpole (Collaborator, Author) commented:

I just realized that kubernetes/kubernetes#39477 never made it into the 1.5 branch. That is probably why people are still experiencing this...

@carlpett commented Jul 5, 2017

I think I may be seeing this? The issue I'm having is that, seemingly at random, we do not get all the containers (or at least not all their labels) on the /metrics endpoint. As an example, here I'm grepping for the container_cpu_usage_seconds_total metric and checking whether it has an image label:

# curl -s localhost:9190/metrics | grep -E 'container_cpu_usage_seconds_total.+image' | wc -l
580
# curl -s localhost:9190/metrics | grep -E 'container_cpu_usage_seconds_total.+image' | wc -l
20
# curl -s localhost:9190/metrics | grep -E 'container_cpu_usage_seconds_total.+image' | wc -l
159
# curl -s localhost:9190/metrics | grep -E 'container_cpu_usage_seconds_total.+image' | wc -l
0
# curl -s localhost:9190/metrics | grep -E 'container_cpu_usage_seconds_total.+image' | wc -l
159
# curl -s localhost:9190/metrics | grep -E 'container_cpu_usage_seconds_total.+image' | wc -l
580

(These lines were in quick succession.)
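The flapping counts in the session above can be reproduced with a small polling loop. A minimal sketch, reusing the port 9190 from the session; the one-second interval and the sample label values in the demo are assumptions:

```shell
# Count container_cpu_usage_seconds_total series that carry an image label.
count_with_image() {
  grep -cE 'container_cpu_usage_seconds_total\{[^}]*image=' || true
}

# To watch the count flap, poll the live endpoint, e.g.:
#   for i in $(seq 1 6); do
#     curl -s localhost:9190/metrics | count_with_image
#     sleep 1
#   done
# Demo on two sample lines (only the first has an image label):
printf '%s\n' \
  'container_cpu_usage_seconds_total{image="nginx:latest",id="/docker/abc"} 1.5' \
  'container_cpu_usage_seconds_total{id="/system.slice"} 0.1' \
  | count_with_image
```

If a stable workload makes the count swing between runs, as in the session above, series (or their labels) are being dropped rather than the containers actually churning.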

We're running cAdvisor as a systemd service (not in a container). Upgrading from 0.23.8 to 0.26.1 made no difference.
The OS is CentOS Linux release 7.2.1511 (Core), with Docker 17.05.0-ce. We are not using Kubernetes.

(I'm also hitting random SIGSEGVs on startup, but that seems unrelated.)

@maxramqvist commented Jul 7, 2017

Seeing the same issue, running cAdvisor 0.26.1 as a container.
Curling the Prometheus endpoint and grepping for unique container_label_image values 30 times in a row, with a one-second pause between requests, returns either 0, 8, or 18 containers, never the correct 28.
Docker version 17.05.0-ce, build 89658be
Ubuntu 16.10, kernel 4.8, x86_64

@mindw commented Jul 14, 2017

We had our metrics disappear after exactly 24 hours and tracked the issue down to the periodic 24-hour rkt-gc.timer.

  • kubelet is run natively on the host (kubelet-wrapper not used)
  • k8s 1.3.x-1.5.x
  • CoreOS 13xx.x
  • rkt was used to run one-shot containers during boot

Either disabling the rkt-gc.timer or using the kubelet-wrapper resolves the issue.
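The first workaround can be sketched as follows. This is an ops fragment, not something the reporter posted: it assumes a systemd host where the rkt-gc.timer unit exists (as on CoreOS), and that disabling a vendor timer persists on your image:

```shell
# Stop the periodic rkt garbage collection and keep it from being rescheduled.
sudo systemctl stop rkt-gc.timer
sudo systemctl disable rkt-gc.timer

# Verify it is no longer in the timer list:
systemctl list-timers --all | grep rkt-gc || echo "rkt-gc.timer not scheduled"
```

The alternative, per the comment above, is to run the kubelet via the kubelet-wrapper instead of natively.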

@zeisss commented Jul 25, 2017

We are seeing this too: cAdvisor as a systemd service, with Docker 1.13.1. No Kubernetes.

cadvisor_version_info{cadvisorRevision="d19cc94",cadvisorVersion="v0.26.1",dockerVersion="1.13.1",kernelVersion="3.16.0-4-amd64",osVersion="Debian GNU/Linux 8 (jessie)"} 1

The WebUI shows the disappearing containers just fine.

@dashpole (Collaborator, Author) commented Feb 1, 2018

Closing as outdated; this shouldn't be an issue in newer versions.

@dashpole closed this as completed Feb 1, 2018