Fix kubernetes.memory.limits on kind clusters #11914

Merged
merged 1 commit into from
May 12, 2022

L3n41c
Member

@L3n41c L3n41c commented May 2, 2022

What does this PR do?

When the kubelet's cadvisor endpoint exposes the container_spec_memory_limit_bytes metric twice for the same container, do not sum the duplicate values.

Motivation

On kind clusters, the kubernetes.memory.limits metric reported by the agent is currently twice the real pod memory limit.
The kubernetes.cpu.limits, kubernetes.cpu.requests and kubernetes.memory.requests metrics are all correct.

The root cause of this wrong value is that the cadvisor endpoint of the kind kubelet reports the memory limit twice for a given container.

For example, with the following manifest (from DataDog/datadog-agent#10508):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-prepared
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-prepared
  template:
    metadata:
      labels:
        app: nginx-prepared
    spec:
      containers:
      - image: nginx:1.7.9
        name: nginx-prepared
        resources:
          limits:
            cpu: 200m
            memory: 20Mi
          requests:
            cpu: 100m
            memory: 10Mi
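For reference, the 20Mi memory limit in this manifest is 20971520 bytes, which cadvisor exposes as 2.097152e+07. The doubled value is therefore 41943040 bytes. A minimal sketch of that arithmetic follows; `quantity_to_bytes` is a hypothetical helper written for this illustration, not part of the kubelet check, and it only handles a few binary suffixes.

```python
# Hypothetical helper for illustration only: converts a Kubernetes
# binary-suffixed quantity (Ki, Mi, Gi) to a number of bytes.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def quantity_to_bytes(quantity: str) -> int:
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    # No recognized suffix: assume the quantity is already in bytes.
    return int(quantity)


limit_bytes = quantity_to_bytes("20Mi")
print(limit_bytes)      # 20971520, i.e. 2.097152e+07
print(limit_bytes * 2)  # 41943040, the doubled value
```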

On kind, the kubelet cadvisor endpoint returns:

root@datadog-agent-linux-k9c4p:/# curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -k -H "Authorization: Bearer $(</var/run/secrets/kubernetes.io/serviceaccount/token)" https://$DD_KUBERNETES_KUBELET_HOST:10250/metrics/cadvisor | grep memory_limit | grep nginx | grep -v pause
container_spec_memory_limit_bytes{container="",id="/kubelet/kubepods/burstable/podd8cfdf76-9f01-4cab-8e8f-d1f6beda67f0",image="",name="",namespace="nginx-preloader-sample",pod="nginx-prepared-6f85667d88-msx5q"} 2.097152e+07
container_spec_memory_limit_bytes{container="nginx-prepared",id="/docker/6e9d93d4629c147c9f17e64c531efc2ebe78e94954ba11407bf38552670a2ac5/kubelet/kubepods/burstable/podd8cfdf76-9f01-4cab-8e8f-d1f6beda67f0/60ccc8b7bcfdcb5bda15b455aac0505b18a363b3ad4e918dcd113ecaae9ebafd",image="docker.io/library/nginx:1.7.9",name="60ccc8b7bcfdcb5bda15b455aac0505b18a363b3ad4e918dcd113ecaae9ebafd",namespace="nginx-preloader-sample",pod="nginx-prepared-6f85667d88-msx5q"} 2.097152e+07
container_spec_memory_limit_bytes{container="nginx-prepared",id="/kubelet/kubepods/burstable/podd8cfdf76-9f01-4cab-8e8f-d1f6beda67f0/60ccc8b7bcfdcb5bda15b455aac0505b18a363b3ad4e918dcd113ecaae9ebafd",image="docker.io/library/nginx:1.7.9",name="60ccc8b7bcfdcb5bda15b455aac0505b18a363b3ad4e918dcd113ecaae9ebafd",namespace="nginx-preloader-sample",pod="nginx-prepared-6f85667d88-msx5q"} 2.097152e+07

Whereas, for the same pod definition, the cadvisor endpoint of a GKE kubelet returns:

 lenaic.huard:~$ kubectl exec pod/datadog-agent-linux-qh7gp agent -c agent -- bash -c 'curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -k -H "Authorization: Bearer $(</var/run/secrets/kubernetes.io/serviceaccount/token)" https://$DD_KUBERNETES_KUBELET_HOST:10250/metrics/cadvisor' | grep memory_limit | grep nginx | grep -v pause
container_spec_memory_limit_bytes{container="",id="/kubepods/burstable/pod5a8ed807-7a47-4bf2-9cbe-d624ba05fd88",image="",name="",namespace="nginx-preloader-sample",pod="nginx-prepared-6697ccc84-8wc6x"} 2.097152e+07
container_spec_memory_limit_bytes{container="nginx-prepared",id="/kubepods/burstable/pod5a8ed807-7a47-4bf2-9cbe-d624ba05fd88/ad131b04a5d33d4396bb455f2b66d0ff5a13c0b855770b2c396d28c4ab959a34",image="nginx@sha256:e3456c851a152494c3e4ff5fcc26f240206abac0c9d794affb40e0714846c451",name="k8s_nginx-prepared_nginx-prepared-6697ccc84-8wc6x_nginx-preloader-sample_5a8ed807-7a47-4bf2-9cbe-d624ba05fd88_0",namespace="nginx-preloader-sample",pod="nginx-prepared-6697ccc84-8wc6x"} 2.097152e+07

As shown above, on kind, the same value is reported twice for the same (namespace, pod, container) triplet: once under a cgroup id prefixed with /docker/<node container id> and once without that prefix.
Those two identical values are currently summed at

samples = self._sum_values_by_context(metric, self._get_entity_id_if_container_metric)

This sum results in the agent sending twice the expected value.
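The effect can be reproduced with a simplified sketch. It assumes samples are plain (labels, value) pairs keyed by the (namespace, pod, container) triplet; the real check keys on a container entity id, and `sum_by_context` and `last_value_by_context` are hypothetical stand-ins rather than the functions in the kubelet check. The cgroup ids are abbreviated with "..." for readability.

```python
from collections import defaultdict

# The three kind series from the cadvisor output, with ids abbreviated.
KIND_SAMPLES = [
    ({"namespace": "nginx-preloader-sample",
      "pod": "nginx-prepared-6f85667d88-msx5q",
      "container": "",
      "id": "/kubelet/kubepods/burstable/pod..."}, 2.097152e+07),
    ({"namespace": "nginx-preloader-sample",
      "pod": "nginx-prepared-6f85667d88-msx5q",
      "container": "nginx-prepared",
      "id": "/docker/.../kubelet/kubepods/burstable/pod.../..."}, 2.097152e+07),
    ({"namespace": "nginx-preloader-sample",
      "pod": "nginx-prepared-6f85667d88-msx5q",
      "container": "nginx-prepared",
      "id": "/kubelet/kubepods/burstable/pod.../..."}, 2.097152e+07),
]


def context_key(labels):
    return (labels["namespace"], labels["pod"], labels["container"])


def sum_by_context(samples):
    """Sum every series sharing a context: duplicates are counted twice."""
    totals = defaultdict(float)
    for labels, value in samples:
        if not labels["container"]:
            continue  # pod-level cgroup, not a container
        totals[context_key(labels)] += value
    return dict(totals)


def last_value_by_context(samples):
    """Keep one value per context: duplicates collapse to a single value."""
    values = {}
    for labels, value in samples:
        if not labels["container"]:
            continue  # pod-level cgroup, not a container
        values[context_key(labels)] = value
    return values


print(sum_by_context(KIND_SAMPLES))         # value is 41943040.0 (doubled)
print(last_value_by_context(KIND_SAMPLES))  # value is 20971520.0 (expected)
```

Keeping a single value per context is only safe for metrics such as limits, where duplicate series describe the same container; it would be wrong for metrics that are legitimately split across several series.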

Additional Notes

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

@L3n41c L3n41c requested review from a team as code owners May 2, 2022 14:04
@codecov
codecov bot commented May 2, 2022

Codecov Report

Merging #11914 (30484c8) into master (1a0d5a7) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Flag Coverage Δ
kubelet 90.62% <100.00%> (+0.04%) ⬆️

Successfully merging this pull request may close these issues.

The kubernetes.memory.limits is incorrect when K8s cluster is in KIND(Kubernetes IN Docker) environment