Add NVIDIA_VISIBLE_DEVICES env var
gjulianm committed Feb 28, 2025
1 parent 3edc105 commit 6c283a7
Showing 5 changed files with 17 additions and 3 deletions.
4 changes: 4 additions & 0 deletions charts/datadog/CHANGELOG.md
@@ -1,5 +1,9 @@
# Datadog changelog

+## 3.98.2
+
+* Add the `NVIDIA_VISIBLE_DEVICES` environment variable to the containers when GPU monitoring is enabled, as it might be needed if the NVIDIA device plugin does not have `accept-nvidia-visible-devices-as-volume-mount` enabled.
+
## 3.98.1

* Fixes bug that causes `DD_KUBERNETES_ANNOTATIONS_AS_TAGS` env var to be incorrectly set to the merged value of `.Values.datadog.kubernetesResourcesLabelsAsTags` and `.Values.datadog.kubernetesResourcesAnnotationsAsTags`.
2 changes: 1 addition & 1 deletion charts/datadog/Chart.yaml
@@ -1,7 +1,7 @@
---
apiVersion: v1
name: datadog
-version: 3.98.1
+version: 3.98.2
appVersion: "7"
description: Datadog Agent
keywords:
2 changes: 1 addition & 1 deletion charts/datadog/README.md
@@ -1,6 +1,6 @@
# Datadog

-![Version: 3.98.1](https://img.shields.io/badge/Version-3.98.1-informational?style=flat-square) ![AppVersion: 7](https://img.shields.io/badge/AppVersion-7-informational?style=flat-square)
+![Version: 3.98.2](https://img.shields.io/badge/Version-3.98.2-informational?style=flat-square) ![AppVersion: 7](https://img.shields.io/badge/AppVersion-7-informational?style=flat-square)

[Datadog](https://www.datadoghq.com/) is a hosted infrastructure monitoring platform. This chart adds the Datadog Agent to all nodes in your cluster via a DaemonSet. It also optionally depends on the [kube-state-metrics chart](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-state-metrics). For more information about monitoring Kubernetes with Datadog, please refer to the [Datadog documentation website](https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/).

7 changes: 6 additions & 1 deletion charts/datadog/templates/_container-agent.yaml
@@ -165,7 +165,7 @@
value: {{ .Values.datadog.checksCardinality | quote }}
{{- end }}
- name: DD_CONTAINER_LIFECYCLE_ENABLED
-value: {{ .Values.datadog.containerLifecycle.enabled | quote | default "true" }}
+value: {{ .Values.datadog.containerLifecycle.enabled | quote | default "true" }}
- name: DD_ORCHESTRATOR_EXPLORER_ENABLED
value: {{ (include "should-enable-k8s-resource-monitoring" .) | quote }}
- name: DD_EXPVAR_PORT
@@ -205,6 +205,11 @@
- name: DD_OTELCOLLECTOR_ENABLED
value: "true"
{{- end }}
+{{- if .Values.datadog.gpuMonitoring.enabled }}
+# depending on the NVIDIA container toolkit configuration, we might need to request visible devices via this env var or via the /var/run/nvidia-container-devices/all volume mount
+- name: NVIDIA_VISIBLE_DEVICES
+value: all
+{{- end }}
{{- include "additional-env-entries" .Values.agents.containers.agent.env | indent 4 }}
{{- include "additional-env-dict-entries" .Values.agents.containers.agent.envDict | indent 4 }}
volumeMounts:
5 changes: 5 additions & 0 deletions charts/datadog/templates/_container-system-probe.yaml
@@ -25,6 +25,11 @@
- name: HOST_ROOT
value: "/host/root"
{{- end }}
+{{- if .Values.datadog.gpuMonitoring.enabled }}
+# depending on the NVIDIA container toolkit configuration, we might need to request visible devices via this env var or via the /var/run/nvidia-container-devices/all volume mount
+- name: NVIDIA_VISIBLE_DEVICES
+value: all
+{{- end }}
{{- include "additional-env-entries" .Values.agents.containers.systemProbe.env | indent 4 }}
{{- include "additional-env-dict-entries" .Values.agents.containers.systemProbe.envDict | indent 4 }}
resources:
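
For reference, the new block in both templates is gated on `.Values.datadog.gpuMonitoring.enabled`. A minimal values override that would trigger it might look like the sketch below. Only the `datadog.gpuMonitoring.enabled` key is taken from the diff; the file name `values.yaml` is simply the conventional choice and is an assumption here.

```yaml
# values.yaml (sketch): turn on GPU monitoring so the chart
# adds NVIDIA_VISIBLE_DEVICES to the agent and system-probe containers
datadog:
  gpuMonitoring:
    enabled: true
```

With that value set, the agent and system-probe containers should each render an extra env entry along these lines. The indentation is illustrative, since it depends on where the container template is included.

```yaml
# rendered fragment (sketch): entry added under each container's env list
env:
  - name: NVIDIA_VISIBLE_DEVICES
    value: all
```

When the flag is unset or false, the `if` block renders nothing and the containers' env lists are unchanged.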
