Commit

Merge pull request #596 from elezar/bump-version-v0.15.0-rc.2
Bump version to v0.15.0-rc.2
elezar authored Mar 16, 2024
2 parents 9b2ea06 + bcc2a47 commit e58985a
Showing 11 changed files with 24 additions and 19 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,10 +1,15 @@
## Changelog

### Version v0.15.0-rc.2
- Bump CUDA base image version to 12.3.2
- Add `cdi-cri` device list strategy. This uses the CDIDevices CRI field to request CDI devices instead of annotations.
- Set MPS memory limit by device index and not device UUID. This is a workaround for an issue where
these limits are not applied for devices if set by UUID.
- Update MPS sharing to disallow requests for multiple devices if MPS sharing is configured.
- Set mps device memory limit by index.
- Explicitly set sharing.mps.failRequestsGreaterThanOne = true.
- Run tail -f for each MPS daemon to output logs.
- Enforce replica limits for MPS sharing.

### Version v0.15.0-rc.1
- Import GPU Feature Discovery into the GPU Device Plugin repo. This means that
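
The v0.15.0-rc.2 changelog entries above state that requests for multiple devices are disallowed when MPS sharing is configured. As a rough illustration of that rule only, and not the plugin's actual Go implementation, the check could be sketched in Python as follows. The function name `validate_mps_request` and its parameters are hypothetical and do not appear in this commit.

```python
def validate_mps_request(requested: int, mps_configured: bool) -> None:
    """Raise ValueError if the request violates the one-device rule.

    Hypothetical helper: it only illustrates the changelog entry
    "disallow requests for multiple devices if MPS sharing is configured".
    """
    if requested < 1:
        raise ValueError("at least one device must be requested")
    if mps_configured and requested > 1:
        raise ValueError(
            f"requested {requested} devices, but only one device may be "
            "requested when MPS sharing is configured"
        )


# A single shared device is accepted; the call returns without raising.
validate_mps_request(1, mps_configured=True)
```

Without MPS sharing the sketch accepts any positive count, which matches the changelog wording that the restriction applies only when MPS sharing is configured.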
4 changes: 2 additions & 2 deletions deployments/helm/nvidia-device-plugin/Chart.yaml
@@ -2,8 +2,8 @@ apiVersion: v2
name: nvidia-device-plugin
type: application
description: A Helm chart for the nvidia-device-plugin on Kubernetes
- version: "0.15.0-rc.1"
- appVersion: "0.15.0-rc.1"
+ version: "0.15.0-rc.2"
+ appVersion: "0.15.0-rc.2"
kubeVersion: ">= 1.10.0-0"
home: https://github.com/NVIDIA/k8s-device-plugin

@@ -4,7 +4,7 @@ metadata:
name: gpu-feature-discovery
labels:
app.kubernetes.io/name: gpu-feature-discovery
- app.kubernetes.io/version: 0.14.2
+ app.kubernetes.io/version: 0.15.0-rc.2
app.kubernetes.io/part-of: nvidia-gpu
spec:
selector:
@@ -15,11 +15,11 @@ spec:
metadata:
labels:
app.kubernetes.io/name: gpu-feature-discovery
- app.kubernetes.io/version: 0.14.2
+ app.kubernetes.io/version: 0.15.0-rc.2
app.kubernetes.io/part-of: nvidia-gpu
spec:
containers:
- - image: nvcr.io/nvidia/k8s-device-plugin:v0.14.3
+ - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
name: gpu-feature-discovery
command: ["/usr/bin/gpu-feature-discovery"]
volumeMounts:
@@ -4,7 +4,7 @@ metadata:
name: gpu-feature-discovery
labels:
app.kubernetes.io/name: gpu-feature-discovery
- app.kubernetes.io/version: 0.14.2
+ app.kubernetes.io/version: 0.15.0-rc.2
app.kubernetes.io/part-of: nvidia-gpu
spec:
selector:
@@ -15,11 +15,11 @@ spec:
metadata:
labels:
app.kubernetes.io/name: gpu-feature-discovery
- app.kubernetes.io/version: 0.14.2
+ app.kubernetes.io/version: 0.15.0-rc.2
app.kubernetes.io/part-of: nvidia-gpu
spec:
containers:
- - image: nvcr.io/nvidia/k8s-device-plugin:v0.14.3
+ - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
name: gpu-feature-discovery
command: ["/usr/bin/gpu-feature-discovery"]
volumeMounts:
6 changes: 3 additions & 3 deletions deployments/static/gpu-feature-discovery-daemonset.yaml
@@ -4,7 +4,7 @@ metadata:
name: gpu-feature-discovery
labels:
app.kubernetes.io/name: gpu-feature-discovery
- app.kubernetes.io/version: 0.14.2
+ app.kubernetes.io/version: 0.15.0-rc.2
app.kubernetes.io/part-of: nvidia-gpu
spec:
selector:
@@ -15,11 +15,11 @@ spec:
metadata:
labels:
app.kubernetes.io/name: gpu-feature-discovery
- app.kubernetes.io/version: 0.14.2
+ app.kubernetes.io/version: 0.15.0-rc.2
app.kubernetes.io/part-of: nvidia-gpu
spec:
containers:
- - image: nvcr.io/nvidia/k8s-device-plugin:v0.14.3
+ - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
name: gpu-feature-discovery
command: ["/usr/bin/gpu-feature-discovery"]
volumeMounts:
6 changes: 3 additions & 3 deletions deployments/static/gpu-feature-discovery-job.yaml.template
@@ -4,19 +4,19 @@ metadata:
name: gpu-feature-discovery
labels:
app.kubernetes.io/name: gpu-feature-discovery
- app.kubernetes.io/version: 0.14.2
+ app.kubernetes.io/version: 0.15.0-rc.2
app.kubernetes.io/part-of: nvidia-gpu
spec:
template:
metadata:
labels:
app.kubernetes.io/name: gpu-feature-discovery
- app.kubernetes.io/version: 0.14.2
+ app.kubernetes.io/version: 0.15.0-rc.2
app.kubernetes.io/part-of: nvidia-gpu
spec:
nodeName: NODE_NAME
containers:
- - image: nvcr.io/nvidia/k8s-device-plugin:v0.14.3
+ - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
name: gpu-feature-discovery
command: ["/usr/bin/gpu-feature-discovery"]
args:
@@ -38,7 +38,7 @@ spec:
# See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
priorityClassName: "system-node-critical"
containers:
- - image: nvcr.io/nvidia/k8s-device-plugin:v0.14.4
+ - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
name: nvidia-device-plugin-ctr
env:
- name: FAIL_ON_INIT_ERROR
@@ -124,7 +124,7 @@ spec:
- env:
- name: PASS_DEVICE_SPECS
value: "true"
- image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.1
+ image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
name: nvidia-device-plugin-ctr
securityContext:
privileged: true
2 changes: 1 addition & 1 deletion deployments/static/nvidia-device-plugin.yml
@@ -38,7 +38,7 @@ spec:
# See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
priorityClassName: "system-node-critical"
containers:
- - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.1
+ - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
name: nvidia-device-plugin-ctr
env:
- name: FAIL_ON_INIT_ERROR
2 changes: 1 addition & 1 deletion nvidia-device-plugin.yml
@@ -38,7 +38,7 @@ spec:
# See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
priorityClassName: "system-node-critical"
containers:
- - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.1
+ - image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
name: nvidia-device-plugin-ctr
env:
- name: FAIL_ON_INIT_ERROR
2 changes: 1 addition & 1 deletion versions.mk
@@ -17,7 +17,7 @@ MODULE := github.com/NVIDIA/$(DRIVER_NAME)

REGISTRY ?= nvcr.io/nvidia

- VERSION ?= v0.15.0-rc.1
+ VERSION ?= v0.15.0-rc.2

# vVERSION represents the version with a guaranteed v-prefix
vVERSION := v$(VERSION:v%=%)
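
The `vVERSION` assignment in this hunk uses a GNU Make substitution reference to strip a leading `v` from `VERSION` and then prepend one, so the result carries a `v` prefix whether or not the input had it. A minimal Python sketch of the same normalization, assuming `VERSION` holds a single word (the helper name `with_v_prefix` is hypothetical and not part of the repository):

```python
def with_v_prefix(version: str) -> str:
    """Mimic `v$(VERSION:v%=%)` for a single-word version string.

    Strips one leading "v" if present, then prepends "v" again.
    """
    bare = version[1:] if version.startswith("v") else version
    return "v" + bare


# Both spellings normalize to the same tag.
print(with_v_prefix("v0.15.0-rc.2"))
print(with_v_prefix("0.15.0-rc.2"))
```

Stripping before prepending is what keeps an already-prefixed value such as `v0.15.0-rc.2` from turning into `vv0.15.0-rc.2`.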