Bump version to v0.15.0-rc.2 #596
- Bump CUDA base image version to 12.3.2
- Add `cdi-cri` device list strategy. This uses the CDIDevices CRI field to request CDI devices instead of annotations.
- Set MPS memory limit by device index and not device UUID. This is a workaround for an issue where
  these limits are not applied for devices if set by UUID.
- Update MPS sharing to disallow requests for multiple devices if MPS sharing is configured.
- Enforce replica limits for MPS sharing.
Is this really all that changed?
❯ git log --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit -90 |grep -v Merge |grep -v Bump
11c7131e - Enforce maximum MPS replicas (25 hours ago) <Evan Lezar>
b3218699 - Run tail -f for each MPS daemon to output logs (2 days ago) <Evan Lezar>
93f161cc - Add mig-strategy flag to mps-control-daemon (2 days ago) <Evan Lezar>
4f3e4a58 - Cleanup log dir on stop (2 days ago) <Evan Lezar>
95be0832 - Explicitly set sharing.mps.failRequestsGreaterThanOne = true (2 days ago) <Evan Lezar>
13cf3b4c - Change validation logic for MPS sharing (2 days ago) <Evan Lezar>
d5f33b7d - Factor out allocate request validation (2 days ago) <Evan Lezar>
5edd66f6 - Set mps device memory limit by index (2 days ago) <Evan Lezar>
7302a18e - (origin/badge, badge) Add Status badges (8 days ago) <Carlos Eduardo Arango Gutierrez>
f3586af5 - (upstream/e2eactions, e2eactions) Add e2e github action (8 days ago) <Carlos Eduardo Arango Gutierrez>
00f34100 - Add dependabot config to update actions for gh-pages (9 days ago) <Evan Lezar>
6eb8d576 - Fix GitHub staging registry (9 days ago) <Evan Lezar>
dc38950a - Add cdi-cri device list strategy (11 days ago) <Evan Lezar>
cfcdcceb - Refactor label output (2 weeks ago) <Evan Lezar>
ca84c1b3 - Replace k8s-client.go with client sets (2 weeks ago) <Evan Lezar>
0727338f - Move use-node-feature-api to config structs (2 weeks ago) <Evan Lezar>
362d1d0f - Update nvidia-container-toolkit instructions (2 weeks ago) <Evan Lezar>
93b39b67 - (upstream/clean-go-mod) clean up replace directives in go.mod (2 weeks ago) <Tariq Ibrahim>
516945ce - TOFIX: Allow go mod tidy to set go version (2 weeks ago) <Evan Lezar>
2dbf357c - Add vendor check to actions (2 weeks ago) <Evan Lezar>
9e9fb58a - Extract GOLANG_VERSION from versions.mk (2 weeks ago) <Evan Lezar>
9cfdf86a - (origin/operations-per-run, operations-per-run) Increase operations-per-run on stale action (2 weeks ago) <Carlos Eduardo Arango Gutierrez>
ffb4b015 - Update github.com/mittwald/go-helm-client to v0.12.8 (2 weeks ago) <Evan Lezar>
86d94ffd - (origin/lifecyce, lifecyce) Edit stale message (2 weeks ago) <Carlos Eduardo Arango Gutierrez>
bc818aac - Use github image as staging image (3 weeks ago) <Evan Lezar>
4ac55b89 - Remove deprecated extensions/v1beta1 static deployment (3 weeks ago) <Evan Lezar>
6e209be5 - Update k8s.io/kubernetes to v1.28.7 (3 weeks ago) <Evan Lezar>
637326e1 - Fix typo on label name at stale action def (3 weeks ago) <Carlos Eduardo Arango Gutierrez>
cb9f53a2 - (origin/go-version-file, go-version-file) Update golang gh-Action to use go-version-file (3 weeks ago) <Carlos Eduardo Arango Gutierrez>
cb3949df - (origin/ghaction-stale, ghaction-stale) Add actions/stale gh-action (3 weeks ago) <Carlos Eduardo Arango Gutierrez>
778f0740 - Add changelog for v0.15.0-rc.1 (3 weeks ago) <Evan Lezar>
05897874 - Add MPS sharing section to README (3 weeks ago) <Evan Lezar>
These are the non-merge, non-dependabot commits since rc1.
I'd say these are the relevant ones to include then:
11c7131e - Enforce maximum MPS replicas (25 hours ago) <Evan Lezar>
b3218699 - Run tail -f for each MPS daemon to output logs (2 days ago) <Evan Lezar>
95be0832 - Explicitly set sharing.mps.failRequestsGreaterThanOne = true (2 days ago) <Evan Lezar>
5edd66f6 - Set mps device memory limit by index (2 days ago) <Evan Lezar>
You're right. I have updated it.
As a side note, it would be good to generate this automatically if possible.
There's a way to do it in the GitHub UI. I'll show you next week.
I know about the generated release notes for releases; I mean updating the changelog.
But maybe the issue is that I try to keep it up to date while making the changes.
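One way the changelog step could be scripted is sketched below. It reuses the filter from the `git log` command earlier in this thread (drop merge commits and `Bump` commits) and prefixes each remaining subject with a changelog bullet. The function name `changelog_bullets` is hypothetical and not part of the repository, and the output would still need manual trimming to the user-facing changes.

```shell
# Hypothetical helper (not part of the repository): read commit subjects,
# one per line, on stdin and print them as changelog bullets.
# Subjects starting with "Merge " or "Bump " are skipped, mirroring the
# `grep -v Merge | grep -v Bump` filter used earlier in this thread.
changelog_bullets() {
  grep -v -e '^Merge ' -e '^Bump ' | sed 's/^/- /'
}
```

Assuming the previous release candidate is tagged, this could be fed with something like `git log --no-merges --pretty=format:'%s' <previous-tag>..HEAD | changelog_bullets`, where `<previous-tag>` is a placeholder for that tag. Note that the anchored `^Bump ` pattern also drops hand-written entries such as a base image bump, so those would have to be added back by hand.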
a391799 to eddf39f
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
The static deployments were using inconsistent versions. This change updates them all to v0.15.0-rc.2.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
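A check along the following lines could catch this kind of version drift before a release. It is only a sketch: the function name `mismatched_image_tags` is hypothetical, and it assumes each manifest declares its image on a plain `image: <name>:<tag>` line with no quotes or trailing comments.

```shell
# Hypothetical helper (not part of the repository): read manifest text on
# stdin and print every image reference whose tag differs from the expected
# version given as $1. Prints nothing when all tags agree.
mismatched_image_tags() {
  awk -v want="$1" '
    # Match "image: ref" as well as the list-item form "- image: ref".
    $1 == "image:" || ($1 == "-" && $2 == "image:") {
      ref = $NF                    # last field is the image reference
      n = split(ref, parts, ":")   # the tag is the part after the last colon
      if (n < 2 || parts[n] != want) print ref
    }'
}
```

For example, `cat <static-manifests> | mismatched_image_tags v0.15.0-rc.2` would list any image still pinned to another tag, where `<static-manifests>` stands for wherever the static deployment files live.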
53302eb to bcc2a47