feat: Automate the adapters manifests #463
**Reason for Change**: Add adapters to the inference API and update the required package.

**Requirements**
- [ ] added unit tests and e2e tests (if applicable).

**Issue Fixed**:

**Notes for Reviewers**:

Signed-off-by: Bangqi Zhu <bangqizhu@microsoft.com>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #463 +/- ##
==========================================
- Coverage 61.09% 60.04% -1.05%
==========================================
Files 29 29
Lines 2303 2553 +250
==========================================
+ Hits 1407 1533 +126
- Misses 828 926 +98
- Partials 68 94 +26
examples/inference/kaito_workspace_falcon_7b_with_adapters.yaml
Outdated
@@ -56,13 +57,14 @@ var (
  tolerations = []corev1.Toleration{
  {
  Effect: corev1.TaintEffectNoSchedule,
- Operator: corev1.TolerationOpEqual,
- Key: resources.GPUString,
+ Operator: corev1.TolerationOpExists,
Why change from `corev1.TolerationOpEqual` to `corev1.TolerationOpExists`?
The original toleration differed from the one in the YAML file, so the adapter could not find the node.
- Operator: corev1.TolerationOpEqual,
- Key: resources.GPUString,
+ Operator: corev1.TolerationOpExists,
+ Key: resources.CapacityNvidiaGPU,
Similar question: why change to `resources.CapacityNvidiaGPU` here? Ideally these tolerations should not need to change...
**Reason for Change**:

**Requirements**
- [ ] added unit tests and e2e tests (if applicable).

**Issue Fixed**:

**Notes for Reviewers**:

---------

Signed-off-by: Bangqi Zhu <bangqizhu@microsoft.com>
Co-authored-by: Bangqi Zhu <bangqizhu@microsoft.com>