docs: add doc for inference (#514)
**Reason for Change**:
Add docs showing how to run inference on models, including examples for
models with adapters

**Requirements**

- [ ] added unit tests and e2e tests (if applicable).

**Issue Fixed**:
<!-- If this PR fixes GitHub issue 4321, add "Fixes #4321" to the next
line. -->

**Notes for Reviewers**:

Signed-off-by: Bangqi Zhu <bangqizhu@microsoft.com>
Co-authored-by: Bangqi Zhu <bangqizhu@microsoft.com>
bangqipropel and Bangqi Zhu authored Jul 12, 2024
1 parent e5991b0 commit 029d11f
Showing 2 changed files with 65 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -86,7 +86,7 @@ The detailed usage for Kaito supported models can be found in [**HERE**](presets

The number of the supported models in Kaito is growing! Please check [this](./docs/How-to-add-new-models.md) document to see how to add a new supported model.

- In version v0.3.0, Kaito starts to support model fine-tuning and using fine-tuned adapters in inference service. Please check this [document](./docs/tuning/README.md).
+ Starting with version v0.3.0, Kaito supports model fine-tuning and using fine-tuned adapters in the inference service. Refer to the [tuning document](./docs/tuning/README.md) and [inference document](./docs/inference/README.md) for more information.

## FAQ

64 changes: 64 additions & 0 deletions docs/inference/README.md
@@ -0,0 +1,64 @@
# Kaito Inference Workspace API

This guide provides instructions on how to use the Kaito Inference Workspace API for basic model serving and serving with LoRA adapters.

## Getting Started

To use the Kaito Inference Workspace API, you need to define a Workspace custom resource (CR). Below are examples of how to define the CR and its various components.

## Example Workspace Definitions
Here are three examples of using the API to define a workspace for inferencing different models:

Example 1: Inferencing [`phi-3-mini`](../../examples/inference/kaito_workspace_phi_3.yaml)

Example 2: Inferencing [`falcon-7b`](../../examples/inference/kaito_workspace_falcon_7b.yaml) without adapters

Example 3: Inferencing `falcon-7b` with adapters

```yaml
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b"
  adapters:
    - source:
        name: "falcon-7b-adapter"
        image: "<YOUR_IMAGE>"
      strength: "0.2"
```
Multiple adapters can be added:
```yaml
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b"
  adapters:
    - source:
        name: "falcon-7b-adapter"
        image: "<YOUR_IMAGE>"
      strength: "0.2"
    - source:
        name: "additional-source"
        image: "<YOUR_ADDITIONAL_IMAGE>"
      strength: "0.5"
```
Currently, only images are supported as adapter sources. If `strength` is not specified for an adapter, it defaults to "1.0".
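
When a workspace needs several adapters, it can be convenient to generate the manifest rather than write it by hand. The following is a minimal Python sketch, not part of Kaito: `build_adapter` and `build_workspace` are hypothetical helpers, and the field layout is assumed to match the examples above. An adapter created without a strength is given the documented default of "1.0" explicitly, which mirrors the stated default rather than relying on it.

```python
import json

# Default adapter strength as stated in this document.
DEFAULT_STRENGTH = "1.0"


def build_adapter(name, image, strength=None):
    """Return one entry for the `adapters` list (hypothetical helper)."""
    return {
        "source": {"name": name, "image": image},
        # Strength is kept as a string, matching the quoted values in the YAML examples.
        "strength": DEFAULT_STRENGTH if strength is None else str(strength),
    }


def build_workspace(name, instance_type, app_label, preset, adapters):
    """Assemble a Workspace manifest shaped like the YAML examples (assumed layout)."""
    return {
        "apiVersion": "kaito.sh/v1alpha1",
        "kind": "Workspace",
        "metadata": {"name": name},
        "resource": {
            "instanceType": instance_type,
            "labelSelector": {"matchLabels": {"apps": app_label}},
        },
        "inference": {
            "preset": {"name": preset},
            "adapters": adapters,
        },
    }


workspace = build_workspace(
    name="workspace-falcon-7b",
    instance_type="Standard_NC12s_v3",
    app_label="falcon-7b",
    preset="falcon-7b",
    adapters=[
        build_adapter("falcon-7b-adapter", "<YOUR_IMAGE>", "0.2"),
        # No strength given, so the documented default "1.0" is filled in.
        build_adapter("additional-source", "<YOUR_ADDITIONAL_IMAGE>"),
    ],
)

# Emit JSON, which `kubectl apply -f` accepts in place of YAML.
print(json.dumps(workspace, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, after replacing the placeholder image names with real adapter images.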
