standard docs for templates
musoles committed Sep 24, 2024
1 parent e99a046 commit b9e7616
Showing 2 changed files with 19 additions and 1 deletion.
4 changes: 4 additions & 0 deletions templates/dummy/README.md
@@ -6,6 +6,10 @@ This is an example implementation of a Kalavai template. Use it to bootstrap the

None

## Key template variables

- `deployment_name`: Name of the deployment job

## How to use

```bash
```
16 changes: 15 additions & 1 deletion templates/vllm/README.md
@@ -6,8 +6,22 @@ Deploy LLM models across multiple worker nodes using the vLLM library.

This template makes heavy use of the [vLLM library](https://docs.vllm.ai/en/latest/index.html).

## Key template variables

- `num_workers`: Number of workers per deployment. Used for tensor parallelism, i.e. the number of pieces the model is divided into
- `model_id`: ID of the model to load from the [Hugging Face Hub](https://huggingface.co/models)
- `hf_token`: Hugging Face access token, required to load licensed (gated) model weights
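
As a rough illustration of what `num_workers` implies, the sketch below lists the worker counts that split a model's attention heads evenly (vLLM's tensor parallelism requires the head count to be divisible by the tensor-parallel size) and estimates the weight memory each worker would hold. It assumes `num_workers` maps directly to the tensor-parallel size, which this README does not state explicitly. The helper names, head count and parameter count are hypothetical, and the estimate ignores KV cache and activation memory.

```python
def valid_worker_counts(num_attention_heads: int, max_workers: int) -> list[int]:
    """Worker counts up to max_workers that divide the attention heads evenly."""
    return [n for n in range(1, max_workers + 1) if num_attention_heads % n == 0]


def weight_gb_per_worker(num_params: float, num_workers: int, bytes_per_param: int = 2) -> float:
    """Approximate weight memory per worker in GB (2 bytes per parameter = fp16/bf16)."""
    return num_params * bytes_per_param / num_workers / 1e9


if __name__ == "__main__":
    # Hypothetical model: 32 attention heads, 7 billion parameters.
    print(valid_worker_counts(32, 8))    # [1, 2, 4, 8]
    print(weight_gb_per_worker(7e9, 2))  # 7.0
```

A value of `num_workers` outside the list returned by `valid_worker_counts` would likely be rejected when the model is loaded, so it may be worth checking the model's configuration before editing the template values.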


## How to use

Get the default values, edit them, and deploy:
```bash
kalavai job defaults vllm > values.yaml
# edit values.yaml as required
kalavai job run vllm --values-path values.yaml
```

Find the URL endpoint of the model with:

```bash
```

@@ -46,4 +60,4 @@ client = OpenAI(

```python
completion = client.completions.create(model="facebook/opt-350m",
                                       prompt="San Francisco is a")
print("Completion result:", completion)
```
