This repository has been archived by the owner on Feb 15, 2025. It is now read-only.

chore: update vllm to use gptq quantized model #378

Merged
merged 3 commits into main from 357-update-vllm-to-gptq-quantized-model
Apr 10, 2024

Conversation

YrrepNoj
Member

This PR changes the default model that the vLLM container uses.

This PR captures the work completed in defenseunicorns/leapfrogai-backend-vllm:PR#21 that had not yet been migrated over to the monorepo.
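
For context, here is a minimal sketch of how a vLLM backend can load a GPTQ-quantized model, using vLLM's public Python API. The model name below is a hypothetical example, not the default actually set by this PR:

```python
from vllm import LLM, SamplingParams

# Hypothetical GPTQ-quantized model name; the actual default configured
# by this PR lives in the vLLM backend container and is not shown here.
llm = LLM(
    model="TheBloke/Synthia-7B-v2.0-GPTQ",  # assumed example model
    quantization="gptq",  # tell vLLM the weights are GPTQ-quantized
)

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Why quantize model weights?"], params)
print(outputs[0].outputs[0].text)
```

GPTQ quantization stores weights at roughly 4-bit precision, which substantially lowers the GPU memory required to serve the model at a small cost in accuracy.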

@YrrepNoj YrrepNoj requested a review from a team as a code owner April 10, 2024 19:07

netlify bot commented Apr 10, 2024

Deploy Preview for leapfrogai-docs canceled.

🔨 Latest commit: 3e85b92
🔍 Latest deploy log: https://app.netlify.com/sites/leapfrogai-docs/deploys/6616ed310bf7eb000828ed81

@gerred gerred merged commit dc1029d into main Apr 10, 2024
7 checks passed
@gerred gerred deleted the 357-update-vllm-to-gptq-quantized-model branch April 10, 2024 22:19
andrewrisse pushed a commit that referenced this pull request Apr 17, 2024
* chore: update vllm to use gptq quantized model

* bug: fix catch-all wildcard for e2e workflow

Signed-off-by: Andrew Risse <andrewrisse@gmail.com>
andrewrisse pushed a commit that referenced this pull request Apr 18, 2024
* chore: update vllm to use gptq quantized model

* bug: fix catch-all wildcard for e2e workflow