
Fix vLLM example #465

Merged
merged 1 commit into from
Oct 14, 2023
Conversation

irfansharif
Member

@irfansharif irfansharif commented Oct 13, 2023

Fixes #463. PyTorch 2.1.0 (https://github.com/pytorch/pytorch/releases/tag/v2.1.0) was released just last week, and it's built against CUDA 12.1. The image we're using is built on CUDA 11.8, as recommended by vLLM. Previously, vLLM specified a dependency on torch>=2.0.0 and so picked up this 2.1.0 release. That was pinned back to 2.0.1 in vllm-project/vllm#1290. When picking up that SHA, however, we ran into the problem that vllm-project/vllm#1239 fixes. So for now, point to a temporary fork that includes that fix.

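The fix described above amounts to pinning torch and installing vLLM from a commit that carries both the pin and the #1239 fix. As a minimal sketch, the pinned requirements could look like the fragment below; the fork owner and commit SHA are placeholders, not the actual values used in this PR:

```text
# Pin torch to the CUDA 11.8-compatible release instead of letting
# vLLM's earlier torch>=2.0.0 constraint resolve to 2.1.0 (built on CUDA 12.1).
torch==2.0.1

# Install vLLM from a specific commit so both the torch pin
# (vllm-project/vllm#1290) and the fix from vllm-project/vllm#1239 are included.
# <fork-owner> and <commit-sha> are placeholders.
vllm @ git+https://github.com/<fork-owner>/vllm@<commit-sha>
```

Pinning to an exact commit (rather than a branch) keeps the image build reproducible until the upstream fix lands in a tagged vLLM release.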
@irfansharif irfansharif requested a review from aksh-at October 13, 2023 22:55

@modal-pr-review-automation modal-pr-review-automation bot left a comment


Auto-approved 👍. This diff qualified for automatic approval and doesn't need follow up review.

@irfansharif irfansharif merged commit b0f9499 into main Oct 14, 2023
@irfansharif irfansharif deleted the irfansharif/231013.vllm-example branch October 14, 2023 02:45
@erikbern
Contributor

Nice, thanks for fixing!

gongy pushed a commit that referenced this pull request Jan 5, 2024
Successfully merging this pull request may close these issues.

"Fast inference with vLLM (Llama 2 13B)" example is broken