llama-cpp: add support for Vulkan backend #287554
Conversation
I'd wait to land this until ggerganov/llama.cpp#5311 lands, so that there aren't the awkward Darwin checks. More broadly speaking, Vulkan support is aiming to be the new "default" llama.cpp backend, superseding the OpenCL backend. (At least, that's what I understand from @0cc4m's PRs and comments.) If that comes to be, I expect the OpenCL backend to diminish and get relegated to a non-default state, and the Vulkan backend to be promoted to default. That moment hasn't arrived, but I'm looking forward to when it does.
I don't mind either way. The one thing that's not properly marked as broken right now are Darwin builds on ARM with Vulkan enabled. Darwin on Intel is marked as broken in any case, but someone could also check whether that actually starts working after Mac support is merged upstream and we change the package accordingly. I will not do either of those things myself since I don't have any Macs at home.
I'm integrating the upstream flake changes in #289513.
In general I don't mind, but I have some concerns about the current state of your PR which I will mention in a comment there.
I agree that pursuing #289513 is the better path.
Ok, I will close this then.
Description of changes
This adds support for the newly added Vulkan backend for llama-cpp, implemented in the same way as ggerganov/llama.cpp#5173.
The one thing that is tricky about this is the `vulkanSupport` flag that I added to the function arguments alongside the existing arguments `cudaSupport`, `rocmSupport`, `openclSupport`, `blasSupport` and `metalSupport`, which raises two questions for me.
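For orientation, here is a minimal sketch of how I think of the new flag sitting next to the existing arguments. The CMake define (`LLAMA_VULKAN`), the Vulkan inputs and the defaults shown are assumptions for illustration, not a verbatim copy of the diff:

```nix
# Sketch only: the new vulkanSupport argument next to the existing backend
# flags. The CMake define, the extra buildInputs and the defaults are
# assumptions, not the exact change in this PR.
{ lib
, stdenv
, cmake
, vulkan-headers
, vulkan-loader
, cudaSupport ? false
, rocmSupport ? false
, openclSupport ? false
, blasSupport ? true
, metalSupport ? false
, vulkanSupport ? false  # new in this PR
, ...
}:

stdenv.mkDerivation {
  # ... pname, version, src and the other backends' wiring unchanged ...

  buildInputs = lib.optionals vulkanSupport [ vulkan-headers vulkan-loader ];

  cmakeFlags = lib.optionals vulkanSupport [ "-DLLAMA_VULKAN=ON" ];
}
```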
Does using `vulkanSupport` directly work for all users?

I am not sure if we would expect people to set `cudaSupport` or `rocmSupport` at the same time as `vulkanSupport`, and whether the result would/should be sane if it's passed to `llama-cpp` like that, or whether the package should do some prioritization and compile with the faster vendor-specific API only when it is set alongside Vulkan. The upstream flake has separate outputs for each backend.
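To make the prioritization question concrete, one option (purely a sketch, not something this PR implements) would be to reject ambiguous flag combinations at evaluation time rather than silently picking a winner:

```nix
# Hypothetical guard, not part of this PR: fail evaluation when more than
# one GPU backend flag is set. lib.count and lib.assertMsg are existing
# nixpkgs library functions.
let
  gpuBackendsEnabled =
    lib.count (flag: flag) [ cudaSupport rocmSupport openclSupport vulkanSupport ];
in
assert lib.assertMsg (gpuBackendsEnabled <= 1)
  "llama-cpp: enable at most one of cudaSupport, rocmSupport, openclSupport and vulkanSupport";

stdenv.mkDerivation {
  # ... rest of the package as before ...
}
```

The alternative, closer to what the upstream flake does, would be to keep the flags independent and expose one pre-configured output per backend.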
Will using `vulkanSupport` directly still make sense if even more backends are added?

A number of other backends have been added upstream at the same time as the one from this PR (see ggerganov/llama.cpp#5138). There is a second Vulkan backend that goes through Kompute, and the authors of the two Vulkan backends want to merge them eventually. There is also a SYCL backend for Intel GPUs.
If we were to add support for those backends as well, I am not sure how users should access them in a way that is idiomatic for nixpkgs: a `useSYCL` flag?
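For comparison, this is roughly how the two access styles would look from a user's point of view; the `gpuBackend` argument in the commented-out variant is purely hypothetical and exists neither in this PR nor in nixpkgs:

```nix
let
  pkgs = import <nixpkgs> { };
in
{
  # Style used by this PR: one boolean flag per backend.
  withVulkan = pkgs.llama-cpp.override { vulkanSupport = true; };

  # Hypothetical alternative once more backends exist: a single enum-like
  # argument, closer in spirit to the upstream flake's per-backend outputs.
  # withVulkan = pkgs.llama-cpp.override { gpuBackend = "vulkan"; };
}
```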