CUDA backend #2310

Merged · 52 commits · May 15, 2024

Commits
dc03f81
remove lingering references to SOM license
cebtenzzre Apr 29, 2024
c482bae
backend: bring some upstream changes into llama.cpp.cmake
cebtenzzre Apr 29, 2024
1684f7b
backend: update llama.cpp and list of supported models
cebtenzzre Apr 30, 2024
dc8cd8c
backend: bring more upstream changes into llama.cpp.cmake
cebtenzzre Apr 30, 2024
75d05c5
llamamodel: remove dependency on internal llama_token_to_piece
cebtenzzre Apr 30, 2024
1ebd3cf
backend: initial port to Occam's Vulkan backend
cebtenzzre Apr 30, 2024
9f92731
vulkan: improve the implementation
cebtenzzre May 2, 2024
ce2164e
backend: also build CUDA backend by default
cebtenzzre May 2, 2024
011935e
backend: make CUDA build useful
cebtenzzre May 2, 2024
9e457bf
backend: add option to build ROCm backend
cebtenzzre May 2, 2024
c47900e
cuda: implement device enumeration
cebtenzzre May 6, 2024
fef041c
cmake: don't build CPU variant on Linux/Windows
cebtenzzre May 6, 2024
bbe5cc0
llmodel: add a backend field to LLModel::GPUDevice
cebtenzzre May 6, 2024
dc334ae
llmodel: select a backend, not a build variant
cebtenzzre May 6, 2024
b54151f
llmodel: list GPU devices from all backends
cebtenzzre May 6, 2024
d4feaeb
python: implement selectable GPU backend
cebtenzzre May 6, 2024
1a10587
chat: implement basic UI backend selection
cebtenzzre May 6, 2024
a9ffe5a
cmake: set RUNPATH of llamamodel-mainline-cuda correctly
cebtenzzre May 7, 2024
d30d5b2
backend: fix kompute-avxonly build
cebtenzzre May 7, 2024
16170f4
cuda: fix dependency bundling on Windows
cebtenzzre May 7, 2024
2417105
ci: install CUDA toolkit
cebtenzzre May 7, 2024
9e8f7c3
cuda: ignore libcuda.so.* on Linux
cebtenzzre May 7, 2024
9e54277
ci: install additional deps to make linuxdeployqt happy
cebtenzzre May 7, 2024
335b129
cmake: do not ship static llama library
cebtenzzre May 8, 2024
b14992c
llama.cpp: do not install kompute or fmt
cebtenzzre May 8, 2024
2b5d6d3
cmake: let linuxdeployqt handle CUDA deps on Linux
cebtenzzre May 8, 2024
831a146
cmake: install llmodel to lib/ on Linux and bin/ on Windows
cebtenzzre May 8, 2024
eec9ec9
cmake: do not install import libraries on Windows
cebtenzzre May 8, 2024
9eaa326
kompute: use slightly newer Vulkan headers to avoid installation
cebtenzzre May 8, 2024
004a262
llama.cpp: rebase onto latest master
cebtenzzre May 8, 2024
f1558d7
chat: specify backend of reported device
cebtenzzre May 8, 2024
fe78377
Merge branch 'main' into add-cuda-support
cebtenzzre May 8, 2024
0e3d90a
cmake: fix upstream spelling error
cebtenzzre May 8, 2024
530b224
chat: give the "force metal" option a chance of working
cebtenzzre May 8, 2024
2e0e4fc
chat: make device selection meaningful on macOS
cebtenzzre May 8, 2024
bb0402b
cmake: remove an old reference to libbert-*.dylib
cebtenzzre May 8, 2024
0069196
llama.cpp: sync with upstream for CUDA graphs
cebtenzzre May 9, 2024
ad0c3ea
Merge branch 'main' into add-cuda-support
cebtenzzre May 9, 2024
6394494
backend: use "cpu" impl if "kompute" is not found
cebtenzzre May 9, 2024
553fc89
python: fix documentation for device parameter
cebtenzzre May 9, 2024
035ea59
cmake: fix chat install if built without Kompute or CUDA
cebtenzzre May 9, 2024
85b9e2f
cmake: clearly indicate how to build without CUDA/Kompute on failure
cebtenzzre May 9, 2024
c9b6732
llmodel: fix compile errors
cebtenzzre May 9, 2024
41ce930
metal: copy default.metallib, not ggml-metal.metal
cebtenzzre May 9, 2024
14588bc
llamamodel: update model whitelist
cebtenzzre May 10, 2024
21bc8c8
build_and_run: mention compiler and CUDA
cebtenzzre May 11, 2024
0cbca27
settings: prefix vk devices with "Vulkan: ", update old names
cebtenzzre May 13, 2024
8dbe93d
kompute: fix device name leaks
cebtenzzre May 13, 2024
dbf38b2
chat: bump version to 2.8.0
cebtenzzre May 15, 2024
1c09b6d
Merge branch 'main' into add-cuda-support
cebtenzzre May 15, 2024
c7f8c93
python: update README to reflect CUDA build dependency
cebtenzzre May 15, 2024
5875d83
python: bump version for CUDA support
cebtenzzre May 15, 2024
cmake: do not ship static llama library
llama.cpp itself is unconditionally built as a static library.
Installing it with the GUI is pointless.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
cebtenzzre committed May 8, 2024

commit 335b12968780c305e9cd3007ed51e37bdcf90bd3
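The reasoning in the commit message comes down to a link-time detail: because llama.cpp is built as a static archive, each llmodel plugin already contains it after linking, so only the loadable modules need install() rules. A minimal illustrative sketch follows (the real gpt4all-backend build scripts are organized differently and are not shown here; the source variable and file names are assumptions, while the target names mirror the diff below):

    # Illustrative sketch, not the actual gpt4all-backend CMakeLists.
    # llama.cpp is compiled into a static archive...
    add_library(llama-mainline-cpu STATIC ${LLAMA_CPP_SOURCES})

    # ...and each llmodel plugin links it PRIVATE, so the archive's code is
    # absorbed into the loadable module at link time.
    add_library(llamamodel-mainline-cpu MODULE llamamodel.cpp)
    target_link_libraries(llamamodel-mainline-cpu PRIVATE llama-mainline-cpu)

    # Only the module that the chat UI dlopens needs an install() rule;
    # shipping the static library alongside it would add nothing.
    install(TARGETS llamamodel-mainline-cpu DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN})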
6 changes: 0 additions & 6 deletions gpt4all-chat/CMakeLists.txt
@@ -206,8 +206,6 @@ if (APPLE)
     install(TARGETS gptj-cpu DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN})
     install(TARGETS gptj-cpu-avxonly DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN})

-   install(TARGETS llama-mainline-cpu DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN})
-   install(TARGETS llama-mainline-cpu-avxonly DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN})
     install(TARGETS llamamodel-mainline-cpu DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN})
     install(TARGETS llamamodel-mainline-cpu-avxonly DESTINATION lib COMPONENT ${COMPONENT_NAME_MAIN})

@@ -218,10 +216,6 @@ else()
     install(
         TARGETS gptj-kompute
                 gptj-kompute-avxonly
-               llama-mainline-kompute
-               llama-mainline-kompute-avxonly
-               llama-mainline-cuda
-               llama-mainline-cuda-avxonly
                 llamamodel-mainline-kompute
                 llamamodel-mainline-kompute-avxonly
                 # llamamodel-mainline-cuda installed below