ggml : add ggml-common.h to deduplicate shared code #5940

ggerganov · 2024-03-08T11:36:56Z

Add ggml-common.h, which contains common bits used in the different backends. The goal is to reduce code duplication. For now, it contains just the quantization-related constants, but it can be extended with more things in the future.

TODO:

Reuse quantum tables in SYCL backend
~~Try to move quantum block structs in ggml-common.h and reuse~~ (see ggml : reuse quantum structs across backends #5943)

ggml-ci

ggerganov · 2024-03-09T12:04:12Z

ggml-metal.metal

+#define GGML_COMMON_IMPL_METAL
+#include "ggml-common.h"


This could cause some trouble: the Metal compiler appears to look for headers only in the folder where the binary is located, and I can't find a way to pass an include path or make it search the current working directory.

For example, the following command no longer works:

make -j tests && ./tests/test-backend-ops -b Metal
I ccache found, compilation results will be cached. Disable with LLAMA_NO_CCACHE.
I llama.cpp build info:
I UNAME_S: Darwin
I UNAME_P: arm
I UNAME_M: arm64
I CFLAGS: -I. -Icommon -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -DNDEBUG -DGGML_USE_ACCELERATE -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64 -DGGML_USE_METAL -std=c11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -pthread -Wunreachable-code-break -Wunreachable-code-return -Wdouble-promotion
I CXXFLAGS: -std=c++11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -pthread -Wunreachable-code-break -Wunreachable-code-return -Wmissing-prototypes -Wextra-semi -I. -Icommon -D_XOPEN_SOURCE=600 -D_DARWIN_C_SOURCE -DNDEBUG -DGGML_USE_ACCELERATE -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64 -DGGML_USE_METAL
I NVCCFLAGS: -std=c++11 -O3
I LDFLAGS: -framework Accelerate -framework Foundation -framework Metal -framework MetalKit
I CC: Apple clang version 15.0.0 (clang-1500.1.0.2.5)
I CXX: Apple clang version 15.0.0 (clang-1500.1.0.2.5)
Testing 2 backends
Backend 1/2 (CPU)
Skipping
Backend 2/2 (Metal)
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Ultra
ggml_metal_init: picking default device: Apple M2 Ultra
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: error: could not use bundle path to find ggml-metal.metal, falling back to trying cwd
ggml_metal_init: loading 'ggml-metal.metal'
ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:4:10: fatal error: 'ggml-common.h' file not found
#include "ggml-common.h"
         ^~~~~~~~~~~~~~~
" UserInfo={NSLocalizedDescription=program_source:4:10: fatal error: 'ggml-common.h' file not found
#include "ggml-common.h"
         ^~~~~~~~~~~~~~~
}
GGML_ASSERT: tests/test-backend-ops.cpp:2269: backend != NULL
Abort trap: 6

A workaround is to make a symlink:

ln -sfn ../ggml-common.h ./tests/

slaren · 2024-03-09

Yeah, that's not very nice. It will also require the header file even with LLAMA_METAL_EMBED_LIBRARY. We could probably do something like `cat ggml-common.h ggml-metal.metal > ggml-metal-bundled.metal` during the build to avoid the include.
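To make the bundling idea concrete, here is a minimal sketch of one way such a build step might look. It is an illustration under assumptions, not the step the project adopted: the stand-in file contents, the `DEMO_SHARED_CONSTANT` macro, and `demo_kernel` are hypothetical, and it runs in a scratch directory. Instead of a plain `cat`, it splices the header in at the position of the `#include` line with `sed`, so the `#define GGML_COMMON_IMPL_METAL` that precedes the include stays ahead of the header contents and the `#include` line itself is dropped.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so a real checkout is never touched.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in files with hypothetical contents, for illustration only.
cat > ggml-common.h <<'EOF'
// stand-in for the shared header
#define DEMO_SHARED_CONSTANT 32
EOF

cat > ggml-metal.metal <<'EOF'
#define GGML_COMMON_IMPL_METAL
#include "ggml-common.h"
kernel void demo_kernel() {}
EOF

# Splice the header in where the #include line sits:
#   'r' queues ggml-common.h to be written out after the matching line,
#   'd' then deletes the #include line itself.
sed -e '/#include "ggml-common.h"/r ggml-common.h' \
    -e '/#include "ggml-common.h"/d' \
    ggml-metal.metal > ggml-metal-bundled.metal

# Show the bundled source: no #include left, header contents inlined.
cat ggml-metal-bundled.metal
```

The resulting file no longer depends on the header being found at runtime, which is the point of the suggestion. Prepending the header with a plain `cat` would instead place its contents before the `#define` and leave the `#include` line in place, so some extra handling would still be needed.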

* ggml : add ggml-common.h to shared code
  ggml-ci
* scripts : update sync scripts
* sycl : reuse quantum tables
  ggml-ci
* ggml : minor
* ggml : minor
* sycl : try to fix build

ggml : add ggml-common.h to shared code

e2a4760

ggml-ci

ggerganov marked this pull request as draft March 8, 2024 11:39

scripts : update sync scripts

fc427b7

ggerganov mentioned this pull request Mar 8, 2024

[SYCL] Add q3_s and q1_s #5886

Merged

ggerganov added 4 commits March 8, 2024 14:19

sycl : reuse quantum tables

b39b443

ggml-ci

ggml : minor

ddc6397

ggml : minor

a167b6d

sycl : try to fix build

1ea68ab

ggerganov marked this pull request as ready for review March 8, 2024 21:44

ggerganov requested a review from slaren March 8, 2024 21:44

slaren approved these changes Mar 9, 2024

ggerganov merged commit 8a3012a into master Mar 9, 2024
52 checks passed

ggerganov deleted the gg/ggml-common branch March 9, 2024 10:47

ggerganov added a commit that referenced this pull request Mar 10, 2024

ggml : remove __constant__ specifier for CUDA tables (#5940)

bf47a5e

NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024

ggml : remove __constant__ specifier for CUDA tables (ggerganov#5940)

2163849

ggerganov mentioned this pull request Mar 12, 2024

metal : build metallib + fix embed path #6015

Merged

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024

ggml : remove __constant__ specifier for CUDA tables (ggerganov#5940)

4f901df

LostRuins mentioned this pull request Mar 13, 2024

Model loading failed with --gpulayer 80 on Metal LostRuins/koboldcpp#744

Closed

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

ggml : remove __constant__ specifier for CUDA tables (ggerganov#5940)

d66218b
