Fixes for AVXVNNI instruction set - Clang Compiler #11027

Merged: 4 commits, Dec 31, 2024


Srihari-mcw (Contributor) commented Dec 31, 2024

This PR fixes issues with the AVX_VNNI instruction set when building with the Clang compiler. The changes were built with multiple compilers and build cleanly after the fix.

Error seen: [screenshot: image (21)]
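The description above does not reproduce the diff or the error text, so the following is only a hedged sketch of the kind of code path involved, not the change made in this PR. It assumes the relevant operation is the AVX-VNNI 8-bit dot product (`VPDPBUSD`), which is available through the `_mm256_dpbusd_avx_epi32` intrinsic when the compiler defines `__AVXVNNI__`. Whether the choice of intrinsic name was the actual cause of the Clang error is an assumption. The helper names `dpbusd_lane_ref`, `dpbusd_256` and `dot_u8_i8` are hypothetical and do not come from ggml. Without AVX-VNNI enabled at compile time, the sketch falls back to a scalar loop with the same semantics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__AVXVNNI__)
#include <immintrin.h>
#endif

/* Scalar reference for one 32-bit lane of VPDPBUSD:
 * acc + sum of 4 products of an unsigned byte (a) and a signed byte (b).
 * Inputs are assumed small enough that the 32-bit sum does not overflow. */
static int32_t dpbusd_lane_ref(int32_t acc, const uint8_t a[4], const int8_t b[4]) {
    for (int i = 0; i < 4; ++i) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}

/* Accumulate 32 byte pairs into 8 int32 lanes (hypothetical helper). */
static void dpbusd_256(int32_t acc[8], const uint8_t a[32], const int8_t b[32]) {
#if defined(__AVXVNNI__)
    /* VEX-encoded AVX-VNNI form of the instruction. */
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i vc = _mm256_loadu_si256((const __m256i *)acc);
    vc = _mm256_dpbusd_avx_epi32(vc, va, vb);
    _mm256_storeu_si256((__m256i *)acc, vc);
#else
    /* Portable fallback with the same per-lane result. */
    for (int lane = 0; lane < 8; ++lane) {
        acc[lane] = dpbusd_lane_ref(acc[lane], a + 4 * lane, b + 4 * lane);
    }
#endif
}

/* Dot product of n unsigned bytes with n signed bytes; n must be a multiple of 32. */
static int32_t dot_u8_i8(const uint8_t *a, const int8_t *b, size_t n) {
    assert(n % 32 == 0);
    int32_t acc[8] = {0};
    for (size_t i = 0; i < n; i += 32) {
        dpbusd_256(acc, a + i, b + i);
    }
    int32_t sum = 0;
    for (int lane = 0; lane < 8; ++lane) {
        sum += acc[lane];
    }
    return sum;
}
```

Guarding the intrinsic behind `__AVXVNNI__` keeps the same source buildable whether or not the compiler was asked to target AVX-VNNI, which makes it easier to compare results across GCC, Clang and MSVC.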

Performance was measured before and after the changes and found to be similar (tested on Linux with GCC 12.3).

| model | size | params | backend | threads | test | t/s | speedup | Commit id |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 14 | pp 512 | 52.69 ± 0.15 | 0.02% | 2a4e792 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 14 | pp 512 | 52.68 ± 0.19 | | 7909e858 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 14 | tg 128 | 19.91 ± 0.25 | 0.55% | 2a4e792 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | CPU | 14 | tg 128 | 19.80 ± 0.46 | | 7909e858 |

Perplexity was measured over 32 chunks and was the same for the Q4_0 model before and after the changes: 5.4993 ± 0.13676

Flags enabled: CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 |

Model: Meta Llama 2 7B (https://huggingface.co/meta-llama/Llama-2-7b)

github-actions bot added the "ggml" label (changes relating to the ggml tensor library for machine learning) Dec 31, 2024
slaren (Collaborator) commented Dec 31, 2024

Thanks, this also fixes AVX VNNI with MSVC, so I have enabled it for MSVC as well.

Review threads (outdated, resolved):
- ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp
- ggml/src/ggml-cpu/ggml-cpu-quants.c
- ggml/src/ggml-cpu/llamafile/sgemm.cpp
@slaren slaren merged commit 0827b2c into ggerganov:master Dec 31, 2024
48 checks passed