Make updates to fix issues with clang-cl builds while using AVX512 flags #10314
Conversation
Having a clang Windows build for x86/x64 (especially with the Ninja generator) as part of the release could help Windows users, based on the testing above. @slaren, could you please share your thoughts on this? Thanks
Sure, but I wonder if performance would be better with gcc/mingw64.
Tested the repo with a llama.cpp build made with w64devkit GCC using make (please see the table for commit details) and observed that text generation is at a disadvantage with that build compared to the GCC 12.2 results shared above.
[Prompt Processing and Text Generation benchmark tables]
I think this is because the OpenMP implementation in MinGW does not perform very well. In my tests, MinGW with
Hi @slaren, following the latest changes, the Q4_0 model was tested with both clang and GCC (OpenMP disabled for both), and clang gives much better performance for Q4_0 after these changes (GCC also shows gains).
[Prompt Processing and Text Generation benchmark tables]
Currently the build.yml script has provisions for a Windows build with MSVC (Link). Can we also add x64-win-llvm as part of the same? Could you please share your insights on future plans for the CI/CD pipeline? Thanks
My goal is to build with
Fix: requires -mavx512vnni, -mavx512vbmi, and -mavx512bf16 as part of the compile options.
Also, the builds were tested with:
- Visual Studio generator, clang-cl (17.0.3), for which the fixes are given
- Ninja generator, clang.exe/clang++.exe (17.0.3)
Both clang builds were done without the OpenMP dependencies.
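The two build configurations above can be sketched as configure commands roughly like the following. This is a hedged sketch only: the generator string, toolset name, and the OpenMP cache variable (`GGML_OPENMP` is assumed here) may differ across llama.cpp versions and local setups.

```shell
# Sketch: configure both clang builds from the llama.cpp source root.
# GGML_OPENMP is an assumed option name; check the project's CMakeLists.

# 1) Visual Studio generator with the clang-cl toolset:
cmake -B build-clangcl -G "Visual Studio 17 2022" -T ClangCL -DGGML_OPENMP=OFF
cmake --build build-clangcl --config Release

# 2) Ninja generator with clang.exe/clang++.exe:
cmake -B build-ninja -G Ninja \
      -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
      -DCMAKE_BUILD_TYPE=Release -DGGML_OPENMP=OFF
cmake --build build-ninja
```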
The tests were conducted on an AMD Raphael 7600X, which supports the following flags (the same were enabled in our Windows tests):
AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1
[Prompt Processing and Text Generation benchmark tables]
Tests were done with the Meta Llama 2 7B model.
Having a clang Windows build (especially with the Ninja generator) as part of the release could help Windows users. Authors/maintainers of llama.cpp, could you please share your thoughts on this? Thanks