metal : fix F32 accumulation in FA vec kernel #10232

ggerganov · 2024-11-09T09:18:47Z

The FA vec kernel was accidentally accumulating the Q*K results in F16 instead of F32.

./llama-cli -m ./models/qwen2.5-1.5b-coder/ggml-model-f16.gguf -s 1 -p "I believe the meaning of life is to" -n 32 -fa

...

I believe the meaning of life is to keep it simple.

@@@@@@@@@@@@@@@@@@@@@@@@@@@@
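To illustrate why the accumulator type matters, here is a minimal Python sketch that simulates a dot product with an F16 accumulator versus an F32 accumulator. It is not the Metal kernel code: the helper names (`to_f16`, `to_f32`, `dot_f16_acc`, `dot_f32_acc`) and the all-ones inputs are assumptions chosen only to make the effect easy to see. Rounding is emulated with the standard `struct` module's half-precision (`e`) and single-precision (`f`) formats.

```python
import struct

def to_f16(x: float) -> float:
    # Round to the nearest IEEE 754 half-precision value (illustrative helper).
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_f32(x: float) -> float:
    # Round to the nearest IEEE 754 single-precision value (illustrative helper).
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot_f16_acc(q, k):
    # Every partial sum is rounded back to F16, mimicking an F16 accumulator.
    acc = 0.0
    for a, b in zip(q, k):
        acc = to_f16(acc + to_f16(to_f16(a) * to_f16(b)))
    return acc

def dot_f32_acc(q, k):
    # Inputs are still F16, but partial sums are kept in F32.
    acc = 0.0
    for a, b in zip(q, k):
        acc = to_f32(acc + to_f32(to_f16(a) * to_f16(b)))
    return acc

# Hypothetical inputs: 4096 ones, so the exact dot product is 4096.
q = [1.0] * 4096
k = [1.0] * 4096

f16_result = dot_f16_acc(q, k)
f32_result = dot_f32_acc(q, k)
print(f16_result, f32_result)
```

In this toy case the F16 accumulator stops growing once it reaches 2048, because F16 has an 11-bit significand and 2048 + 1 rounds back to 2048, while the F32 accumulator reaches the exact value 4096. Real Q*K values are not all ones, so the error in the actual kernel would look different, but the mechanism (partial sums losing low-order bits) is the same.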

metal : fix F32 accumulation in FA vec kernel

ced3be9

ggerganov merged commit bb38cdd into master Nov 9, 2024
49 checks passed

ggerganov deleted the gg/metal-fa-vec-fix-prec branch November 9, 2024 09:52

ggerganov mentioned this pull request Nov 10, 2024

metal : more precise Q*K in FA vec kernel #10247

Merged

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024

metal : fix F32 accumulation in FA vec kernel (ggerganov#10232)

8340912

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

metal : fix F32 accumulation in FA vec kernel (ggerganov#10232)

5cabf58
