llama.cpp sync for SVE support for Q4_K_Ms #109

a-ghorbani · 2025-01-16T20:28:33Z

Apologies, I know you just synced a few days ago, but the numbers for this PR look amazing:
ggerganov/llama.cpp#11227 (comment)

Vali-98 · 2025-01-17T14:45:33Z

Hey there, wanted to ask if you actually tested this on device?

As far as I know, SVE isn't actually implemented by most Android mobile SoCs, and the few that do implement it have limited compatibility (Pixel devices are the biggest offender).

Most SVE implementations seem to be for server-grade ARM, like Graviton.

a-ghorbani · 2025-01-19T12:24:51Z

Good point. I'll give it a try, and if I see any improvement on the devices I have, I'll report back here.

a-ghorbani · 2025-01-23T13:53:41Z

@Vali-98 yup, no improvement in TG (token generation) or PP (prompt processing) on Pixel 9.
I am no expert in this, but seeing the sve and sve2 features listed for the CPU, I was hoping it would be supported.

Pixel 9 features:

 Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti ecv afp wfxt
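For anyone who wants to run the same check on their own device, here is a minimal sketch of reading those flags out of `/proc/cpuinfo` output. The helper names `parse_cpu_features` and `sve_flags` are made up for illustration, and the sample string is a shortened subset of the Pixel 9 line above. Note that these flags only say what the kernel advertises; as the numbers in this thread suggest, they say nothing about whether a given build is actually faster with them.

```python
def parse_cpu_features(cpuinfo_text):
    """Return the flags from the first 'Features' line as a set of strings."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            return set(value.split())
    return set()


def sve_flags(features):
    """Report which of a few flags of interest are present.

    Exact set membership is used rather than substring search, because
    flags such as 'svei8mm' contain 'sve' and 'i8mm' as substrings.
    """
    return {name: name in features for name in ("sve", "sve2", "i8mm")}


# Shortened subset of the Pixel 9 'Features' line (illustrative only).
sample = " Features\t: fp asimd aes sve asimdfhm sve2 svei8mm i8mm bf16"

if __name__ == "__main__":
    print(sve_flags(parse_cpu_features(sample)))
```

On a device, the text could be taken from something like `adb shell cat /proc/cpuinfo` and passed to `parse_cpu_features` in place of `sample`.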

Vali-98 · 2025-01-23T14:25:52Z

> features in the CPU was hoping it would support.

Though I don't own a Pixel device, I've read elsewhere that the SVE support is spotty and incomplete. I don't think there is much left to be gained from CPU acceleration on Android.

Our best bet would be someone implementing Qualcomm's Hexagon NPU APIs in llama.cpp, similar to what has been done with PowerServe: https://github.com/powerserve-project/PowerServe

a-ghorbani mentioned this issue Jan 22, 2025

feat: sync llama.cpp #110

Merged

jhen0409 closed this as completed in #110 Jan 24, 2025
