Releases: teleprint-me/llama.cpp
b1960
metal : disable support for MUL_MAT F32 x F16
b1954
llama : fix not enough space in buffer with Qwen (#5086)
b1893
nix: remove nixConfig from flake.nix (#4984)
b1886
android : introduce starter project example (#4926)
* Introduce starter project for Android, based on examples/llama.swiftui.
* Add github workflow
* Set NDK version
* Only build arm64-v8a in CI
* Sync bench code
* Rename CI prop to skip-armeabi-v7a
* Remove unused tests
b1879
pass cpu-architecture arguments only to host code (C;C++) (#4943)
b1878
llama : apply classifier-free guidance to logits directly (#4951)
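The entry above refers to applying classifier-free guidance (CFG) at the logits level rather than to probabilities. A minimal sketch of the usual guidance formula, `guided = uncond + scale * (cond - uncond)`, where a scale of 1 recovers the conditional logits (the function name `cfg_logits` is illustrative, not llama.cpp's API):

```python
def cfg_logits(cond, uncond, scale):
    """Blend conditional and unconditional logits with a guidance scale.

    cond, uncond: per-token logit lists of equal length.
    scale: guidance strength; 1.0 returns cond unchanged,
    values > 1.0 push further in the direction of the prompt.
    """
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]
```

For example, `cfg_logits([2.0, 0.0], [1.0, 0.0], 2.0)` amplifies the gap between the two distributions to `[3.0, 0.0]`.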
b1874
CUDA: faster dequantize kernels for Q4_0 and Q4_1 (#4938)
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
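The Q4_0 format referenced above stores weights in blocks of 4-bit quants with one shared scale per block; dequantization maps each quant `q` in `[0, 15]` back to `scale * (q - 8)`. A minimal reference sketch of that mapping (plain Python, not the CUDA kernel; `dequantize_q4_0` is an illustrative name):

```python
def dequantize_q4_0(scale, quants):
    """Dequantize one Q4_0 block.

    scale: per-block floating-point scale.
    quants: 4-bit unsigned values in [0, 15]; 8 is the zero point,
    so the recovered weight is scale * (q - 8).
    """
    return [scale * (q - 8) for q in quants]
```

For example, `dequantize_q4_0(0.5, [8, 0, 15])` yields `[0.0, -4.0, 3.5]`.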
b1873
llama : fix missing quotes (#4937)
b1863
sync : ggml
b1848
sync : ggml