Releases: l3utterfly/llama.cpp

b4519

21 Jan 05:58
80d0d6b
common : add -hfd option for the draft model (#11318)

* common : add -hfd option for the draft model

* cont : fix env var

* cont : more fixes

b4393

28 Dec 08:11
d79d8f3
vulkan: multi-row k quants (#10846)

* multi row k quant shaders!

* better row selection

* more row choices

* readjust row selection

* rm_kq=2 by default

b4302

11 Dec 07:41
43041d2
ggml: load all backends from a user-provided search path (#10699)

* feat: load all backends from a user-provided search path

* fix: Windows search path

* refactor: rename `ggml_backend_load_all_in_search_path` to `ggml_backend_load_all_from_path`

* refactor: rename `search_path` to `dir_path`

* fix: change `NULL` to `nullptr`

Co-authored-by: Diego Devesa <slarengh@gmail.com>

* fix: change `NULL` to `nullptr`

---------

Co-authored-by: Diego Devesa <slarengh@gmail.com>

b4219

29 Nov 10:53
266b851
sycl : Reroute permuted mul_mats through oneMKL (#10408)

This PR fixes the failing MUL_MAT tests for the sycl backend.

b4200

27 Nov 12:46
46c69e0
ci : faster CUDA toolkit installation method and use ccache (#10537)

* ci : faster CUDA toolkit installation method and use ccache

* remove fetch-depth

* only pack CUDA runtime on master

b4098

16 Nov 07:52
772703c
vulkan: Optimize some mat-vec mul quant shaders (#10296)

Compute two result elements per workgroup (for Q{4,5}_{0,1}). This reuses
the B loads across the rows and also reuses some addressing calculations.
This required manually partially unrolling the loop, since the compiler
is less willing to unroll outer loops.

Add bounds-checking on the last iteration of the loop. I think this was at
least partly broken before.

Optimize the Q4_K shader to vectorize most loads and reduce the number of
bit twiddling instructions.
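
The row-pairing idea in the notes above can be sketched on the CPU. This is only an illustration in plain C with unquantized floats, not the Vulkan shader from the PR: the function names `dot_row` and `matvec_two_rows` are hypothetical, and the real change operates on Q{4,5}_{0,1} blocks inside a workgroup. The sketch shows the two points the notes make: each element of B is loaded once and reused for two rows, and the final iteration is bounds-checked so an odd row count does not read past the matrix.

```c
#include <stddef.h>

/* Baseline: one output row per call, so every row re-reads all of b. */
static float dot_row(const float *a_row, const float *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc += a_row[i] * b[i];
    }
    return acc;
}

/*
 * Hypothetical two-rows-per-pass variant. `a` is a row-major rows x n matrix.
 * The outer loop is unrolled by hand into a pair of accumulators, so each
 * b[i] is loaded once and used for both rows, and the row base addresses
 * are computed once per pair.
 */
static void matvec_two_rows(const float *a, const float *b, float *out,
                            size_t rows, size_t n) {
    size_t r = 0;
    for (; r + 1 < rows; r += 2) {
        const float *a0 = a + r * n;   /* first row of the pair  */
        const float *a1 = a0 + n;      /* second row of the pair */
        float acc0 = 0.0f;
        float acc1 = 0.0f;
        for (size_t i = 0; i < n; ++i) {
            const float bi = b[i];     /* single load shared by both rows */
            acc0 += a0[i] * bi;
            acc1 += a1[i] * bi;
        }
        out[r]     = acc0;
        out[r + 1] = acc1;
    }
    /* Bounds check for the last iteration: handle a leftover odd row. */
    if (r < rows) {
        out[r] = dot_row(a + r * n, b, n);
    }
}
```

Calling `matvec_two_rows` on a 3 x 2 matrix exercises both the paired path (rows 0 and 1) and the bounds-checked tail (row 2).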

b4033

05 Nov 08:45
a9e8a9a
ggml : fix arch check in bf16_to_fp32 (#10164)

b3982

27 Oct 09:02
cc2983d
sync : ggml

b3902

10 Oct 03:40
c81f3bb
cmake : do not build common library by default when standalone (#9804)

Layla v3.3.0

18 Jan 04:15

llama.cpp used in the Layla v3.3.0 release