
Releases: tinglou/llama.cpp

b4151

22 Nov 09:05
c18610b
CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)

* CANN Support Ascend310P to accelerate F32 and F16 Model

* Add a compile-option SoC-type macro, ASCEND_310P, to the ggml-cann lib

* Remove unused code

* Remove the hard-coded Ascend soc_type compile option from CMakeLists.txt
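
Below is a minimal sketch of how a compile-time SoC-type macro such as ASCEND_310P can gate device-specific code paths in a backend; the dispatch and kernel names are hypothetical placeholders, not actual ggml-cann symbols.

```cpp
// Hypothetical illustration: select an Ascend 310P-specific path at compile
// time via a macro supplied by the build system (e.g. -DASCEND_310P).
#include <cstdio>

// Placeholder kernels, not real ggml-cann functions.
static void launch_matmul_310p()    { std::puts("310P-tuned F16/F32 matmul"); }
static void launch_matmul_generic() { std::puts("generic CANN matmul"); }

void cann_matmul_dispatch() {
#ifdef ASCEND_310P
    // Compiled only when the build defines the 310P SoC-type macro.
    launch_matmul_310p();
#else
    launch_matmul_generic();
#endif
}

int main() {
    cann_matmul_dispatch();
    return 0;
}
```

In this sketch the macro would be injected by the build system (for example via a CMake compile definition) rather than hard-coded, which is consistent with the last bullet above.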

b4099

16 Nov 13:45
f245cc2
scripts : fix missing key in compare-llama-bench.py (#10332)

b4066

11 Nov 15:39
b0cefea
metal : more precise Q*K in FA vec kernel (#10247)

b3798

21 Sep 12:31
41f4778
Update CUDA graph on scale change plus clear nodes/params (#9550)

* Avoid using saved CUDA graph if scale changes and reset nodes/params on update

Fixes https://github.com/ggerganov/llama.cpp/issues/9451

* clear before resize
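
A minimal sketch (plain C++ with hypothetical names, not the actual llama.cpp CUDA-graph code) of the two ideas in this change: invalidate a cached graph when a scalar such as the scale changes, and clear the node/param containers before resizing them so stale entries are not reused.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for captured graph nodes/params.
struct GraphNode  { int id; };
struct GraphParam { float value; };

struct CachedGraph {
    float scale = 0.0f;                 // scale the graph was captured with
    std::vector<GraphNode>  nodes;
    std::vector<GraphParam> params;
    bool valid = false;
};

// The saved graph may only be reused if the scale has not changed.
bool can_reuse(const CachedGraph & g, float current_scale) {
    return g.valid && g.scale == current_scale;
}

// Re-capture bookkeeping: clear before resize so no stale entries survive.
void reset_graph(CachedGraph & g, float current_scale, std::size_t n) {
    g.nodes.clear();            // clear first ...
    g.params.clear();
    g.nodes.resize(n);          // ... then size for the new capture
    g.params.resize(n);
    g.scale = current_scale;
    g.valid = true;
}

int main() {
    CachedGraph g;
    const float scale = 1.0f / 8.0f;
    if (!can_reuse(g, scale)) {
        reset_graph(g, scale, /*n=*/16);  // a real backend would re-capture here
    }
    return 0;
}
```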

b3755

15 Sep 07:11
822b632
ggml : ggml_type_name return "NONE" for invalid values (#9458)

When running on Windows, the quantization utility attempts to print types that are not set, which leads to a crash.
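
The following is a sketch of the bounds-checked lookup pattern this fix describes, not the actual ggml implementation; the enum values and name table are placeholders.

```cpp
#include <cstdio>

// Hypothetical type table: a name lookup that returns "NONE" for
// out-of-range values instead of reading past the table, which is the
// kind of invalid access that can crash when printing unset types.
enum my_type { MY_TYPE_F32 = 0, MY_TYPE_F16 = 1, MY_TYPE_COUNT };

static const char * TYPE_NAMES[MY_TYPE_COUNT] = { "f32", "f16" };

const char * my_type_name(int type) {
    if (type < 0 || type >= MY_TYPE_COUNT) {
        return "NONE";   // safe fallback for invalid or unset values
    }
    return TYPE_NAMES[type];
}

int main() {
    std::printf("%s\n", my_type_name(MY_TYPE_F16));  // prints "f16"
    std::printf("%s\n", my_type_name(42));           // prints "NONE"
    return 0;
}
```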

b3668

05 Sep 02:12
bdf314f
llama-bench : fix NUL terminators in CPU name (#9313)

b3639

28 Aug 03:19
vulkan : fix build (#0)

ggml-ci

b3358

10 Jul 02:35
a59f8fd
Compare
Choose a tag to compare
Server: Enable setting default sampling parameters via command-line (…

b3277

02 Jul 02:56
5fac350
Fix gemma2 tokenizer convert (#8244)

* fix gemma2 tokenizer convert

* remove scores

* improve code, fix newline issue

b2837

10 May 07:54
d11afd6
llava : fix moondream support (#7163)

* Revert "Revert "llava : add support for moondream vision language model (#6899)""

This reverts commit 9da243b36ac0b9d609adfaaa4c8f1cc8c592f737.

* Fix num_positions and embeddings initialization