
Releases: tinglou/llama.cpp

b4151

22 Nov 09:05
c18610b
CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)

* CANN Support Ascend310P to accelerate F32 and F16 Model

* Add a compile-option SoC-type macro, ASCEND_310P, to the ggml-cann lib

* Remove unused code

* Remove the hard-coded Ascend soc_type compile option from CMakeLists.txt
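
Below is a minimal sketch of how a compile-time SoC-type macro such as ASCEND_310P can gate device-specific code paths in a backend; the dispatch and kernel names are hypothetical placeholders, not actual ggml-cann symbols.

```cpp
// Hypothetical illustration: select an Ascend 310P-specific path at compile
// time via a macro supplied by the build system (e.g. -DASCEND_310P).
#include <cstdio>

// Placeholder kernels, not real ggml-cann functions.
static void launch_matmul_310p()    { std::puts("310P-tuned F16/F32 matmul"); }
static void launch_matmul_generic() { std::puts("generic CANN matmul"); }

void cann_matmul_dispatch() {
#ifdef ASCEND_310P
    // Compiled only when the build defines the 310P SoC-type macro.
    launch_matmul_310p();
#else
    launch_matmul_generic();
#endif
}

int main() {
    cann_matmul_dispatch();
    return 0;
}
```

In this sketch the macro would be injected by the build system (for example via a CMake compile definition) rather than hard-coded, which is consistent with the last bullet above.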

b4099

16 Nov 13:45
f245cc2
scripts : fix missing key in compare-llama-bench.py (#10332)

b4066

11 Nov 15:39
b0cefea
metal : more precise Q*K in FA vec kernel (#10247)

b3798

21 Sep 12:31
41f4778
Update CUDA graph on scale change plus clear nodes/params (#9550)

* Avoid using saved CUDA graph if scale changes and reset nodes/params on update

Fixes https://github.com/ggerganov/llama.cpp/issues/9451

* clear before resize
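
A minimal sketch (plain C++ with hypothetical names, not the actual llama.cpp CUDA-graph code) of the two ideas in this change: invalidate a cached graph when a scalar such as the scale changes, and clear the node/param containers before resizing them so stale entries are not reused.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for captured graph nodes/params.
struct GraphNode  { int id; };
struct GraphParam { float value; };

struct CachedGraph {
    float scale = 0.0f;                 // scale the graph was captured with
    std::vector<GraphNode>  nodes;
    std::vector<GraphParam> params;
    bool valid = false;
};

// The saved graph may only be reused if the scale has not changed.
bool can_reuse(const CachedGraph & g, float current_scale) {
    return g.valid && g.scale == current_scale;
}

// Re-capture bookkeeping: clear before resize so no stale entries survive.
void reset_graph(CachedGraph & g, float current_scale, std::size_t n) {
    g.nodes.clear();            // clear first ...
    g.params.clear();
    g.nodes.resize(n);          // ... then size for the new capture
    g.params.resize(n);
    g.scale = current_scale;
    g.valid = true;
}

int main() {
    CachedGraph g;
    const float scale = 1.0f / 8.0f;
    if (!can_reuse(g, scale)) {
        reset_graph(g, scale, /*n=*/16);  // a real backend would re-capture here
    }
    return 0;
}
```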

b3755

15 Sep 07:11
822b632
ggml : ggml_type_name return "NONE" for invalid values (#9458)

When running on Windows, the quantization utility attempts to print types that are not set, which leads to a crash.
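
The following is a sketch of the bounds-checked lookup pattern this fix describes, not the actual ggml implementation; the enum values and name table are placeholders.

```cpp
#include <cstdio>

// Hypothetical type table: a name lookup that returns "NONE" for
// out-of-range values instead of reading past the table, which is the
// kind of invalid access that can crash when printing unset types.
enum my_type { MY_TYPE_F32 = 0, MY_TYPE_F16 = 1, MY_TYPE_COUNT };

static const char * TYPE_NAMES[MY_TYPE_COUNT] = { "f32", "f16" };

const char * my_type_name(int type) {
    if (type < 0 || type >= MY_TYPE_COUNT) {
        return "NONE";   // safe fallback for invalid or unset values
    }
    return TYPE_NAMES[type];
}

int main() {
    std::printf("%s\n", my_type_name(MY_TYPE_F16));  // prints "f16"
    std::printf("%s\n", my_type_name(42));           // prints "NONE"
    return 0;
}
```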

b3668

05 Sep 02:12
bdf314f
llama-bench : fix NUL terminators in CPU name (#9313)

b3639

28 Aug 03:19
vulkan : fix build (#0)

ggml-ci

b3358

10 Jul 02:35
a59f8fd
Compare
Choose a tag to compare
Server: Enable setting default sampling parameters via command-line (…

b3277

02 Jul 02:56
5fac350
Fix gemma2 tokenizer convert (#8244)

* fix gemma2 tokenizer convert

* remove scores

* improve code, fix newline issue

b2837

10 May 07:54
d11afd6
llava : fix moondream support (#7163)

* Revert "Revert "llava : add support for moondream vision language model (#6899)""

This reverts commit 9da243b36ac0b9d609adfaaa4c8f1cc8c592f737.

* Fix num_positions and embeddings initialization