Releases · tinglou/llama.cpp
b4151
CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)
* CANN: Support Ascend310P to accelerate F32 and F16 models
* Add compile option soc type macro ASCEND_310P to ggml-cann lib
* Remove unused code
* Remove the hard-coded Ascend soc_type compile option in CMakeLists.txt
b4099
scripts : fix missing key in compare-llama-bench.py (#10332)
b4066
metal : more precise Q*K in FA vec kernel (#10247)
b3798
Update CUDA graph on scale change plus clear nodes/params (#9550)
* Avoid using saved CUDA graph if scale changes and reset nodes/params on update. Fixes https://github.com/ggerganov/llama.cpp/issues/9451
* Clear before resize
b3755
ggml : ggml_type_name returns "NONE" for invalid values (#9458) When running on Windows, the quantization utility attempts to print types that are not set, which leads to a crash.
b3668
llama-bench : fix NUL terminators in CPU name (#9313)
b3639
vulkan : fix build (#0) ggml-ci
b3358
Server: Enable setting default sampling parameters via command-line (…
b3277
Fix gemma2 tokenizer convert (#8244)
* Fix gemma2 tokenizer convert
* Remove scores
* Improve code, fix new-line issue
b2837
llava : fix moondream support (#7163)
* Revert "Revert "llava : add support for moondream vision language model (#6899)"". This reverts commit 9da243b36ac0b9d609adfaaa4c8f1cc8c592f737.
* Fix num_positions and embeddings initialization