Releases: teleprint-me/llama.cpp
b4600
b4557
build: apply MSVC /bigobj option to c/cpp files only (#11423)
b4549
CANN: Add Ascend CANN build ci (#10217)
* CANN: Add Ascend CANN build ci
* Update build.yml
* Modify cann image version
* Update build.yml
* Change to run on x86 system
* Update build.yml
* Update build.yml
* Modify format error
* Update build.yml
* Add 'Ascend NPU' label restrictions
* Exclude non PR event
* Update build.yml

Co-authored-by: Yuanhao Ji <jiyuanhao@apache.org>
b4519
common : add -hfd option for the draft model (#11318)
* common : add -hfd option for the draft model
* cont : fix env var
* cont : more fixes
b4508
Adding linenoise.cpp to llama-run (#11252) This is a fork of linenoise that is C++17 compatible. I intend to add it to llama-run so we can do things like traverse prompt history via the up and down arrows: https://github.com/ericcurtin/linenoise.cpp Signed-off-by: Eric Curtin <ecurtin@redhat.com>
b4503
vulkan: fix coopmat2 flash attention for non-contiguous inputs (#11281) Add code similar to mul_mm_cm2 to force alignment of strides, to avoid a performance regression. Add noncontiguous FA tests in test-backend-ops. Fixes #11268.
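The fix above concerns tensors whose byte strides are not tightly packed. As a minimal sketch of what "non-contiguous input" means here (using a hypothetical `Tensor` struct, not the actual ggml/Vulkan code):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 2-D tensor: ne = element counts, nb = byte strides per dim.
struct Tensor {
    int64_t ne[2];
    size_t  nb[2];
    size_t  elem_size;
};

// A tensor is contiguous when each stride equals the packed size:
// nb[0] is one element, nb[1] is a full row with no padding between rows.
bool is_contiguous(const Tensor & t) {
    return t.nb[0] == t.elem_size &&
           t.nb[1] == t.elem_size * (size_t) t.ne[0];
}
```

For example, a 4x4 float tensor with a 32-byte row stride (16 bytes of data plus padding) would fail this check and need the stride-alignment path.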
b4431
llama-run : fix context size (#11094) Set `n_ctx` equal to `n_batch` in the `Opt` class. The context size is now a more reasonable 2048. Signed-off-by: Eric Curtin <ecurtin@redhat.com>
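The change described above can be sketched as follows (field names are illustrative, not the actual `Opt` class):

```cpp
// Hypothetical options struct; 2048 matches the default batch size
// mentioned in the release note above.
struct Opt {
    int n_batch = 2048; // logical batch size
    int n_ctx   = 0;    // 0 = unset
};

// The described fix: default the context size to the batch size,
// instead of leaving it at an unset (or unreasonable) value.
int effective_ctx(const Opt & o) {
    return o.n_ctx > 0 ? o.n_ctx : o.n_batch;
}
```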
b4404
ggml : fixes for AVXVNNI instruction set with MSVC and Clang (#11027)
* Fixes for clang AVX VNNI
* enable AVX VNNI and alder lake build for MSVC
* Apply suggestions from code review

Co-authored-by: slaren <slarengh@gmail.com>
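For context, the core AVX-VNNI instruction (VPDPBUSD) multiplies unsigned bytes by signed bytes and accumulates the products into 32-bit lanes. A scalar sketch of the per-lane semantics (not the intrinsic code used in ggml):

```cpp
#include <cstdint>

// Per 32-bit lane semantics of VPDPBUSD: accumulate four
// unsigned-byte x signed-byte products into a 32-bit sum.
int32_t vpdpbusd_lane(int32_t acc, const uint8_t u[4], const int8_t s[4]) {
    for (int k = 0; k < 4; ++k) {
        acc += (int32_t) u[k] * (int32_t) s[k];
    }
    return acc;
}
```

This unsigned-by-signed dot product is why the instruction set matters for quantized matrix multiplication kernels.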
b4381
llama : support InfiniAI Megrez 3b (#10893)
* Support InfiniAI Megrez 3b
* Fix tokenizer_clean_spaces for megrez
b4349
tests: add tests for GGUF (#10830)