Releases: teleprint-me/llama.cpp

b3448

23 Jul 17:12
b841d07
server : fix URL.parse in the UI (#8646)

b3441

22 Jul 22:24
081fe43
llama : fix codeshell support (#8599)

* llama : fix codeshell support

* llama : move codeshell below smollm to respect the enum order
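
The ordering note above refers to keeping an architecture's position consistent wherever it is listed. A minimal sketch of the idea, using hypothetical names rather than the actual llama.cpp definitions:

```cpp
// Illustrative only: when an enum and a parallel table are kept in the same
// order, a new architecture is added at the matching position in both.
enum example_arch {
    EXAMPLE_ARCH_LLAMA,
    EXAMPLE_ARCH_SMOLLM,
    EXAMPLE_ARCH_CODESHELL, // appended after SMOLLM, mirroring the enum order
    EXAMPLE_ARCH_COUNT,
};

static const char * EXAMPLE_ARCH_NAMES[EXAMPLE_ARCH_COUNT] = {
    "llama",
    "smollm",
    "codeshell", // must sit in the same position as its enum entry
};
```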

b3428

20 Jul 21:45
69c487f
CUDA: MMQ code deduplication + iquant support (#8495)

* CUDA: MMQ code deduplication + iquant support

* 1 fewer parallel job for the CI build

b3423

19 Jul 17:09
87e397d
ggml : fix quant dot product with odd number of blocks (#8549)

* ggml : fix iq4_nl dot product with odd number of blocks

* ggml : fix odd blocks for ARM_NEON (#8556)

* ggml : fix iq4_nl dot product with odd number of blocks

* ggml : fix q4_1

* ggml : fix q5_0

* ggml : fix q5_1

* ggml : fix iq4_nl metal

ggml-ci

* ggml : fix q4_0

* ggml : fix q8_0

ggml-ci

* ggml : remove special Q4_0 code for first 2 blocks

* ggml : fix sumf redefinition

---------

Co-authored-by: slaren <slarengh@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
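
As background for the odd-block fixes above: vectorized quant dot products typically consume blocks in fixed-size groups, so a block count that is not a multiple of the group size needs an explicit tail path. Below is a simplified, self-contained sketch of that pattern with hypothetical types and names; it is not the actual ggml kernel code.

```cpp
// Simplified block-quantized dot product: the main loop consumes blocks in
// pairs (standing in for a SIMD path), so an odd block count needs a tail.
#include <cstdint>
#include <cstdio>
#include <vector>

constexpr int QK = 32; // elements per block

struct block_q8 {
    float  d;       // per-block scale
    int8_t qs[QK];  // quantized values
};

static float dot_q8(const block_q8 *x, const block_q8 *y, int nb) {
    float sumf = 0.0f;
    int ib = 0;
    for (; ib + 1 < nb; ib += 2) {      // paired "vectorized" path
        for (int j = 0; j < 2; ++j) {
            int32_t sumi = 0;
            for (int k = 0; k < QK; ++k) {
                sumi += (int32_t) x[ib + j].qs[k] * y[ib + j].qs[k];
            }
            sumf += x[ib + j].d * y[ib + j].d * (float) sumi;
        }
    }
    for (; ib < nb; ++ib) {             // scalar tail for the odd final block
        int32_t sumi = 0;
        for (int k = 0; k < QK; ++k) {
            sumi += (int32_t) x[ib].qs[k] * y[ib].qs[k];
        }
        sumf += x[ib].d * y[ib].d * (float) sumi;
    }
    return sumf;
}

int main() {
    const int nb = 3; // odd block count exercises the tail path
    std::vector<block_q8> x(nb), y(nb);
    for (int ib = 0; ib < nb; ++ib) {
        x[ib].d = y[ib].d = 0.5f;
        for (int k = 0; k < QK; ++k) { x[ib].qs[k] = 1; y[ib].qs[k] = 2; }
    }
    // each block contributes 0.25 * 64 = 16, so the expected result is 48
    printf("dot = %f\n", dot_q8(x.data(), y.data(), nb));
    return 0;
}
```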

b3422

19 Jul 15:54
57b1d4f
convert-*.py: remove add_name from ChatGLMModel class (#8590)

b3416

19 Jul 06:12
a15ef8f
CUDA: fix partial offloading for ne0 % 256 != 0 (#8572)

b3405

16 Jul 21:43
5e116e8
make/cmake: add missing force MMQ/cuBLAS for HIP (#8515)

b3399

15 Jul 19:21
4db8f60
fix ci (#8494)

b3387

14 Jul 04:50
fa79495
llama : fix pre-tokenization of non-special added tokens (#8228)

* llama : fix mpt and olmo pre-tokenizer

* llama : pre-tokenize non-special user-defined tokens first

* llama : fix detection of control-like user-defined tokens

* convert_hf : identify which user-defined tokens are control tokens

Only used in _set_vocab_gpt2() for now.

* convert_hf : identify more added control tokens for SPM tokenizers

This makes Gemma and Gemma-2 tokenize pretty much EVERYTHING correctly,
including HTML tags and consecutive spaces,
but it unfortunately requires model re-conversion.

There seems to be a weird behavior of the HF tokenizer for Gemma,
which prefers to use the 16-space token over more lengthy space tokens,
while using the SentencePiece tokenizer does not do this.
(the implementation in llama.cpp has the same behavior as SentencePiece)

* llama : fix wrong pre-tokenization of byte tokens

* llama : fix Viking pre-tokenizer regex

The order was previously wrong, which caused errors in some tests.

* llama : fix command-r detokenization

* convert_hf : reduce usages of the UNKNOWN token type

* llama : add UNKNOWN tokens in the special tokens cache

* convert_hf : reduce usages of UNKNOWN for InternLM2

This makes the changes from #8321 more consistent
with the other changes made here.

* test-tokenizer-random : reduce potential conflicts with #8379

* test-tokenizer-random : add a failing edge case for falcon
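
The core idea of pre-tokenizing non-special user-defined tokens first can be sketched as follows. This is a simplified, hypothetical illustration, not llama.cpp's actual implementation: the raw text is split on added-token strings before the regular pre-tokenizer regex is applied, so an added token such as "<|endoftext|>" is never broken apart.

```cpp
// Split raw text on user-defined added tokens first; only the plain-text
// fragments in between would then go through the normal pre-tokenizer regex.
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct fragment {
    std::string text;
    bool is_added_token; // true if this fragment matched a user-defined token
};

static std::vector<fragment> split_on_added_tokens(
        const std::string &text, const std::vector<std::string> &added) {
    std::vector<fragment> out;
    size_t pos = 0;
    while (pos < text.size()) {
        size_t best_pos = std::string::npos;
        size_t best_len = 0;
        for (const auto &tok : added) { // earliest match wins; longest on ties
            size_t p = text.find(tok, pos);
            if (p != std::string::npos &&
                (p < best_pos || (p == best_pos && tok.size() > best_len))) {
                best_pos = p;
                best_len = tok.size();
            }
        }
        if (best_pos == std::string::npos) {
            out.push_back({text.substr(pos), false});
            break;
        }
        if (best_pos > pos) { // plain text before the added token
            out.push_back({text.substr(pos, best_pos - pos), false});
        }
        out.push_back({text.substr(best_pos, best_len), true});
        pos = best_pos + best_len;
    }
    return out;
}

int main() {
    std::vector<std::string> added = {"<|endoftext|>"};
    for (const auto &f : split_on_added_tokens("hello<|endoftext|>world", added)) {
        std::cout << (f.is_added_token ? "[added] " : "[text]  ") << f.text << "\n";
    }
    return 0;
}
```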

b3386

13 Jul 19:18
17eb6aa
vulkan : cmake integration (#8119)

* Add Vulkan to CMake pkg

* Add Sycl to CMake pkg

* Add OpenMP to CMake pkg

* Split generated shader file into separate translation unit

* Add CMake target for Vulkan shaders

* Update README.md

* Add make target for Vulkan shaders

* Use pkg-config to locate vulkan library

* Add vulkan SDK dep to ubuntu-22-cmake-vulkan workflow

* Clean up tabs

* Move sudo to apt-key invocation

* Forward GGML_EXTRA_LIBS to CMake config pkg

* Update vulkan obj file paths

* Add shaderc to nix pkg

* Add python3 to Vulkan nix build

* Link against ggml in cmake pkg

* Remove Python dependency from Vulkan build

* code review changes

* Remove trailing newline

* Add cflags from pkg-config to fix w64devkit build

* Update README.md

* Remove trailing whitespace

* Update README.md

* Remove trailing whitespace

* Fix doc heading

* Make glslc required Vulkan component

* remove clblast from nix pkg