Releases: teleprint-me/llama.cpp
b3448
server : fix URL.parse in the UI (#8646)
b3441
llama : fix codeshell support (#8599)
* llama : fix codeshell support
* llama : move codeshell after smollm to respect the enum order
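As context for the second bullet, a minimal C++ sketch of why enum order matters here; `llm_arch_example` and its enumerators are illustrative stand-ins, not the real llama.cpp enum:

```cpp
// Illustrative only: a stand-in for llama.cpp's architecture enum. Because
// enumerators take consecutive integer values, inserting a new architecture
// mid-list would silently renumber every entry after it, so the new entry
// goes after the last existing one.
enum llm_arch_example {
    LLM_ARCH_LLAMA,
    LLM_ARCH_SMOLLM,
    LLM_ARCH_CODESHELL, // appended after SMOLLM, keeping prior values stable
};
```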
b3428
CUDA: MMQ code deduplication + iquant support (#8495)
* CUDA: MMQ code deduplication + iquant support
* 1 less parallel job for CI build
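As a rough host-side illustration of what "code deduplication" means in this context: one shared routine templated on a per-type traits struct, instead of a hand-written copy per quantization format. This is a C++ sketch under assumed names (`q4_traits`, `mmq_row_dot`, `vec_dot`), not the actual CUDA MMQ kernels:

```cpp
#include <cstdint>

// Illustrative traits for a hypothetical 4-bit quant type.
struct q4_traits {
    struct block { float d; uint8_t qs[16]; }; // 32 4-bit values + scale
    static float vec_dot(const block & x, const block & y) {
        int32_t sumi = 0;
        for (int j = 0; j < 16; ++j) {
            sumi += (x.qs[j] & 0x0F) * (y.qs[j] & 0x0F)  // low nibbles
                  + (x.qs[j] >>   4) * (y.qs[j] >>   4); // high nibbles
        }
        return x.d * y.d * sumi;
    }
};

// One generic row dot product shared by every quant type; only the
// per-type traits differ, so adding a type (e.g. an iquant format)
// means adding a traits struct, not another kernel copy.
template <typename traits>
float mmq_row_dot(const typename traits::block * x,
                  const typename traits::block * y, int nblocks) {
    float sum = 0.0f;
    for (int i = 0; i < nblocks; ++i) {
        sum += traits::vec_dot(x[i], y[i]);
    }
    return sum;
}
```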
b3423
ggml : fix quant dot product with odd number of blocks (#8549)
* ggml : fix iq4_nl dot product with odd number of blocks
* ggml : fix odd blocks for ARM_NEON (#8556)
* ggml : fix q4_1
* ggml : fix q5_0
* ggml : fix q5_1
* ggml : fix iq4_nl metal
* ggml : fix q4_0
* ggml : fix q8_0
* ggml : remove special Q4_0 code for first 2 blocks
* ggml : fix sumf redefinition

Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
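The common failure mode behind these fixes is worth a sketch: unrolled or SIMD dot products consume quant blocks in pairs, so an odd block count needs an explicit scalar tail. The block layout and names below (`block_q8_example`, `dot_q8`) are simplified illustrations, not the real ggml types:

```cpp
#include <cstdint>

struct block_q8_example {
    float  d;       // per-block scale
    int8_t qs[32];  // quantized values
};

float dot_q8(const block_q8_example * x, const block_q8_example * y, int nb) {
    float sumf = 0.0f;
    int i = 0;
    // Main loop: two blocks per iteration, standing in for a SIMD path.
    for (; i + 1 < nb; i += 2) {
        for (int b = 0; b < 2; ++b) {
            int32_t sumi = 0;
            for (int j = 0; j < 32; ++j) {
                sumi += x[i + b].qs[j] * (int32_t) y[i + b].qs[j];
            }
            sumf += x[i + b].d * y[i + b].d * sumi;
        }
    }
    // Scalar tail: without this, the last block of an odd-length row
    // is silently dropped, which is the class of bug fixed above.
    for (; i < nb; ++i) {
        int32_t sumi = 0;
        for (int j = 0; j < 32; ++j) {
            sumi += x[i].qs[j] * (int32_t) y[i].qs[j];
        }
        sumf += x[i].d * y[i].d * sumi;
    }
    return sumf;
}
```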
b3422
convert-*.py: remove add_name from ChatGLMModel class (#8590)
b3416
CUDA: fix partial offloading for ne0 % 256 != 0 (#8572)
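A small worked example of the alignment this fix deals with, assuming the fixed tile width implied by the commit title; `pad_to` is a hypothetical helper, not a ggml API:

```cpp
#include <cassert>
#include <cstdint>

// Round n up to the next multiple of `multiple`, as a padded buffer
// size would be when a kernel works on fixed-width tiles.
constexpr int64_t pad_to(int64_t n, int64_t multiple) {
    return ((n + multiple - 1) / multiple) * multiple;
}

int main() {
    assert(pad_to(4096, 256) == 4096); // ne0 % 256 == 0: unchanged
    assert(pad_to(4100, 256) == 4352); // ne0 % 256 != 0: rounded up
}
```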
b3405
make/cmake: add missing force MMQ/cuBLAS for HIP (#8515)
b3399
fix ci (#8494)
b3387
llama : fix pre-tokenization of non-special added tokens (#8228)
* llama : fix mpt and olmo pre-tokenizer
* llama : pre-tokenize non-special user-defined tokens first
* llama : fix detection of control-like user-defined tokens
* convert_hf : identify which user-defined tokens are control tokens (only used in _set_vocab_gpt2() for now)
* convert_hf : identify more added control tokens for SPM tokenizers. This makes Gemma and Gemma-2 tokenize pretty much EVERYTHING correctly, including HTML tags and consecutive spaces, but it unfortunately requires model re-conversion. There seems to be a weird behavior of the HF tokenizer for Gemma, which prefers the 16-space token over lengthier space tokens, while the SentencePiece tokenizer does not do this. (The implementation in llama.cpp has the same behavior as SentencePiece.)
* llama : fix wrong pre-tokenization of byte tokens
* llama : fix Viking pre-tokenizer regex. The order was previously wrong, which caused errors in some tests.
* llama : fix command-r detokenization
* convert_hf : reduce usages of the UNKNOWN token type
* llama : add UNKNOWN tokens in the special tokens cache
* convert_hf : reduce usages of UNKNOWN for InternLM2. This makes the changes from #8321 more consistent with the other changes made here.
* test-tokenizer-random : reduce potential conflicts with #8379
* test-tokenizer-random : add a failing edge case for falcon
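To make "pre-tokenize non-special user-defined tokens first" concrete, here is a hedged toy version of the idea in C++: split the raw text on the exact strings of user-defined added tokens before any regex pre-tokenizer runs, so those tokens can never be broken apart. The `fragment` type and `split_on_added_tokens` are illustrative, not llama.cpp's actual implementation:

```cpp
#include <string>
#include <utility>
#include <vector>

struct fragment {
    std::string text;
    bool is_added_token; // true if this fragment is a user-defined token
};

std::vector<fragment> split_on_added_tokens(
        const std::string & text,
        const std::vector<std::string> & added) {
    std::vector<fragment> frags = {{text, false}};
    for (const std::string & tok : added) {
        if (tok.empty()) continue; // guard against degenerate tokens
        std::vector<fragment> next;
        for (const fragment & f : frags) {
            // Already-matched tokens are opaque; never re-split them.
            if (f.is_added_token) { next.push_back(f); continue; }
            size_t pos = 0, hit;
            while ((hit = f.text.find(tok, pos)) != std::string::npos) {
                if (hit > pos) next.push_back({f.text.substr(pos, hit - pos), false});
                next.push_back({tok, true});
                pos = hit + tok.size();
            }
            if (pos < f.text.size()) next.push_back({f.text.substr(pos), false});
        }
        frags = std::move(next);
    }
    // Only the non-token fragments would then go through the normal
    // regex-based pre-tokenizer.
    return frags;
}
```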
b3386
vulkan : cmake integration (#8119)
* Add Vulkan to CMake pkg
* Add Sycl to CMake pkg
* Add OpenMP to CMake pkg
* Split generated shader file into separate translation unit
* Add CMake target for Vulkan shaders
* Update README.md
* Add make target for Vulkan shaders
* Use pkg-config to locate vulkan library
* Add vulkan SDK dep to ubuntu-22-cmake-vulkan workflow
* Clean up tabs
* Move sudo to apt-key invocation
* Forward GGML_EXTRA_LIBS to CMake config pkg
* Update vulkan obj file paths
* Add shaderc to nix pkg
* Add python3 to Vulkan nix build
* Link against ggml in cmake pkg
* Remove Python dependency from Vulkan build
* Code review changes
* Remove trailing newline
* Add cflags from pkg-config to fix w64devkit build
* Remove trailing whitespace
* Fix doc heading
* Make glslc required Vulkan component
* Remove clblast from nix pkg