Releases · OpenBMB/llama.cpp
b3662
flake.lock: Update (#9261)

Flake lock file updates:

• Updated input 'flake-parts':
    'github:hercules-ci/flake-parts/8471fe90ad337a8074e957b69ca4d0089218391d?narHash=sha256-XOQkdLafnb/p9ij77byFQjDf5m5QYl9b2REiVClC%2Bx4%3D' (2024-08-01)
  → 'github:hercules-ci/flake-parts/af510d4a62d071ea13925ce41c95e3dec816c01d?narHash=sha256-ODYRm8zHfLTH3soTFWE452ydPYz2iTvr9T8ftDMUQ3E%3D' (2024-08-30)

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/c374d94f1536013ca8e92341b540eba4c22f9c62?narHash=sha256-Z/ELQhrSd7bMzTO8r7NZgi9g5emh%2BaRKoCdaAv5fiO0%3D' (2024-08-21)
  → 'github:NixOS/nixpkgs/71e91c409d1e654808b2621f28a327acfdad8dc2?narHash=sha256-GnR7/ibgIH1vhoy8cYdmXE6iyZqKqFxQSVkFgosBh6w%3D' (2024-08-28)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
b3660
readme : refactor API section + remove old hot topics
b3645
llava : the function "clip" should be int (#9237)
b3615
[SYCL] Add oneDNN primitive support (#9091)

* add onednn
* add sycl_f16
* add dnnl stream
* add engine map
* use dnnl for intel only
* use fp16fp16fp16
* update doc
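For context, the sketch below shows the basic oneDNN (dnnl) engine → stream → primitive flow that an fp16 matmul goes through; it is a standalone illustration under assumed shapes and device index, not the SYCL backend code from this release.

```cpp
// Standalone sketch of a oneDNN fp16 matmul: create an engine and stream,
// describe src/weights/dst, then build and execute the matmul primitive.
// Shapes and the GPU index are illustrative, not taken from llama.cpp.
#include <dnnl.hpp>

int main() {
    using namespace dnnl;

    engine eng(engine::kind::gpu, 0);   // assumes at least one GPU device is present
    stream strm(eng);

    const memory::dim M = 4, K = 8, N = 16;
    memory::desc a_md({M, K}, memory::data_type::f16, memory::format_tag::ab);
    memory::desc b_md({K, N}, memory::data_type::f16, memory::format_tag::ab);
    memory::desc c_md({M, N}, memory::data_type::f16, memory::format_tag::ab);

    memory a_mem(a_md, eng), b_mem(b_md, eng), c_mem(c_md, eng);

    // fp16 inputs and fp16 output ("fp16fp16fp16" in the commit notes above)
    matmul::primitive_desc pd(eng, a_md, b_md, c_md);
    matmul prim(pd);

    prim.execute(strm, {{DNNL_ARG_SRC,     a_mem},
                        {DNNL_ARG_WEIGHTS, b_mem},
                        {DNNL_ARG_DST,     c_mem}});
    strm.wait();
    return 0;
}
```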
b3621
Merge branch 'prepare-PR-of-minicpm-v2.6' into master
b3209
ggml : remove ggml_task_type and GGML_PERF (#8017)

* ggml : remove ggml_task_type and GGML_PERF
* check abort_callback on main thread only
* vulkan : remove usage of ggml_compute_params
* remove LLAMA_PERF
b3078
llama : offload to RPC in addition to other backends (#7640)

* llama : offload to RPC in addition to other backends
* - fix copy_tensor being called on the src buffer instead of the dst buffer
  - always initialize views in the view_src buffer
  - add RPC backend to Makefile build
  - add endpoint to all RPC object names
* add rpc-server to Makefile
* Update llama.cpp

Co-authored-by: slaren <slarengh@gmail.com>

---------

Co-authored-by: slaren <slarengh@gmail.com>
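As a rough illustration of the user-facing side, the sketch below loads a model with layers offloaded to an RPC endpoint through the llama.cpp C API of this era; the model path, the endpoint, and the assumption that llama_model_params exposes an rpc_servers field at this tag are all placeholders, and it presumes an rpc-server is already listening.

```cpp
// Minimal sketch: offload layers to a remote ggml RPC server in addition to
// the local backend. Endpoint and model path are placeholders.
#include "llama.h"
#include <cstdio>

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 99;                 // offload as many layers as possible
    mparams.rpc_servers = "192.0.2.10:50052";  // comma-separated rpc-server endpoints (assumed field)

    llama_model * model = llama_load_model_from_file("model.gguf", mparams);
    if (model == nullptr) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // ... create a context with llama_new_context_with_model() and run inference as usual ...

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```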
b3026
llama : support small Granite models (#7481)

* Add optional MLP bias for Granite models

Add optional MLP bias for ARCH_LLAMA to support Granite models.
Partially addresses ggerganov/llama.cpp/issues/7116
Still needs some more changes to properly support Granite.

* llama: honor add_space_prefix from the model configuration

propagate the add_space_prefix configuration from the HF model configuration
to the gguf file and honor it with the gpt2 tokenizer.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

* llama: add support for small granite models

it works only for the small models 3b and 8b. The convert-hf-to-gguf.py
script uses the vocabulary size of the granite models to detect granite and
set the correct configuration.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

---------

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Co-authored-by: Steffen Roecker <sroecker@redhat.com>
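The propagated flag ends up as GGUF metadata, so it can be checked directly in a converted file. Below is a small sketch using ggml's gguf reader; the key name "tokenizer.ggml.add_space_prefix" and the file name are assumptions for illustration, not code from the PR.

```cpp
// Sketch: read the add_space_prefix flag that the converter writes into the
// GGUF metadata. Key name and model file are assumed, not taken from the PR.
#include "ggml.h"
#include <cstdio>

int main() {
    struct gguf_init_params params = { /*.no_alloc =*/ true, /*.ctx =*/ nullptr };
    struct gguf_context * ctx = gguf_init_from_file("granite-3b.gguf", params);
    if (ctx == nullptr) {
        fprintf(stderr, "failed to open gguf file\n");
        return 1;
    }

    const int key_id = gguf_find_key(ctx, "tokenizer.ggml.add_space_prefix");
    if (key_id >= 0) {
        printf("add_space_prefix = %s\n", gguf_get_val_bool(ctx, key_id) ? "true" : "false");
    } else {
        printf("add_space_prefix not set; the loader falls back to its default\n");
    }

    gguf_free(ctx);
    return 0;
}
```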
b3025
vulkan: properly initialize vulkan devices for LLAMA_SPLIT_MODE_NONE …
b2979
Add missing inference support for GPTNeoXForCausalLM (Pythia and GPT-…