[pull] master from ggerganov:master #32

pull · 2024-02-07T09:37:22Z

See Commits and Changes for more details.

* include total "num_slots" in default_generation_settings_for_props
* cleanup total_slots return value in /props endpoint
* update /props endpoint docs with total_slots
* remove num_slots from default_generation_settings_for_props
* update /props endpoint section

* support minicpm arch.
* fix tab/space typo.
* convert minicpm model via convert-hf-gguf.py
* try to make tokenizer work
* fix bug for quantize minicpm
* fix for flake8 lint
* remove convert-minicpm.py
* fix for editorconfig
* correct minicpm model type (size)
* constants expanded for minicpm
* Minor change of the constant names for minicpm

* Initial Vulkan multi-gpu implementation
  Move most global variables into backend context
* Add names to backend device functions
* Add further missing cleanup code
* Reduce code duplication in tensor split layer assignment
* generalize LLAMA_SPLIT_LAYER for all backends, do not expose device count and memory in llama.h
* Only do device info print in the beginning and initialize one backend for cpu assist
  Add missing cleanup code
* Rework backend memory management to make sure devices and buffers get properly allocated and freed
* Rename cpu assist free function

Co-authored-by: slaren <slarengh@gmail.com>

Sang-Kil Park and others added 8 commits February 6, 2024 23:28

convert : fix TypeError on GPT-2 vocab.json (#5288)

f68664a

readme : update ui list (#5354)

9a697d8

readme : modernize (#5379)

ed0bf32

* first cleanup, update everything to Llama 2 and remove outdated content
* Delete SHA256SUMS
* make build instructions generic
* recommend Q4_K_M quantization method
* Update README.md

llava-cli : always tokenize special tokens (#5382)

0ef46da

* llava-cli: tokenize special tokens in prompt
* llava-cli: use the escape CLI argument, remove incomplete separate escaping process

[SYCL] update install make by w64devkit (#5297)

10afa6f

pull bot added the ⤵️ pull label Feb 7, 2024

JohannesGaessler and others added 2 commits February 7, 2024 12:40

CUDA: fixed mmvq kernel for bs 2,3,4 and -sm row (#5386)

aa7ab99

Add Ava in the list of llama.cpp UIs (#4362)

b906596

teleprint-me closed this Feb 7, 2024
