Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] master from ggerganov:master #32

Closed
wants to merge 10 commits into from

Conversation

pull[bot]
Copy link

@pull pull bot commented Feb 7, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

Sang-Kil Park and others added 8 commits February 6, 2024 23:28
* include total "num_slots" in default_generation_settings_for_props

* cleanup total_slots return value in /props endpoint

* update /props endpoint docs with total_slots

* remove num_slots from default_generation_settings_for_props

* update /props endpoint section
* support minicpm arch.

* fix tab/space typo.

* convert minicpm model via convert-hf-gguf.py

* try to make tokenizer work

* fix bug for quantize minicpm

* fix for flake8 lint

* remove convert-minicpm.py

* fix for editorconfig

* correct minicpm model type (size)

* constants expanded for minicpm

* Minor change of the constant names for minicpm
* first cleanup, update everything to Llama 2 and remove outdated content

* Delete SHA256SUMS

* make build instructions generic

* recommend Q4_K_M quantization method

* Update README.md
* Initial Vulkan multi-gpu implementation

Move most global variables into backend context

* Add names to backend device functions

* Add further missing cleanup code

* Reduce code duplication in tensor split layer assignment

* generalize LLAMA_SPLIT_LAYER for all backends, do not expose device count and memory in llama.h

* Only do device info print in the beginning and initialize one backend for cpu assist

Add missing cleanup code

* Rework backend memory management to make sure devices and buffers get properly allocated and freed

* Rename cpu assist free function

---------

Co-authored-by: slaren <slarengh@gmail.com>
* llava-cli: tokenize special tokens in prompt

* llava-cli: use the escape CLI argument, remove incomplete separate escaping process
@pull pull bot added the ⤵️ pull label Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants