Disable KV cache shifting automatically for unsupported models #11053

Merged (2 commits) on Jan 3, 2025

MollySophia
Collaborator

Disable KV cache shifting automatically for unsupported models instead of exiting directly.

This makes it easier to use models that don't support KV cache shifting.
Currently, in arg.cpp, --no-context-shift is only enabled for LLAMA_EXAMPLE_MAIN, LLAMA_EXAMPLE_SERVER, LLAMA_EXAMPLE_IMATRIX, and LLAMA_EXAMPLE_PERPLEXITY. As a result, running llama-parallel with a recurrent model, for example, fails with a message indicating that context shift is not supported, even though --no-context-shift isn't an available parameter for llama-parallel.
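
For illustration, the behavior change described above could look roughly like the following. This is a minimal, self-contained sketch and not the code from this PR: `sketch_params`, `sketch_model`, and `disable_unsupported_ctx_shift` are hypothetical stand-ins, and the assumption that context shifting is requested by default is mine.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-ins for illustration only; not the real llama.cpp types.
struct sketch_params {
    // Assumed default: context shifting is requested unless the user opts out
    // (e.g. via --no-context-shift in the examples that expose that flag).
    bool ctx_shift = true;
};

struct sketch_model {
    // Whether the model's KV cache can be shifted (assumed false for
    // recurrent models, as described in the PR text).
    bool can_shift;
};

// Instead of exiting when the model cannot shift its KV cache, warn and
// turn the feature off. Returns true if context shifting was disabled.
inline bool disable_unsupported_ctx_shift(sketch_params & params, const sketch_model & model) {
    if (params.ctx_shift && !model.can_shift) {
        std::fprintf(stderr,
                     "warning: KV cache shifting is not supported for this model, disabling it\n");
        params.ctx_shift = false;
        return true;
    }
    return false;
}
```

With a check of this shape, a tool such as llama-parallel would no longer need its own --no-context-shift flag for unsupported models, since the setting is adjusted once the model is known.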

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Review comment on common/common.cpp (outdated, resolved)
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
@ggerganov ggerganov merged commit 4b0c638 into ggerganov:master Jan 3, 2025
47 checks passed
2 participants