Disable KV cache shifting automatically for unsupported models #11053

MollySophia · 2025-01-03T08:36:41Z

Disable KV cache shifting automatically for unsupported models instead of exiting directly.

This makes it easier to run models that don't support KV cache shifting.
Currently, in arg.cpp, --no-context-shift is only enabled for LLAMA_EXAMPLE_MAIN, LLAMA_EXAMPLE_SERVER, LLAMA_EXAMPLE_IMATRIX, and LLAMA_EXAMPLE_PERPLEXITY. As a result, running llama-parallel with a recurrent model, for example, fails with a message indicating that context shifting is not supported, yet --no-context-shift isn't an available parameter for llama-parallel.
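
The intended behavior can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual diff: sketch_params, sketch_context, and disable_ctx_shift_if_unsupported are hypothetical stand-ins for the real structures in common/common.cpp, and the warning text is only an example.

```cpp
#include <cstdio>

// Hypothetical stand-in for the common parameters; not the real llama.cpp type.
struct sketch_params {
    bool ctx_shift = true; // context shifting requested (no --no-context-shift given)
};

// Hypothetical stand-in for a loaded context.
struct sketch_context {
    bool can_shift; // whether this model's KV cache supports shifting
};

// If shifting was requested but the model cannot do it, warn and turn it off
// instead of aborting. Returns true when shifting had to be disabled.
inline bool disable_ctx_shift_if_unsupported(sketch_params & params, const sketch_context & ctx) {
    if (params.ctx_shift && !ctx.can_shift) {
        std::fprintf(stderr, "warning: KV cache shifting is not supported for this model, disabling it\n");
        params.ctx_shift = false;
        return true;
    }
    return false;
}
```

With a fallback like this, a tool that does not expose --no-context-shift (such as llama-parallel) can still load a recurrent model, because the setting is corrected at initialization time rather than treated as a fatal error.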

Disable KV cache shifting automatically for unsupported models

672983a

instead of exiting directly

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

ggerganov approved these changes Jan 3, 2025

Update common/common.cpp

9c529a7

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ggerganov merged commit 4b0c638 into ggerganov:master Jan 3, 2025
47 checks passed
