
Unable to generate imatrix for DeepSeek, "KV cache shifting is not supported for this model (--no-context-shift to disable)" #10755

Closed
bartowski1182 opened this issue Dec 10, 2024 · 1 comment · Fixed by #10766

Comments

@bartowski1182
Contributor

bartowski1182 commented Dec 10, 2024

Name and Version

b4273 for Ubuntu

Operating systems

Linux

GGML backends

CPU, CUDA

Hardware

EPYC 7702, 3090

Models

DeepSeek 2.5 1210

https://huggingface.co/deepseek-ai/DeepSeek-V2.5-1210

Problem description & steps to reproduce

When running llama-imatrix, I receive the error:

KV cache shifting is not supported for this model (--no-context-shift to disable)

However, the error persists even when I pass that flag:

./llama-imatrix -m /models_out/DeepSeek-V2.5-1210-GGUF/DeepSeek-V2.5-1210-Q8_0.gguf -f /training_dir/calibration_datav3.txt --output-file /models_out/DeepSeek-V2.5-1210-GGUF/DeepSeek-V2.5-1210.imatrix -t 120 --no-context-shift

I also tried flash attention (-fa), to no avail.

First Bad Commit

No response

Relevant log output

llama_new_context_with_model: n_ctx_per_seq (512) < n_ctx_train (163840) -- the full capacity of the model will not be utilized
llama_kv_cache_init:        CPU KV buffer size =  2400.00 MiB
llama_new_context_with_model: KV self size  = 2400.00 MiB, K (f16): 1440.00 MiB, V (f16):  960.00 MiB
llama_new_context_with_model:        CPU  output buffer size =     0.39 MiB
llama_new_context_with_model:      CUDA0 compute buffer size =  1422.00 MiB
llama_new_context_with_model:  CUDA_Host compute buffer size =    81.01 MiB
llama_new_context_with_model: graph nodes  = 4480
llama_new_context_with_model: graph splits = 1080 (with bs=512), 1 (with bs=1)
common_init_from_params: KV cache shifting is not supported for this model (--no-context-shift to disable)
main : failed to init
@bartowski1182
Contributor Author

Adding LLAMA_EXAMPLE_IMATRIX to the --no-context-shift argument resolves the issue; I assume that's a reasonable fix?
