Fix graph for RWKV6Qwen2 #11445

MollySophia · 2025-01-27T09:04:37Z

The token-shifting part was not done correctly in the previous implementation: the token-shift state wasn't copied back to k_cache after a decode. As a result, the model was always lerping towards zero when decoding. Prefill (and, as a result, PPL evaluation) wasn't affected.
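
To illustrate the failure mode, here is a minimal toy sketch. It is not the llama.cpp graph code: `TokenShift`, `store_state`, and `run` are hypothetical names, tokens are plain scalars, and a single fixed `mu` stands in for the model's real mixing weights. It assumes the cached shift state starts at zero and only shows why skipping the store after a step leaves single-token decode lerping towards zero, while a one-batch prefill is unaffected.

```python
def lerp(x, x_prev, mu):
    # Token shift: mix the current token with the previous one.
    return x + (x_prev - x) * mu


class TokenShift:
    """Toy stand-in for a token-shift layer with a cached previous token."""

    def __init__(self, mu, store_state):
        self.mu = mu
        self.store_state = store_state  # False models the missing cache store
        self.prev = 0.0                 # cached last token, assumed zero-initialized

    def forward(self, xs):
        out = []
        prev = self.prev                # load the shift state from the cache
        for x in xs:
            out.append(lerp(x, prev, self.mu))
            prev = x                    # inside a batch the shift is always correct
        if self.store_state:
            self.prev = prev            # the step that was missing: write the state back
        return out


def run(tokens, n_prefill, store_state, mu=0.5):
    # Prefill the first n_prefill tokens in one batch, then decode one token at a time.
    layer = TokenShift(mu, store_state)
    out = layer.forward(tokens[:n_prefill])
    for x in tokens[n_prefill:]:
        out += layer.forward([x])
    return out


tokens = [1.0, 2.0, 3.0, 4.0]
full = run(tokens, 4, store_state=False)   # pure prefill: unaffected by the bug
fixed = run(tokens, 2, store_state=True)   # prefill + decode with the store
buggy = run(tokens, 2, store_state=False)  # prefill + decode without the store
```

In this toy setup `fixed` matches `full`, while the decoded entries of `buggy` collapse to `x * (1 - mu)` because the previous token is always read as zero.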

Somehow this mistake didn't affect text generation much either lol (maybe the large 32B model already got enough context information into the wkv state?). That's why the bug wasn't found earlier.

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

Animaxx added a commit to Animaxx/llama.cpp that referenced this pull request Jan 28, 2025

https://github.com/ggerganov/llama.cpp/pull/11445

9e2634d

slaren approved these changes Jan 28, 2025

MollySophia merged commit 325afb3 into ggml-org:master Jan 29, 2025
45 checks passed

ggerganov mentioned this pull request Jan 29, 2025

llama : refactor llama_kv_cache, llama_context and llm_build_context #11213

Draft

21 tasks

tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025

llama: fix missing k_cache store for rwkv6qwen2 (ggml-org#11445)

055d3aa

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025

llama: fix missing k_cache store for rwkv6qwen2 (ggml-org#11445)

29a94ae

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025

llama: fix missing k_cache store for rwkv6qwen2 (ggml-org#11445)

ed7446e

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
