Fix graph for RWKV6Qwen2 #11445

Merged: 1 commit into ggml-org:master on Jan 29, 2025

Conversation

MollySophia (Collaborator)

The token-shift state was not handled correctly in the previous implementation: it was not copied back to k_cache after a decode. As a result, the model was always lerping towards zero when decoding. Prefill (and, as a result, PPL evaluation) was not affected.

Somehow this mistake didn't affect text generation much either lol (maybe the large 32B model already gets enough context information into the wkv state?). That's why the bug wasn't found earlier.
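
For context, here is a minimal sketch of the kind of write-back that was missing, expressed with the ggml graph API: after a decode batch, the last token's embedding has to be copied back into the token-shift slot of the recurrent state cache, so the next step lerps against the real previous token instead of zeros. The function and tensor names (`write_back_token_shift`, `cur`, `token_shift_state`) are hypothetical; this is not the actual patch:

```cpp
#include "ggml.h"

// Hypothetical sketch, not the actual fix: schedule a copy of the last
// token's embedding into the token-shift state so it survives past this
// decode call. Without it, the next decode shifts against a zero vector,
// i.e. the token-shift lerp always pulls towards zero.
static void write_back_token_shift(
        struct ggml_context * ctx,
        struct ggml_cgraph  * gf,
        struct ggml_tensor  * cur,               // layer activations, [n_embd, n_tokens]
        struct ggml_tensor  * token_shift_state, // cache slot holding n_embd values
        int64_t               n_embd,
        int64_t               n_tokens) {
    // view of the last token's embedding within the batch
    struct ggml_tensor * last_tok = ggml_view_1d(
        ctx, cur, n_embd,
        (size_t)(n_tokens - 1) * n_embd * ggml_element_size(cur));

    // add the copy to the forward graph so it runs with the rest of the decode
    ggml_build_forward_expand(gf, ggml_cpy(ctx, last_tok, token_shift_state));
}
```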

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Animaxx added a commit to Animaxx/llama.cpp that referenced this pull request Jan 28, 2025
MollySophia merged commit 325afb3 into ggml-org:master on Jan 29, 2025
45 checks passed
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025