llama: Add support for RWKV v7 architecture #11452
base: master
Conversation
There isn't much performance gain though; it's just for more op coverage. Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Update: added support for fla-hub's rwkv7 HF model format (https://huggingface.co/fla-hub/rwkv7-1.5B-world).
Just a heads up, this will likely take some time to merge - I want to finish #11213 first and then figure out how to fit RWKV in the new code, likely with its own implementation of …
That's great! I can help with that too.
Great, keep an eye on the #11213 PR. It's still very messy, but I hope it will soon start to make sense.
They pass on my M2 and M4 devices :| Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
@BlinkDL's explanation of RWKV v7:
RWKV-7 as a meta-in-context learner
There are also plenty of test results for trained models (currently 0.1B and 0.4B) posted on his X account, and larger models are coming in the next several days.
Currently available RWKV v7 model repos in HF format:
https://huggingface.co/SmerkyG/RWKV7-Goose-0.1B-World2.8-HF (not an officially published one; tensor names are expected to change in the future)
https://huggingface.co/mollysama/rwkv-7-world-0b4-hf
https://huggingface.co/mollysama/rwkv-7-world-1b5-hf
https://huggingface.co/RWKV-Red-Team/ARWKV-7B-Preview-0.1 (hybrid model with RWKV v7 "attn" and Qwen2.5 7B's MLP, distilled from Qwen2.5)
This PR contains:
- GGML_OP_L2_NORM, which applies PyTorch-style L2 normalization along the rows. Tested with the CPU, CUDA, SYCL, Vulkan and Metal backends.
- GGML_OP_RWKV_WKV7, which is the core of the RWKV v7 architecture. The naive recurrent wkv7 kernel is implemented for CPU, CUDA, SYCL, Vulkan and Metal. (A rough reference sketch of both ops follows below.)

TODO:
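For reference, here is a minimal CPU-only sketch of what the two ops compute, based on the description above. This is not the actual ggml implementation: the function names (`l2_norm_rows`, `wkv7_step`), the single-head layout, and the exact form of the wkv7 recurrence written in the comments are illustrative assumptions, so please check the kernels in this PR for the authoritative version.

```c
#include <math.h>
#include <stddef.h>
#include <stdio.h>

// PyTorch-style L2 normalization along each row: x / max(||x||_2, eps).
// Illustrative only; the real GGML_OP_L2_NORM kernel operates on ggml tensors.
static void l2_norm_rows(float * x, size_t rows, size_t cols, float eps) {
    for (size_t r = 0; r < rows; r++) {
        float * row = x + r * cols;
        float sum = 0.0f;
        for (size_t c = 0; c < cols; c++) {
            sum += row[c] * row[c];
        }
        const float norm = fmaxf(sqrtf(sum), eps);
        for (size_t c = 0; c < cols; c++) {
            row[c] /= norm;
        }
    }
}

// One naive recurrent wkv7 step for a single head of size N (assumed form).
// Assumed matrix recurrence: S <- S*diag(w) + v*k^T + (S*a)*b^T, readout y = S*r.
// state is an N x N matrix stored row-major: state[i*N + j].
static void wkv7_step(size_t N,
                      const float * r, const float * w, const float * k,
                      const float * v, const float * a, const float * b,
                      float * state, float * y) {
    for (size_t i = 0; i < N; i++) {
        // sa = dot(a, state[i, :]) computed from the previous state row
        float sa = 0.0f;
        for (size_t j = 0; j < N; j++) {
            sa += a[j] * state[i*N + j];
        }
        float out = 0.0f;
        for (size_t j = 0; j < N; j++) {
            float s = state[i*N + j];
            s = s * w[j]        // per-channel decay
              + k[j] * v[i]     // rank-1 key/value update
              + sa  * b[j];     // delta-rule style correction
            state[i*N + j] = s;
            out += s * r[j];    // readout with the receptance vector
        }
        y[i] = out;
    }
}

int main(void) {
    enum { N = 4 };
    float state[N*N] = {0};
    float r[N] = {0.1f, 0.2f, 0.3f, 0.4f};
    float w[N] = {0.9f, 0.9f, 0.9f, 0.9f};
    float k[N] = {0.5f, -0.5f, 0.25f, 0.0f};
    float v[N] = {1.0f, 2.0f, 3.0f, 4.0f};
    float a[N] = {0.1f, 0.1f, 0.1f, 0.1f};
    float b[N] = {0.2f, 0.2f, 0.2f, 0.2f};
    float y[N];

    l2_norm_rows(k, 1, N, 1e-12f);           // normalize the key, as an example use of the op
    wkv7_step(N, r, w, k, v, a, b, state, y); // one recurrent timestep

    for (int i = 0; i < N; i++) {
        printf("y[%d] = %f\n", i, y[i]);
    }
    return 0;
}
```

Under the form assumed here, each wkv7 step needs the six per-token vectors r, w, k, v, a, b plus the running per-head state, which is why the op is recurrent and processes tokens one at a time in the naive kernels.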
Note: current benchmarks of ARWKV7-7B f16 show that it is much faster than RWKV v6 7B when prefilling (still a bit slower than Qwen2.5 7B).