Remove Falcon style ROPE #35
Conversation
I have updated the model on Hugging Face.
Hi @magician-blue, so do you mean the tl-chat model on HF is not compatible with this repo anymore?
@tairov We can still run it with our repo. Change from

```
mojo llama2.mojo tl-chat.bin \
    -r falcon \
    -z tok_tl-chat.bin \
    -n 256 -t 0 -s 100 -i "<|im_start|>user\nGive me a python function to generate Fibonacci sequence<|im_end|>\n<|im_start|>assistant\n"
```

to

```
mojo llama2.mojo tl-chat.bin \
    -r llama \
    -z tok_tl-chat.bin \
    -n 256 -t 0 -s 100 -i "<|im_start|>user\nGive me a python function to generate Fibonacci sequence<|im_end|>\n<|im_start|>assistant\n"
```
If we can convert all HF llama models (they use falcon-style RoPE) to llama-style RoPE, then we only need to implement one type of RoPE in our repo. This is what llama2.c and llama.cpp are doing.
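If I read this right, the conversion itself is just a row permutation of each layer's q/k projection weights. Here is a minimal numpy sketch of the reverse permutation, assuming HF's half-split layout (the function name and exact axis order are illustrative; llama2.c's export.py contains an equivalent helper):

```python
import numpy as np

def permute_reverse(w, n_heads, dim1, dim2):
    # Undo HF's half-split (falcon/NeoX) row layout of a projection matrix
    # so that the original interleaved llama-style RoPE matches the weights.
    # w has shape (dim1, dim2), with dim1 = n_heads * head_dim.
    return (w.reshape(n_heads, 2, dim1 // n_heads // 2, dim2)
             .swapaxes(1, 2)
             .reshape(dim1, dim2))
```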
Looks cool. Could you share some details on where this convert script comes from?
The original convert file comes from llama2.c, and I modified parts of it to support GQA.
The next thing I will do is convert openllama3b (12 GB RAM), llama2-chat-7b (28 GB RAM), and vicuna-7b to test my converter and our llama2.mojo.
In this case I guess convert.py is not needed in the repo; the model could be converted using the script from llama2.c.
Thank you!
All HF llama models use falcon-style RoPE, and we can convert them to the original llama-style RoPE with a permutation of the q/k projection weights.
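For reference, here is a rough numpy sketch of the two conventions (function names and the single-head shapes are mine, not from the repo): falcon/NeoX-style RoPE, as used by HF transformers, pairs element i with element i + d/2, while the original llama style rotates adjacent pairs (2i, 2i+1).

```python
import numpy as np

def rope_llama(x, cos, sin):
    # Original llama style: rotate adjacent pairs (x[2i], x[2i+1]).
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin
    out[1::2] = x1 * sin + x2 * cos
    return out

def rope_falcon(x, cos, sin):
    # Falcon/NeoX style: pair x[i] with x[i + d/2].
    half = x.shape[-1] // 2
    x1, x2 = x[:half], x[half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos])
```

Both apply the same rotations to the same frequency pairs, just with the rows laid out differently, which is why a one-time permutation of the wq/wk rows maps one convention onto the other.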
This pull request solves the bug in converting HF GQA models to gguf format.
I learned the idea from it and fixed the similar bug in llama2.c's export.py.
Now I have successfully converted TinyLlama-1.1B-Chat to llama-style RoPE, so we can remove the falcon RoPE part.
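For GQA checkpoints, the subtlety (the bug class mentioned above) is that k_proj has fewer rows than q_proj, so it must be permuted with the KV head count and KV dimension rather than the query ones. A hedged sketch of the calls, reusing the permute_reverse sketch from earlier in this thread (variable names assumed):

```python
head_dim = dim // n_heads          # per-head dimension
kv_dim = n_kv_heads * head_dim     # K/V projection output dim under GQA

wq = permute_reverse(q_proj_weight, n_heads, dim, dim)
wk = permute_reverse(k_proj_weight, n_kv_heads, kv_dim, dim)  # not n_heads
# V and the remaining weights are unaffected by RoPE layout and copy over as-is.
```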
I have uploaded the new export.py and llama2.mojo.
Details: run

```
python export.py tl-chat.bin --hf PY007/TinyLlama-1.1B-Chat-v0.2 --version 0
```

to convert the model.