Add lightweight tests for LoRA #8708
I'll need time to train some adapters for testing (maybe I'll extend it to test other architectures than llama), so I created this TODO for tracking.
Hi! I'd be happy to help with this issue. Is there anything I can start working on, like writing the test script or training the gguf adapters?
Part of this task will be done with #8857, where I will do a simple lora hotswap test. The training part will be a bit tricky, so it would be nice if you could help. What's missing now is to test with other archs like gemma, phi3, etc. This requires doing these steps:
The goal is to have overfitted models smaller than 50MB in size, which will be useful for the CI test. Because the models are overfitted, we expect them to output the same thing every time. If they don't, then we have a problem in the code.
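The determinism check described above could be sketched like this (a hypothetical helper, not existing test code; the `llama-cli` flags and file names are assumptions):

```python
import subprocess

def generate(model: str, adapter: str, prompt: str) -> str:
    """Run llama-cli deterministically (greedy sampling, fixed seed) and return stdout."""
    cmd = ["llama-cli", "-m", model, "--lora", adapter,
           "-p", prompt, "--temp", "0", "--seed", "42", "-n", "32"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def outputs_identical(runs: list) -> bool:
    """An overfitted adapter should produce byte-identical output on every run."""
    return len(set(runs)) <= 1
```

The CI test would call `generate()` a few times per architecture and fail if `outputs_identical()` returns `False`.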
@ngxson I'm almost there with Gemma-2. Namely:
The only issue is that in the last step the output does not seem to be affected by the adapter (though initial debugging suggests llama-cli is loading the adapter successfully). I need a bit more time to figure this out. In the meantime, how should I organise this code and these files?
Sounds great, thanks. Btw I forgot to mention, don't use stock […]. Also, you can take the data parsing code from https://huggingface.co/ggml-org/stories15M_MOE/blob/main/finetune.ipynb — this improves the chance of generating words like […]
I don't have a clear idea for now, but I think we can start by adding a script under […]
You can put the model on your hf account for now; we will see later if we can move it to ggml-org.
@ngxson Quick update on inference:
As many things could be going wrong, I wanted to first check that the layer's weights in the safetensors file are the same as in the gguf. I am using this code to print the weights, but the printed weights are of order […]

Question: how do I print out the ggml tensor's weights?
Nice, thanks for the info.
You can maybe have a look at […]. Another option is to use […]. You can then use pytorch to compare the tensors.
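A sketch of the comparison suggested above (assumes the `gguf` and `safetensors` Python packages are installed; the file names and the tolerance are assumptions, and tensor names may differ between the two formats):

```python
import numpy as np

def tensors_match(a: np.ndarray, b: np.ndarray, atol: float = 1e-4) -> bool:
    """Compare two weight tensors, tolerating small conversion/rounding noise."""
    if a.shape != b.shape:
        return False
    return np.allclose(a.astype(np.float32), b.astype(np.float32), atol=atol)

# Loading both sides (run manually; requires `pip install gguf safetensors`):
# from gguf import GGUFReader
# from safetensors import safe_open
# gg = {t.name: t.data for t in GGUFReader("lora_adapter.gguf").tensors}
# with safe_open("adapter_model.safetensors", framework="np") as f:
#     st = {k: f.get_tensor(k) for k in f.keys()}
# # then map names between the two dicts and call tensors_match() per layer
```

If the tolerance check fails only for some layers, that would point at a conversion bug rather than an inference bug.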
This issue was closed because it has been inactive for 14 days since being marked as stale.
Ref: #8687 (comment)
(cc @ggerganov)
TODO:
- Run `llama-cli -m base_model.gguf --lora lora_adapter.gguf`
- Merge the adapter with `llama-export-lora`, then re-run the merged.gguf to verify it outputs the same thing as above
- Optionally: make some small stories models with a different arch, for example gemma, phi, ...
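The steps above could be driven by a small Python script along these lines (a sketch, not the actual test; the `llama-export-lora` flags shown are assumptions and should be checked against its `--help`):

```python
import subprocess

def run(cmd: list) -> str:
    """Run a command and return its stdout."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def same_completion(a: str, b: str) -> bool:
    """The merged model should reproduce the hot-swapped output."""
    return a.strip() == b.strip()

if __name__ == "__main__":
    prompt = ["-p", "Once upon a time", "--temp", "0", "-n", "32"]
    # 1. Generate with the adapter hot-swapped at load time
    with_lora = run(["llama-cli", "-m", "base_model.gguf",
                     "--lora", "lora_adapter.gguf"] + prompt)
    # 2. Merge the adapter into the base weights
    run(["llama-export-lora", "-m", "base_model.gguf",
         "--lora", "lora_adapter.gguf", "-o", "merged.gguf"])
    # 3. Re-run with the merged model and compare
    merged = run(["llama-cli", "-m", "merged.gguf"] + prompt)
    assert same_completion(with_lora, merged)
```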