AWQ: Activation-aware Weight Quantization??? #1685
https://github.com/mit-han-lab/llm-awq
AWQ: Activation-aware Weight Quantization sounds interesting 🧐🧐🧐
Comments
Are you interested in implementing this new algorithm in Llama.cpp? The performance with 3 bits seems amazing.
Sorry, I'm not a pro developer 😔
If my understanding is correct, GPTQ and AWQ weights are stored in very similar formats and could be stored in the same format.
Hi everyone, I have made a PR to add AWQ. I would really appreciate comments to make it better, thanks!
This issue was closed because it has been inactive for 14 days since being marked as stale.
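For readers who want a concrete picture of what the linked method does, here is a minimal sketch of the activation-aware scaling idea in Python, assuming only NumPy. It is not the code from mit-han-lab/llm-awq or from the PR mentioned above; the function names, the fixed alpha exponent, the round-to-nearest group quantizer, and the toy data are illustrative assumptions.

```python
# Hedged sketch (not llama.cpp or mit-han-lab code): the core AWQ idea as described in
# the paper -- derive per-input-channel scales s from activation statistics, quantize
# W * s with low-bit group quantization, then fold 1/s back (in practice into the
# preceding operation) so the layer output is approximately unchanged.
import numpy as np

def group_quantize(w, bits=3, group_size=128):
    """Asymmetric round-to-nearest quantization per group of input channels (dequantized output)."""
    qmax = (1 << bits) - 1
    out = np.empty_like(w)
    for g in range(0, w.shape[1], group_size):
        block = w[:, g:g + group_size]
        lo = block.min(axis=1, keepdims=True)
        hi = block.max(axis=1, keepdims=True)
        scale = np.maximum(hi - lo, 1e-8) / qmax
        q = np.clip(np.round((block - lo) / scale), 0, qmax)
        out[:, g:g + group_size] = q * scale + lo  # store dequantized values for the demo
    return out

def awq_like_quantize(w, act_samples, bits=3, alpha=0.5):
    """w: [out_features, in_features]; act_samples: [n_tokens, in_features]."""
    # Per-input-channel activation magnitude: channels with larger activations get a
    # larger scale s, so their weights are quantized more precisely (smaller relative error).
    s = np.mean(np.abs(act_samples), axis=0) ** alpha
    s = np.maximum(s / s.mean(), 1e-4)
    wq = group_quantize(w * s, bits=bits)  # quantize the scaled weights
    return wq / s                          # fold 1/s back into the effective weight

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 256)).astype(np.float32)
X = rng.standard_normal((512, 256)).astype(np.float32) * np.linspace(0.1, 3.0, 256)
err_plain = np.abs(X @ group_quantize(W).T - X @ W.T).mean()
err_awq   = np.abs(X @ awq_like_quantize(W, X).T - X @ W.T).mean()
print(f"mean |output error|  plain RTN: {err_plain:.4f}   activation-aware: {err_awq:.4f}")
```

The toy comparison at the end only illustrates the intuition (quantization error on channels with large activations dominates the output error, so protecting them helps); a real integration would operate on llama.cpp's quantized tensor formats rather than float arrays.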