AWQ: Activation-aware Weight Quantization??? #1685

Closed
UnsGentoals opened this issue Jun 3, 2023 · 5 comments

@UnsGentoals

https://github.com/mit-han-lab/llm-awq

AWQ: Activation-aware Weight Quantization sounds interesting 🧐🧐🧐

@shouyiwang

Are you interested in implementing this new algorithm in llama.cpp? The 3-bit performance looks amazing.

@UnsGentoals
Author

Sorry, I'm not a pro developer 😔
Just a user with a tech background who cares about open LLM news. ☺️

@qwopqwop200

If my understanding is correct, GPTQ and AWQ use very similar storage formats, and the two could share a single format.
Additionally, GGML's current support for the GPTQ format is imperfect; improving it could save roughly another 0.5 bits per weight.
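
For reference, here is a minimal sketch of the activation-aware scaling idea behind AWQ. This is an illustrative NumPy toy, not the mit-han-lab/llm-awq implementation; the scale exponent `alpha`, the 3-bit setting, and the per-row quantization step are assumptions made for the example.

```python
# Sketch of AWQ-style activation-aware quantization (illustrative only;
# the real llm-awq code searches for scales and uses grouped quantization).
import numpy as np

def awq_style_quantize(W, X, bits=3, alpha=0.5):
    """Quantize weight matrix W (out, in) using per-input-channel scales
    derived from activation magnitudes X (tokens, in)."""
    # Channels with large activations get larger scales, so their weights
    # lose less precision when rounded to the low-bit grid.
    act_scale = np.abs(X).mean(axis=0) ** alpha          # shape (in,)
    act_scale = np.maximum(act_scale, 1e-8)

    W_scaled = W * act_scale                             # scale important channels up
    qmax = 2 ** (bits - 1) - 1
    step = np.abs(W_scaled).max(axis=1, keepdims=True) / qmax
    step = np.maximum(step, 1e-8)
    Q = np.clip(np.round(W_scaled / step), -qmax - 1, qmax)

    # Dequantize and fold the per-channel scale back out; at inference the
    # inverse scale can instead be absorbed into the previous layer's output.
    W_hat = (Q * step) / act_scale
    return Q.astype(np.int8), W_hat

# Toy usage: channels with uneven activation magnitudes.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))
X = rng.normal(size=(128, 64)) * rng.uniform(0.1, 5.0, size=64)
_, W_hat = awq_style_quantize(W, X)
print("mean abs reconstruction error:", np.abs(W - W_hat).mean())
```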

@namtranase
Contributor

Hi everyone, I have opened a PR to add AWQ. I would really appreciate any comments to make it better, thanks!
The PR

@github-actions github-actions bot added the stale label Mar 25, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
