AWQ: Activation-aware Weight Quantization??? #1685

Closed
UnsGentoals opened this issue Jun 3, 2023 · 5 comments

@UnsGentoals

https://github.com/mit-han-lab/llm-awq

AWQ: Activation-aware Weight Quantization sounds interesting 🧐🧐🧐

@shouyiwang

Are you interested in implementing this new algorithm in llama.cpp? The 3-bit performance looks amazing.

@UnsGentoals
Author

Sorry, I'm not a pro developer 😔
Just a user with a tech background who cares about open LLM news. ☺️

@qwopqwop200

If my understanding is correct, GPTQ and AWQ use very similar storage formats, and the two could share a single format.
Additionally, GGML's current support for the GPTQ format is imperfect; improving it could save roughly another 0.5 bits per weight.
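
For reference, here is a minimal sketch of the activation-aware scaling idea behind AWQ. This is an illustrative NumPy toy, not the mit-han-lab/llm-awq implementation; the scale exponent `alpha`, the 3-bit setting, and the per-row quantization step are assumptions made for the example.

```python
# Sketch of AWQ-style activation-aware quantization (illustrative only;
# the real llm-awq code searches for scales and uses grouped quantization).
import numpy as np

def awq_style_quantize(W, X, bits=3, alpha=0.5):
    """Quantize weight matrix W (out, in) using per-input-channel scales
    derived from activation magnitudes X (tokens, in)."""
    # Channels with large activations get larger scales, so their weights
    # lose less precision when rounded to the low-bit grid.
    act_scale = np.abs(X).mean(axis=0) ** alpha          # shape (in,)
    act_scale = np.maximum(act_scale, 1e-8)

    W_scaled = W * act_scale                             # scale important channels up
    qmax = 2 ** (bits - 1) - 1
    step = np.abs(W_scaled).max(axis=1, keepdims=True) / qmax
    step = np.maximum(step, 1e-8)
    Q = np.clip(np.round(W_scaled / step), -qmax - 1, qmax)

    # Dequantize and fold the per-channel scale back out; at inference the
    # inverse scale can instead be absorbed into the previous layer's output.
    W_hat = (Q * step) / act_scale
    return Q.astype(np.int8), W_hat

# Toy usage: channels with uneven activation magnitudes.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))
X = rng.normal(size=(128, 64)) * rng.uniform(0.1, 5.0, size=64)
_, W_hat = awq_style_quantize(W, X)
print("mean abs reconstruction error:", np.abs(W - W_hat).mean())
```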

@namtranase
Contributor

Hi everyone, I have opened a PR to add AWQ. I would really appreciate any comments to make it better, thanks!
The PR

@github-actions github-actions bot added the stale label Mar 25, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
