AWQ: Patch for mismatched devices in RotaryEmbedding #1480

jambayk · 2024-11-12T03:04:43Z

Describe your changes

In transformers>4.43, which is required for models such as llama-3.1 and phi-3.5, there is an open issue with a device mismatch in the rotary embedding module: huggingface/transformers#32420.
Neither transformers nor autoawq has a fix yet, so we patch the model adapter in Olive based on the installed package versions. This unblocks quantization with newer transformers versions for llama-like models. The fix is based on casper-hansen/AutoAWQ#630.
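
For reviewers, here is a minimal sketch of the idea, not the actual Olive code. It assumes the adapter exposes a static `move_embed(model, device)` hook and that the rotary embedding lives at `model.model.rotary_emb`. `release_tuple`, `needs_rotary_patch`, `patch_move_embed`, `FakeModule` and `FakeAdapter` are hypothetical names used only for illustration, and the `4.43` threshold is read off the description above. Stand-in objects replace real torch modules so the snippet runs without transformers or autoawq installed.

```python
import functools
from types import SimpleNamespace


def release_tuple(ver, width=3):
    """Leading numeric release parts, zero-padded: '4.45.0.dev0' -> (4, 45, 0)."""
    nums = []
    for piece in ver.split("+")[0].split("."):
        if not piece.isdigit():
            break
        nums.append(int(piece))
    nums = nums[:width]
    return tuple(nums + [0] * (width - len(nums)))


def needs_rotary_patch(transformers_version, threshold="4.43"):
    # Assumption: only releases strictly newer than the threshold are affected.
    return release_tuple(transformers_version) > release_tuple(threshold)


def patch_move_embed(adapter_cls):
    """Wrap adapter_cls.move_embed so the rotary embedding follows the embeddings."""
    original = adapter_cls.move_embed
    if getattr(original, "_rotary_patched", False):
        return adapter_cls  # already patched, keep the call idempotent

    @functools.wraps(original)
    def move_embed(model, device):
        original(model, device)
        # Assumption: the rotary module is stored at model.model.rotary_emb.
        rotary = getattr(model.model, "rotary_emb", None)
        if rotary is not None:
            model.model.rotary_emb = rotary.to(device)

    move_embed._rotary_patched = True
    adapter_cls.move_embed = staticmethod(move_embed)
    return adapter_cls


class FakeModule:
    """Stand-in for a torch module; only tracks which device it is on."""

    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        return FakeModule(device)


class FakeAdapter:
    """Stand-in for a model adapter that only moves the token embeddings."""

    @staticmethod
    def move_embed(model, device):
        model.model.embed_tokens = model.model.embed_tokens.to(device)


if __name__ == "__main__":
    inner = SimpleNamespace(embed_tokens=FakeModule(), rotary_emb=FakeModule())
    model = SimpleNamespace(model=inner)
    if needs_rotary_patch("4.44.2"):
        patch_move_embed(FakeAdapter)
    FakeAdapter.move_embed(model, "cuda:0")
    print(model.model.embed_tokens.device, model.model.rotary_emb.device)
```

Wrapping the existing hook rather than replacing it keeps the adapter's own behavior intact, and the version gate means the patch is skipped on older transformers releases where it is not needed.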

AutoGPTQ has the same issue, but it is already fixed in main, so users can install from source. There is also no straightforward way to apply a similar patch to it: I tried adding model.rotary_embed to outside_layer_modules, but it fails in get_device because the module has no parameters.

Checklist before requesting a review

Add unit tests for this change.
Make sure all tests can pass.
Update documents if necessary.
Lint and apply fixes to your code by running lintrunner -a
Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

patch awq for newer models and transformers

31b8ef1

devang-ml approved these changes Nov 12, 2024

jambayk merged commit 61876e2 into main Nov 12, 2024
25 checks passed

jambayk deleted the jambayk/device-map branch November 12, 2024 07:00