Fix `FbgemmFp8Linear` not preserving tensor shape #33239

vgel · 2024-09-01T04:21:25Z

What does this PR do?

Fixes #32868: FbgemmFp8Linear (and models using it, like llama3-405b-fp8) would incorrectly flatten inputs with more than two dimensions into a 2D tensor:

>>> emd = base_model.model.embed_tokens
>>> emd.shape
torch.Size([32, 23, 16384])
>>> type(base_model.model.model.layers[0].mlp.up_proj)
torch.nn.modules.linear.Linear
>>> base_model.model.model.layers[0].mlp(emd).shape
torch.Size([32, 23, 16384])
>>> type(base_model.model.model.layers[1].mlp.up_proj)
transformers.integrations.fbgemm_fp8.FbgemmFp8Linear
>>> base_model.model.model.layers[1].mlp(emd).shape
torch.Size([736, 16384]) # <-------------------------------- wrong!!

This fixes that (and adds tests). See the linked issue for more details.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@SunMarc

ended up adding the reshape at the end, after f8f8bf16_rowwise, because adding it directly after quantize_fp8_per_row caused f8f8bf16_rowwise to drop the seq_len dimension. (i.e., (17, 23, 1024) -> (17, 1024))
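The shape bookkeeping behind this fix can be sketched in plain Python, without torch or fbgemm. This is only an illustration, not the code in `fbgemm_fp8.py`: the helper names `flatten_leading_dims` and `restore_output_shape` are hypothetical, and the sketch assumes the rowwise kernel returns a 2D result with all leading dimensions collapsed, as the shapes reported above suggest.

```python
from math import prod


def flatten_leading_dims(shape):
    """Shape a 2D-only kernel effectively works with: leading dims collapsed into one."""
    *leading, in_features = shape
    return (prod(leading), in_features)


def restore_output_shape(input_shape, out_features):
    """Shape the layer should return: input's leading dims kept, last dim replaced."""
    return (*input_shape[:-1], out_features)


# Shapes taken from the example in the PR description.
input_shape = (32, 23, 16384)
out_features = 16384

# Save the target shape up front, before anything flattens the input.
target_shape = restore_output_shape(input_shape, out_features)

# What a kernel that collapses leading dims would hand back (the buggy result).
kernel_out_shape = (flatten_leading_dims(input_shape)[0], out_features)

# The reshape at the end is valid because both shapes hold the same number of elements.
assert prod(kernel_out_shape) == prod(target_shape)
print(kernel_out_shape, "->", target_shape)
```

In a real layer the final step would be a reshape of the kernel output to the saved target shape; the sketch only checks that such a reshape is well defined.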

SunMarc

Thanks for finding the issue and fixing it @vgel ! Left a suggestion !

src/transformers/integrations/fbgemm_fp8.py

LysandreJik

Nice!

vgel · 2024-09-09T07:28:49Z

@SunMarc Pinging for addressed comment, let me know if there's anything else!

SunMarc

Small correction ! LGTM apart from that

src/transformers/integrations/fbgemm_fp8.py

HuggingFaceDocBuilderDev · 2024-09-11T11:46:28Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

* add tests for linear shape behavior
* fix linear shape behavior: ended up adding the reshape at the end, after f8f8bf16_rowwise, because adding it directly after quantize_fp8_per_row caused f8f8bf16_rowwise to drop the seq_len dimension. (i.e., (17, 23, 1024) -> (17, 1024))
* save shape up front + comment

vgel added 2 commits August 31, 2024 21:13

add tests for linear shape behavior

dcac907

fix linear shape behavior

7029f64

ended up adding the reshape at the end, after f8f8bf16_rowwise, because adding it directly after quantize_fp8_per_row caused f8f8bf16_rowwise to drop the seq_len dimension. (i.e., (17, 23, 1024) -> (17, 1024))

SunMarc approved these changes Sep 2, 2024


src/transformers/integrations/fbgemm_fp8.py (outdated, resolved)

LysandreJik approved these changes Sep 3, 2024


save shape up front + comment

c8a3a8d

SunMarc approved these changes Sep 9, 2024


src/transformers/integrations/fbgemm_fp8.py (resolved)

SunMarc merged commit e719b65 into huggingface:main Sep 11, 2024
23 checks passed
