bugfix: torch.export failure caused by `_make_causal_mask` #35291

jiwoong-choi · 2024-12-16T08:52:51Z

What does this PR do?

Fix the torch.export failure caused by AttentionMaskConverter._make_causal_mask.

Recent changes in TorchDynamo prevent in-place mutation of tensors produced by aten::_to_copy. To work around this, we clone such a tensor before applying the in-place operation masked_fill_, but only when the code is being compiled by TorchDynamo. (Relevant PyTorch issue: pytorch/pytorch#127571)
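
For illustration, the workaround described above can be sketched as follows. This is a minimal sketch, not the code from this PR: `is_compiling` and `fill_context_mask` are hypothetical names, and the helper assumes `torch.compiler.is_compiling` is available in the installed PyTorch (it falls back to `False` otherwise).

```python
import torch


def is_compiling() -> bool:
    # Hypothetical stand-in for a "is dynamo tracing this code?" check.
    # Assumes torch.compiler.is_compiling exists; older builds fall back to False.
    try:
        return bool(torch.compiler.is_compiling())
    except AttributeError:
        return False


def fill_context_mask(mask: torch.Tensor, context_mask: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: fill the positions selected by context_mask with the
    # most negative finite value of mask's dtype.
    if is_compiling():
        # Only under compilation: mutate a fresh copy instead of the
        # tensor that came out of a dtype conversion.
        mask = mask.clone()
    mask.masked_fill_(context_mask, torch.finfo(mask.dtype).min)
    return mask


# A dtype conversion like this is the kind of op that shows up as aten::_to_copy.
mask = torch.zeros(3, 3, dtype=torch.float64).to(torch.float32)
# Select entries at least two positions below the main diagonal.
context_mask = torch.tril(torch.ones(3, 3, dtype=torch.bool), diagonal=-2)
out = fill_context_mask(mask, context_mask)
```

In eager mode the clone is skipped, so the helper mutates and returns the tensor it was given, which keeps the pre-existing behavior unchanged outside of compilation.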

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

PyTorch: @gante @Rocketknight1

Rocketknight1 · 2024-12-16T14:03:07Z

This seems legit to me, but cc @ArthurZucker for core maintainer review

qubvel

Hi @jiwoong-choi, thanks for the update! I faced a similar issue with vision models and can confirm that this should fix torch.export. Alternatively, we could use a non-inplace operation for masked_fill, but your solution seems better because it does not change the original behavior.
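
A minimal sketch of the non-inplace alternative mentioned above, for comparison. The tensor shapes and variable names are made up for illustration; only `Tensor.masked_fill` itself is the real PyTorch call.

```python
import torch

# Hypothetical toy inputs, not taken from the library code.
mask = torch.zeros(2, 2)
context_mask = torch.tensor([[False, False], [True, False]])
min_value = torch.finfo(mask.dtype).min

# Out-of-place variant: returns a new tensor and leaves `mask` untouched.
filled = mask.masked_fill(context_mask, min_value)
```

The trade-off is that this always allocates a new tensor, in eager mode as well, whereas cloning only under compilation leaves the eager path as it was.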

ArthurZucker

Thanks for fixing

ArthurZucker · 2024-12-20T13:35:56Z

src/transformers/modeling_attn_mask_utils.py

+            # Recent changes in PyTorch prevent mutations on tensors converted with aten::_to_copy
+            # See https://github.com/pytorch/pytorch/issues/127571
+            if is_torchdynamo_compiling():
+                mask = mask.clone()
            mask.masked_fill_(context_mask, torch.finfo(dtype).min)


I agree with @qubvel, tho I don't mind modifying to have an inplace operation!

qubvel approved these changes Dec 16, 2024

qubvel added the torch export label (Issues and PRs related to torch.export compatibility) Dec 16, 2024

chore: use is_torchdynamo_compiling instead of `torch._dynamo.is_compiling`

303f441

qubvel requested a review from ArthurZucker December 19, 2024 10:19

ArthurZucker approved these changes Dec 20, 2024

ArthurZucker merged commit 40292aa into huggingface:main Dec 20, 2024
20 of 22 checks passed
