Wrap _prepare_4d_causal_attention_mask as a leaf function #27236
Conversation
Change LGTM - thanks for adding!
Happy to merge with @younesbelkada's approval.
cc @patrickvonplaten for reference
The changes look good! My only question is that this seems to wrap _prepare_4d_causal_attention_mask all the time, as long as fx is available. Is it possible to wrap it only when users actually perform model tracing?
No, it needs to happen at the top level of the module.
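For context, a minimal sketch of that constraint (a generic example, not this PR's code; `my_helper` is a hypothetical name): `torch.fx.wrap` inspects the calling frame and only accepts calls made at the top level of a module, so the registration cannot be deferred until a user starts tracing.

```python
import torch.fx


def my_helper(x):
    # Hypothetical helper standing in for _prepare_4d_causal_attention_mask.
    return x * 2


# OK: executed at import time, at the top level of this module.
torch.fx.wrap("my_helper")


def register_lazily():
    # In current PyTorch this raises NotImplementedError, because torch.fx.wrap
    # checks the calling frame and rejects anything that is not module-level.
    torch.fx.wrap("my_helper")
```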
LGTM. Reading the docs of the method, it just registers the function as a "leaf function" without modifying anything, so it seems safe to me:
https://pytorch.org/docs/stable/_modules/torch/fx/_symbolic_trace.html#wrap
It is. The only difference it implies is that it will not be possible to edit this function via torch.fx.
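As a quick illustration of what registering a leaf function means for tracing (a minimal sketch with a made-up `make_mask` helper standing in for `_prepare_4d_causal_attention_mask`, not code from this PR):

```python
import torch
import torch.fx


@torch.fx.wrap  # registers make_mask as a leaf; the decorator runs at module top level
def make_mask(x):
    # The body of a leaf function is not traced into, so shape-dependent logic
    # like this stays as plain Python executed at run time.
    seq_len = x.shape[-1]
    return torch.full((seq_len, seq_len), float("-inf")).triu(diagonal=1)


class Toy(torch.nn.Module):
    def forward(self, x):
        return x + make_mask(x)


traced = torch.fx.symbolic_trace(Toy())
print(traced.graph)
# make_mask appears as a single call_function node, so graph transformations see
# an opaque call rather than the individual mask-building ops.
```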
The documentation is not available anymore as the PR was closed or merged.
Actually I have a workaround that works for my purposes in …
I think it's fine to leave as-is - people might want to use torch tracing outside of …
Wrap _prepare_4d_causal_attention_mask as a leaf function (huggingface#27236)
What does this PR do?
This wraps _prepare_4d_causal_attention_mask as an FX leaf function, for reasons similar to those here. The only consequence is that it will not be possible to edit this function by using torch.fx. It is not a big deal at all, but I will remove this constraint as soon as possible.
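For reference, a rough sketch of what the module-level wrapping amounts to (an illustration under assumptions, not the literal diff of this PR: `is_torch_fx_available` and `_prepare_4d_causal_attention_mask` are existing transformers utilities, but the guard and placement shown here are a reconstruction):

```python
# Sketch only: wrapping the helper at the top of a modeling file.
from transformers.modeling_attn_mask_utils import _prepare_4d_causal_attention_mask
from transformers.utils import is_torch_fx_available

if is_torch_fx_available():
    import torch.fx

    # Must run at module top level: torch.fx.wrap records the caller's globals so
    # that symbolic tracing later treats this name as a leaf call instead of
    # tracing through the mask-building logic.
    _prepare_4d_causal_attention_mask = torch.fx.wrap(_prepare_4d_causal_attention_mask)
```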