Fix contrastive search to correctly handle input with padding #33507
Conversation
@ducviet00 thank you for identifying the underlying numerical issue, proposing a very reasonable fix, and implementing it with a test 💛
I agree with the suggested fix: by masking `cosine_matrix` with a large negative value when the corresponding tokens are masked, `degeneration_penalty` will never be related to the masked tokens and, therefore, masked tokens will not have an impact on `contrastive_score`.

I've added a few minor suggestions.
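For illustration only, here is a minimal standalone sketch of that masking idea; the function name, tensor names, and shapes are assumptions for the example, not the exact code in this PR:

```python
import torch

# context_hidden: (batch * top_k, seq_len, hidden), next_hidden: (batch * top_k, 1, hidden)
# cosine_matrix_mask: (batch * top_k, seq_len), 1 for real tokens and 0 for padding
def masked_degeneration_penalty(context_hidden, next_hidden, cosine_matrix_mask):
    norm_context = context_hidden / context_hidden.norm(dim=2, keepdim=True)
    norm_next = next_hidden / next_hidden.norm(dim=2, keepdim=True)
    # cosine similarity between each candidate token and every context token
    cosine_matrix = torch.matmul(norm_context, norm_next.transpose(1, 2)).squeeze(-1)
    # push padding positions to a large negative value so they can never win the max
    penalty = (1 - cosine_matrix_mask.to(cosine_matrix.dtype)) * torch.finfo(cosine_matrix.dtype).min
    cosine_matrix = cosine_matrix + penalty
    degeneration_penalty, _ = torch.max(cosine_matrix, dim=-1)
    return degeneration_penalty
```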
@gante Thanks for your feedback! I initially thought encoder-decoder models wouldn't be affected. I'll push an update with support for encoder-decoder models based on your suggestions soon.
@gante
It should be:
You can check the code inside
```python
# Create cosine_matrix_mask based on the attention_mask
cosine_matrix_mask = torch.ones_like(input_ids, dtype=torch.long)
if self.config.is_encoder_decoder:
    if "decoder_attention_mask" in model_kwargs and model_kwargs["decoder_attention_mask"] is not None:
        cosine_matrix_mask = model_kwargs["decoder_attention_mask"]
else:
    cosine_matrix_mask = model_kwargs["attention_mask"]
cosine_matrix_mask = cosine_matrix_mask.repeat_interleave(top_k, dim=0)
```
This code initializes a default mask and then updates it based on the model type.

- For encoder-decoder models, if `model_kwargs` contains a `decoder_attention_mask` (and it is not None), `cosine_matrix_mask` is set to this mask. If `decoder_attention_mask` is missing, it falls back to the default mask.
- For decoder-only models, `cosine_matrix_mask` is set to the `attention_mask` from `model_kwargs`.

Please let me know if there are any additional logic checks that need to be added.
Hi @gante
Thank you for iterating 💛
Very impressive change and tests! Thanks for your PR @ducviet00
@ducviet00 thank you for making
…gface#33507)

* fix: handle padding in contrastive search for decoder-only models
* fix: handle padding in contrastive search for encoder-decoder models
* tests: move padding contrastive test to test_util, add t5 test
* fix: handle if model_kwargs["decoder_attention_mask"] is None
* refactor: improve padding input contrastive search generation tests
* chore: _ranking_fast to use LongTensor for cosine_matrix_mask
What does this PR do?
This PR fixes contrastive search to correctly handle input padding in decoder-only & encoder-decoder models.
Details
I encountered the issue and discovered that the Contrastive Search implementation is highly sensitive to padded tokens.
For example, with the prompt `The whispered legends of the haunted mansion spoke`, the output from Hugging Face without padding is:

`The whispered legends of the haunted mansion spoke of the "souls of the dead" who were "falling out of the sky" and "falling into the sea."\n`

However, with padding tokens, the output becomes:

`The whispered legends of the haunted mansion spoke of the "soul of the dead" and the "blood of the dead."\nThe ghost of Dr. H. P. Lovecraft was a man`

You can check the Colab notebook that demonstrates the issue here.
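For reference, a padded-input setup like the one above can be reproduced roughly as follows; this is only a sketch, and the model, padding length, and generation settings are assumptions (`penalty_alpha` together with `top_k` is what switches `generate()` into contrastive search):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed setup: gpt2 with left padding; any decoder-only model shows the same effect.
tokenizer = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The whispered legends of the haunted mansion spoke"
# Pad the prompt to a fixed length so the input contains padding tokens.
inputs = tokenizer(prompt, return_tensors="pt", padding="max_length", max_length=32)

# penalty_alpha + top_k enable contrastive search in generate().
outputs = model.generate(**inputs, penalty_alpha=0.6, top_k=4, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```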
How did I fix the issue
The issue arises when `input_ids` contain padded tokens: the `hidden_states` of the model then include embeddings for these padded tokens if the model doesn't eliminate them during processing. The `_ranking_fast` function also calculates values based on these padded tokens, leading to incorrect outputs. This is critical because it significantly degrades the model's performance.

To fix this, I created a `cosine_matrix_mask` based on the `attention_mask` and penalized the `cosine_matrix` using this mask (ignoring padding positions) by applying large negative values. After that, I padded the `cosine_matrix_mask` with ones to match the output length.
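As a rough illustration of that last step, the mask can be extended with ones as new tokens are generated; this is a standalone sketch, and the concrete tensors and the use of `torch.nn.functional.pad` are assumptions rather than the PR's exact code:

```python
import torch
import torch.nn.functional as F

# cosine_matrix_mask starts from the attention mask: 1 for real tokens, 0 for padding.
attention_mask = torch.tensor([[0, 0, 1, 1, 1]])  # one left-padded sequence
top_k = 4
cosine_matrix_mask = attention_mask.repeat_interleave(top_k, dim=0)

# Every newly generated token is a real token, so after each generation step
# the mask is padded with ones to keep matching the output length.
for _ in range(3):  # pretend we generate 3 new tokens
    cosine_matrix_mask = F.pad(cosine_matrix_mask, (0, 1), value=1)

print(cosine_matrix_mask.shape)  # torch.Size([4, 8])
```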
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@gante @ArthurZucker @amyeroberts @Rocketknight1