Update bloom attention forward reshape following the transformer change #1360

yeonsily · 2024-09-25T21:18:23Z

What does this PR do?

The upstream _reshape() function (https://github.com/huggingface/transformers/blob/main/src/transformers/models/bloom/modeling_bloom.py#L262) now covers the transpose that optimum-habana (OH) used to apply itself, so OH no longer needs to transpose.

The current code fails in the text-generation CI.

CI run command: python3 /root/optimum-habana/examples/text-generation/run_generation.py --model_name_or_path bigscience/bloomz-7b1 --batch_size 1 --use_kv_cache --max_new_tokens 100 --use_hpu_graphs --bf16

Error message:
File "/usr/local/lib/python3.10/dist-packages/optimum/habana/transformers/models/bloom/modeling_bloom.py", line 142, in gaudi_bloom_attention_forward
query_layer = query_layer.transpose(1, 2).reshape(batch_size * self.num_heads, q_length, self.head_dim)
RuntimeError: shape '[32, 32, 128]' is invalid for input of size 438272
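
The numbers in the error message can be checked by hand. The sketch below uses only the values in the traceback (32 heads, head size 128, input size 438272) and batch size 1 from the CI command. It assumes the upstream _reshape() now returns tensors laid out as [batch, num_heads, seq_len, head_dim]. The unpacking line is a hypothetical illustration of how q_length could end up as 32; it is not copied from the OH source.

```python
from math import prod

# Values taken from the CI command and the error message.
batch_size = 1          # --batch_size 1
num_heads = 32          # first dim of '[32, 32, 128]' is batch_size * num_heads
head_dim = 128          # last dim of '[32, 32, 128]'
input_numel = 438272    # "input of size 438272"

# Recover the real sequence length from the element count.
seq_len = input_numel // (batch_size * num_heads * head_dim)

# Assumed layout after the upstream _reshape(): heads precede the sequence axis.
upstream_shape = (batch_size, num_heads, seq_len, head_dim)

# Hypothetical old-style unpacking that expects [batch, seq_len, num_heads, head_dim];
# on the new layout it picks up num_heads instead of the sequence length.
_, q_length, _, _ = upstream_shape

failing_target = (batch_size * num_heads, q_length, head_dim)
working_target = (batch_size * num_heads, seq_len, head_dim)

print("sequence length:", seq_len)
print("failing target:", failing_target, "fits:", prod(failing_target) == input_numel)
print("working target:", working_target, "fits:", prod(working_target) == input_numel)
```

Under these assumptions the failing target matches the shape reported in the RuntimeError, while a target built from the real sequence length holds exactly the 438272 elements of the input.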

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Update bloom attention forward reshape following the transformer change

13f0aed

yeonsily requested review from regisss, libinta, jiminha and a user September 25, 2024 21:18

regisss approved these changes Sep 26, 2024


regisss merged commit 1abd6ee into transformers_future Sep 26, 2024
1 check passed

regisss deleted the bloom_fix branch September 26, 2024 07:56
