A generation issue: when ignore_eos=False and the model's pad_token==eos_token (like Llama3), generated results in the same batch are erased #1539

Closed
YunLiu1 wants to merge 1 commit

Conversation

YunLiu1 commented Dec 2, 2024

A generation issue: when ignore_eos=False and the model's pad_token==eos_token (like Llama3), the generated results in the same batch are erased.

What does this PR do?

When generating text with Optimum Habana, if the batch size is greater than 1, ignore_eos=False, and the model's pad_token==eos_token (like Llama-3.1-8B), responses can be erased.
Here is an example: I submit 2 prompts ("Hello world,", "How are you?") with batch_size=2, and the shorter one is padded on the left:
[screenshot: the tokenized batch, with the shorter prompt left-padded using pad_token==eos_token]
After generation, the first pad_token is recognized as an eos_token, and the response is erased.

Fixes

I modified the post-processing to skip the left pad_tokens and erase only the tokens that come after the real eos_token.
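
For illustration, here is a minimal sketch of the idea in plain Python on lists of token ids. It is not the actual Optimum Habana post-processing code; the token ids, the helper names, and the prompt-length handling are assumptions made up for this example.

```python
# Illustrative sketch only (not the actual Optimum Habana post-processing code):
# it shows why erasing everything from the first eos_token fails under left padding
# when pad_token == eos_token, and how restricting the search to the generated part
# of the sequence avoids it. Token ids and helper names are made up for this example.
EOS_ID = 128001  # stands in for the Llama 3 eos/pad token id

# Batch of 2 after left padding: the shorter prompt is padded with EOS_ID.
padded_prompt_len = 6
sequences = [
    [EOS_ID, EOS_ID, 101, 102, 103, 104, 201, 202, EOS_ID, 0, 0],  # short prompt, left-padded
    [111, 112, 113, 114, 115, 116, 301, 302, 303, EOS_ID, 0],      # full-length prompt
]

def trim_naive(seq, eos_id):
    """Erase everything from the first eos_token onward (buggy with left padding)."""
    return seq[: seq.index(eos_id)] if eos_id in seq else seq

def trim_after_prompt(seq, eos_id, prompt_len):
    """Skip the (possibly padded) prompt region and erase only the tokens that
    follow the first eos_token actually generated by the model."""
    generated = seq[prompt_len:]
    if eos_id in generated:
        generated = generated[: generated.index(eos_id)]
    return seq[:prompt_len] + generated

print(trim_naive(sequences[0], EOS_ID))
# -> []  (the left pad token is mistaken for the end of the answer, response erased)
print(trim_after_prompt(sequences[0], EOS_ID, padded_prompt_len))
# -> [EOS_ID, EOS_ID, 101, 102, 103, 104, 201, 202]  (only tokens after the real eos are dropped)
```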

Unit Tests

The changed code passed these unit tests:
eos_test_py.txt

Function Tests

It also passed these functional tests:
lm_eval mmlu_pro_business for Meta-Llama-3.1-8B-Instruct (pad_token=eos_token, bs=8):

| Tasks    | Version | Filter         | n-shot | Metric      | Value  | Stderr   |
|----------|---------|----------------|--------|-------------|--------|----------|
| business | 1       | custom-extract | 5      | exact_match | 0.4791 | ± 0.0178 |

lm_eval mmlu_pro_business for llama2-7b (pad_token=0, bs=8):

| Tasks    | Version | Filter         | n-shot | Metric      | Value  | Stderr   |
|----------|---------|----------------|--------|-------------|--------|----------|
| business | 1       | custom-extract | 5      | exact_match | 0.1888 | ± 0.0139 |

regisss (Collaborator) commented Dec 2, 2024

@YunLiu1 Can you provide an example command that reproduces this issue, please?

YunLiu1 (Author) commented Dec 3, 2024

@YunLiu1 Can you provide an example command that reproduces this issue, please?

Sure. Because ignore_eos is always True in run_generation.py, you need to change the code first:
edit examples/text-generation/run_generation.py at L512 and explicitly set "ignore_eos=False,".
Then run this command:

python3 ~/optimum-habana/examples/text-generation/run_generation.py \
    --model_name_or_path /host/mnt/disk3/hf_models/Meta-Llama-3.1-8B \
    --use_hpu_graphs --use_kv_cache --bf16 --batch_size 2 \
    --warmup 0 --n_iterations 1 \
    --prompt "Hello world," "How are you?"

There is no output for the short prompt "Hello world,"

regisss (Collaborator) commented Dec 10, 2024

@YunLiu1 When I run this command, I get:

Input/outputs:
input 1: ('Hello world,',)
output 1.1: ('Hello world, I am a new member of the forum. I am a 20 year old male from the UK. I have been diagnosed with Aspergers and ADHD. I have been diagnosed with Aspergers for 2 years now. I have been diagnosed with ADHD for 1 year now. I have been diagnosed with depression for 2 years now. I have been diagnosed with anxiety for 1 year now. I have been diagnosed with OCD for 1 year now. I have been diagnosed with social',)

input 2: ('How are you?',)
output 2.1: ('How are you? I hope you are well. I am writing to you to ask for your help. I am a student at the University of the West Indies, Mona Campus. I am currently doing a research project on the topic of the impact of the COVID-19 pandemic on the mental health of Jamaican youth. I am hoping to get your help with this project. I am asking you to complete a survey that will take about 10 minutes of your time. The survey is completely anonymous and confidential. I am',)

which looks fine. Can you try again on the latest main branch and let me know if you still see it please?

Besides, you can set ignore_eos to False from the command line using the argument --no-ignore_eos.
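
For example, the reproduction command from the earlier comment can be rerun with that flag instead of editing the script (same options as before, with only --no-ignore_eos added):

```bash
python3 ~/optimum-habana/examples/text-generation/run_generation.py \
    --model_name_or_path /host/mnt/disk3/hf_models/Meta-Llama-3.1-8B \
    --use_hpu_graphs --use_kv_cache --bf16 --batch_size 2 \
    --warmup 0 --n_iterations 1 \
    --no-ignore_eos \
    --prompt "Hello world," "How are you?"
```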

regisss (Collaborator) commented Dec 10, 2024

It's possible that #1546 and #1569 helped to fix this.

YunLiu1 (Author) commented Dec 11, 2024

@regisss
Hi, I confirmed this issue has been fixed in #1569, so this PR is no longer needed.

regisss closed this Dec 11, 2024