[Chatllama] Should I add a <end_of_text> at the end of sentence? #260

Open
bino282 opened this issue Mar 13, 2023 · 3 comments

Comments

bino282 commented Mar 13, 2023

When I train an actor model with bloom-560M, the model generates repeated text at the end. It always generates the full number of tokens allowed by max_length and never stops early. Should I add an <end_of_text> token at the end of each sentence when training?

@PierpaoloSorbellini
Collaborator

@bino282 Yes, with HF models the text is repeated. We are aware of the problem and will release a patch for it very soon. We will get back to you as soon as possible.

@PierpaoloSorbellini changed the title from "Should I add a <end_of_text> at the end of sentence?" to "[Chatllama] Should I add a <end_of_text> at the end of sentence?" on Mar 14, 2023

allaccs commented Mar 27, 2023

I am having the same problem of repeated text. What is the current workaround?

@PierpaoloSorbellini
Collaborator

Hi @bino282 @allaccs,
Yes, we see the same issue with some HF models.
So far we have tried appending an EOS token to each sequence so that the model learns where to put this token,
and we added some parameters to the generate function of the HF actor that should help remove the repetition.
The latest version is in PR #306; please refer to it and contact me again if you notice that the problem persists.
Thanks for the feedback!
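
For readers who want to try the idea before looking at the PR, here is a minimal sketch of the two steps described above: appending an EOS token to every training sequence, and passing extra arguments to the HF `generate` call. This is not the code from PR #306. The helper names `append_eos` and `build_generation_kwargs`, the `"</s>"` string, the token ids, and the numeric values are all assumptions chosen for illustration. The keyword names themselves (`max_new_tokens`, `eos_token_id`, `pad_token_id`, `repetition_penalty`, `no_repeat_ngram_size`) are standard arguments of `generate` in the `transformers` library.

```python
from typing import Dict, List


def append_eos(texts: List[str], eos_token: str) -> List[str]:
    """Append the EOS token to each training sequence that does not already end with it."""
    return [t if t.endswith(eos_token) else t + eos_token for t in texts]


def build_generation_kwargs(
    eos_token_id: int,
    pad_token_id: int,
    max_new_tokens: int = 128,
) -> Dict[str, object]:
    """Collect keyword arguments for a Hugging Face `model.generate(...)` call.

    The numeric values are illustrative assumptions, not tuned settings.
    """
    return {
        "max_new_tokens": max_new_tokens,
        "eos_token_id": eos_token_id,    # lets generation stop once EOS is produced
        "pad_token_id": pad_token_id,    # used to pad sequences that finish early
        "repetition_penalty": 1.2,       # assumed value; penalizes already-seen tokens
        "no_repeat_ngram_size": 3,       # assumed value; forbids repeating any 3-gram
    }


if __name__ == "__main__":
    eos = "</s>"  # assumption: in practice read this from tokenizer.eos_token
    print(append_eos(["Hello, how are you?", "Fine, thanks.</s>"], eos))
    # The ids below are placeholders; in practice use tokenizer.eos_token_id
    # and tokenizer.pad_token_id.
    print(build_generation_kwargs(eos_token_id=2, pad_token_id=3))
```

In a real training script the EOS string and ids would come from the tokenizer, and the dictionary would be unpacked into the call, for example `model.generate(**inputs, **kwargs)`. Whether the penalty settings are needed once the model has learned to emit EOS depends on the model and data.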
