Automatically apply chat template in non-chat scenarios #1533
Conversation
If using a chat template is supposed to be the default behavior for the `.generate()` method, it is not aligned with the HF Transformers library. We should turn it off in the tools at least (both WWB and LLM-Bench).
What about the HF e2e pipeline?
What if it's an instruction model?
Double-checked, and it seems like HF changed the behavior at some point for the text-generation pipeline. Details. But the input should be formatted appropriately to trigger chat template usage. If the user just passes string data, no chat template is applied.
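The distinction described above can be sketched as a small input-shape check. This is a hypothetical illustration of the behavior being discussed (message-formatted input triggers the template, a plain string does not), not the actual HF Transformers implementation, and the toy template below is invented for the example:

```python
# Hypothetical sketch: decide whether a chat template should apply,
# based on the input shape (list of role/content messages vs. plain string).
# Mirrors the described pipeline behavior; not the real HF source.

def is_chat_input(prompt):
    """Return True if the input looks like a list of chat messages."""
    return (
        isinstance(prompt, list)
        and all(isinstance(m, dict) and "role" in m and "content" in m for m in prompt)
    )

def format_prompt(prompt):
    """Apply a toy chat template only for message-formatted input."""
    if is_chat_input(prompt):
        # Toy template for illustration; real templates come from the
        # model's tokenizer configuration.
        return "".join(f"<|{m['role']}|>{m['content']}<|end|>" for m in prompt)
    return prompt  # plain string: passed through untouched

print(format_prompt("What is OpenVINO?"))
print(format_prompt([{"role": "user", "content": "What is OpenVINO?"}]))
```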
Do you think it's better to add an explicit flag, then? `pipe.generate(prompt, apply_chat_template=True, max_new_tokens=40)`
This option looks good to me, but for a drop-in replacement of the HF API with OV GenAI it is better to follow the HF approach with the message format. Anyway, they should have more experience and user feedback.
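The two API shapes under discussion (explicit flag vs. HF-style message format) can be contrasted with a mock pipeline. Neither signature is confirmed OpenVINO GenAI API; the class and template below are assumptions made purely for illustration:

```python
# Hypothetical mock contrasting the two proposed API shapes.
# Not the OpenVINO GenAI interface; a sketch of the discussion only.

class MockPipeline:
    def generate(self, prompt, apply_chat_template=False, max_new_tokens=40):
        # Option 1: an explicit flag forces the template on a plain string.
        if isinstance(prompt, str) and apply_chat_template:
            prompt = [{"role": "user", "content": prompt}]
        # Option 2 (HF-style): message-formatted input triggers it implicitly.
        if isinstance(prompt, list):
            prompt = "".join(f"<|{m['role']}|>{m['content']}<|end|>" for m in prompt)
        return f"generated({prompt})"

pipe = MockPipeline()
# Under this mock, the two call styles are equivalent:
print(pipe.generate("hello", apply_chat_template=True))
print(pipe.generate([{"role": "user", "content": "hello"}]))
# A bare string skips the template entirely:
print(pipe.generate("hello"))
```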
Should both ways be added - possibility to put
See #1533 Co-authored-by: Alexander Kozlov <kozzzloff@list.ru>
CVS-157276