Chat template: save and load correctly for processors #33462
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for fixing!
```python
for key, value in self.prepare_processor_dict().items():
    # chat templates are popped from dict
    if key != "chat_template":
        self.assertEqual(obj[key], value)
        self.assertEqual(getattr(processor, key, None), value)
```
Let's also assert the chat template is popped from the dict.
Suggested change:

```diff
 for key, value in self.prepare_processor_dict().items():
     # chat templates are popped from dict
-    if key != "chat_template":
-        self.assertEqual(obj[key], value)
-        self.assertEqual(getattr(processor, key, None), value)
+    self.assertFalse(key == "chat_template")
+    self.assertEqual(obj[key], value)
+    self.assertEqual(getattr(processor, key, None), value)
```
Oops, no, the test fails because the `processor_dict` for llava has a `chat_template` key, and we use it in other tests to init and save the processor, for example. This test is the same as the general one, except that chat templates cannot pass the `self.assertEqual(obj[key], value)` check.

So we just want to test all other processor kwargs except the chat template, which is tested separately. By other kwargs, I mean the ones that will be added soon.
> with the exception that chat templates cannot pass the `self.assertEqual(obj[key], value)` check

I see, I missed what was happening originally in the test. Isn't `self.prepare_processor_dict().items()` a bit redundant then, since we force `self.prepare_processor_dict()` to only have one key, `"chat_template"`, so all of this logic is skipped?
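For context, a minimal sketch of what the llava test's `prepare_processor_dict` looks like per this discussion — a single `chat_template` key; the template string itself is a placeholder, not the real one:

```python
def prepare_processor_dict(self):
    # only one key for now, so the generic kwargs loop has nothing else to check
    return {"chat_template": "{% for message in messages %}{{ message['content'] }}{% endfor %}"}
```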
Yes, the same way almost all processors skip this test. For VLMs this test will become available when we enforce the new processing logic for input expansion with image tokens. Until then, we can override it to prevent failing tests, instead of `unittest.skip`.
Hmmmm, I don't think overriding to make it look like the tests are passing is a great idea. Skipping is far better as it's easier to spot and track.

Part of the issue here is that this new behaviour still isn't being tested then, as we want to make sure that `chat_template` isn't in the `processor_dict` when saving out.
Ah, we can add one more assert in `test_chat_template_is_saved` to check the content of `processor_dict`.

OK, I'll skip it then, with a comment explaining why and noting that we need to stop skipping it at some point.
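To illustrate the two pieces agreed on here — the extra assertion in `test_chat_template_is_saved` and the skip with an explanatory comment — a rough sketch follows. The test body, the setup helper, and the overridden test's name are assumptions about the surrounding test mixin, not the merged implementation:

```python
import json
import os
import tempfile
import unittest

from ..test_processing_common import ProcessorTesterMixin  # shared test mixin (import path assumed)


class LlavaProcessorTest(ProcessorTesterMixin, unittest.TestCase):  # hypothetical subclass for illustration
    def test_chat_template_is_saved(self):
        processor = self.processor_class.from_pretrained(self.tmpdirname)  # assumed setup helper
        with tempfile.TemporaryDirectory() as save_dir:
            processor.save_pretrained(save_dir)
            with open(os.path.join(save_dir, "processor_config.json")) as f:
                processor_dict = json.load(f)
            # new assertion: the template lives only in its own file, never in the config dict
            self.assertNotIn("chat_template", processor_dict)

    @unittest.skip(
        "prepare_processor_dict() only has a chat_template key for now; "
        "stop skipping once new processor kwargs are added"
    )
    def test_processor_kwargs_in_config(self):  # hypothetical name for the overridden test
        pass
```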
Perfect :)
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Thanks for updating!
What does this PR do?
Fixes #33459. The chat template was not being saved after the last PR adding LLaVa-OneVision, because I fixed a small bug where the chat template was being saved in both places: in its own file and in the processor config.

In this PR we never serialize the chat template into the dict, so it is not saved there, and when saving we check the `self` attributes instead. That said, I am not 100% sure it is OK to not add the template to the dict. Or should we pop the template when serializing to JSON, in `to_json_string` maybe? WDYT?
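For reference, a minimal standalone sketch of the alternative floated above — popping the template during JSON serialization. The helper name below is hypothetical; in `transformers` this logic would live in `ProcessorMixin.to_json_string`:

```python
import json


def processor_dict_to_json_string(processor_dict: dict) -> str:
    """Serialize a processor config dict, dropping the chat template so it is only
    written to its own chat_template.json and never duplicated in processor_config.json."""
    processor_dict = dict(processor_dict)      # copy so the caller's dict is untouched
    processor_dict.pop("chat_template", None)  # the template is saved in a separate file
    return json.dumps(processor_dict, indent=2, sort_keys=True) + "\n"


# usage: the serialized config no longer mentions the template
print(processor_dict_to_json_string({"image_token": "<image>", "chat_template": "{{ messages }}"}))
```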