
Chat template: save and load correctly for processors #33462

Merged: 8 commits into huggingface:main, Sep 18, 2024

Conversation

zucchini-nlp (Member)

What does this PR do?

Fixes #33459. The chat template was not being saved after the last PR, which added LLaVa-OneVision; there I fixed a small bug that caused the chat template to be saved in two places: in its own file and in the processor config.

In this PR we never serialize the chat template into the dict, so it is not saved there, and when saving we check the processor's own attributes instead. Yet I am not 100% sure it is OK to leave the template out of the dict. Or should we pop the template when serializing to JSON, in to_json_string maybe? WDYT?
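
For illustration, here is a minimal sketch of the saving scheme described above. It is a toy under assumed names (the class is invented; the file names chat_template.json and processor_config.json match the ones discussed in this thread), not the actual transformers implementation:

import json
import os


class ProcessorSketch:
    # Toy stand-in for a processor; only the save logic matters here.

    def __init__(self, chat_template=None, **kwargs):
        self.chat_template = chat_template
        self.extra_attributes = kwargs

    def to_dict(self):
        # The chat template is deliberately never serialized into the
        # processor dict, so it cannot end up in processor_config.json.
        return dict(self.extra_attributes)

    def save_pretrained(self, save_directory):
        os.makedirs(save_directory, exist_ok=True)
        # When saving, check the instance attribute directly and write
        # the template to its own file.
        if self.chat_template is not None:
            with open(os.path.join(save_directory, "chat_template.json"), "w") as f:
                json.dump({"chat_template": self.chat_template}, f)
        with open(os.path.join(save_directory, "processor_config.json"), "w") as f:
            json.dump(self.to_dict(), f)

The alternative floated at the end of the comment would instead keep chat_template in the dict and pop it inside to_json_string right before writing.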

@zucchini-nlp zucchini-nlp changed the title Chat template Chat template: save and load correctly for processors Sep 13, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

amyeroberts (Collaborator) left a comment

Thanks for fixing!

Comment on lines 54 to 58
for key, value in self.prepare_processor_dict().items():
    # chat templates are popped from dict
    if key != "chat_template":
        self.assertEqual(obj[key], value)
        self.assertEqual(getattr(processor, key, None), value)

amyeroberts (Collaborator)

Let's also assert the chat template is popped from the dict.

Suggested change
-for key, value in self.prepare_processor_dict().items():
-    # chat templates are popped from dict
-    if key != "chat_template":
-        self.assertEqual(obj[key], value)
-        self.assertEqual(getattr(processor, key, None), value)
+for key, value in self.prepare_processor_dict().items():
+    # chat templates are popped from dict
+    self.assertFalse(key == "chat_template")
+    self.assertEqual(obj[key], value)
+    self.assertEqual(getattr(processor, key, None), value)

zucchini-nlp (Member, Author)

Oops, no, the test fails because the processor_dict for llava has a chat_template key, and we use it in other tests, for example to init and save the processor. This test is the same as the general one, except that chat templates cannot pass the self.assertEqual(obj[key], value) check.

So we just want to test all the other processor kwargs except the chat template, which is tested separately. By other kwargs, I mean the ones that will be added soon.

amyeroberts (Collaborator)

> with the exception that chat templates cannot pass the self.assertEqual(obj[key], value) check

I see, I missed what was happening originally in the test. Isn't self.prepare_processor_dict().items() a bit redundant then, given that we force self.prepare_processor_dict() to have only one key, "chat_template", so all of this logic is skipped?

zucchini-nlp (Member, Author)

Yes, the same way almost all processors skip this test. For VLMs this test will become usable once we enforce the new processing logic for input expansion with image tokens. Until then, we can override it to prevent failing tests, instead of unittest.skip.

amyeroberts (Collaborator)

Hmmm, I don't think overriding to make it look like the tests are passing is a great idea. Skipping is far better, as it's easier to spot and track.

Part of the issue here is that this new behaviour still isn't being tested then, as we want to make sure that chat_template isn't in the processor_dict when saving out.
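
To illustrate the trade-off (the class and test names below are assumptions, not the real test suite): a skipped test shows up in the report as skipped, with its reason, whereas an override that silently does nothing looks like a pass:

import unittest


class LlavaProcessorTestSketch(unittest.TestCase):
    # Skipping keeps the gap visible in test reports instead of
    # masking it behind a green check.
    @unittest.skip(
        "prepare_processor_dict() currently only returns chat_template, "
        "which is tested separately; unskip once new kwargs are added"
    )
    def test_processor_to_json_string(self):
        pass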

zucchini-nlp (Member, Author)

Ah, we can add one more assert in test_chat_template_is_saved to check the content of processor_dict.

OK, I'll skip it then, with a comment explaining why and noting that we need to stop skipping it at some point.
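
Something along these lines for the extra check; a sketch only, with a hypothetical helper name (chat_template.json and processor_config.json are the file names discussed in this thread):

import json
import os
import tempfile


def check_chat_template_round_trip(processor):
    # Sketch of the extra assertions discussed above.
    with tempfile.TemporaryDirectory() as tmpdir:
        processor.save_pretrained(tmpdir)
        # the template should be written to its own file...
        assert os.path.exists(os.path.join(tmpdir, "chat_template.json"))
        # ...and must not appear in the serialized processor dict
        with open(os.path.join(tmpdir, "processor_config.json")) as f:
            processor_dict = json.load(f)
        assert "chat_template" not in processor_dict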

amyeroberts (Collaborator)

Perfect :)

zucchini-nlp and others added 3 commits September 16, 2024 13:36
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@andimarafioti andimarafioti mentioned this pull request Sep 17, 2024
@amyeroberts amyeroberts self-requested a review September 17, 2024 10:53

amyeroberts (Collaborator) left a comment

Thanks for updating!

@zucchini-nlp zucchini-nlp merged commit db72894 into huggingface:main Sep 18, 2024
20 checks passed
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024
Chat template: save and load correctly for processors (#33462)

* fix

* add tests

* fix tests

* Update tests/models/llava/test_processor_llava.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix

* fix tests

* update tests

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Successfully merging this pull request may close these issues:

chat_template.json is not saved when using LlavaProcessor.save_pretrained() (#33459)