
Llava: add default chat templates #31691

Merged
34 commits merged into huggingface:main on Jul 19, 2024

Conversation

zucchini-nlp
Member

What does this PR do?

This PR adds default chat templates for Llava models, so that user-defined models on the hub do not fail when using apply_chat_template. Users will see a warning whenever the default template is used.

Chat templates have been added to all llava-hf models on the hub; this can be verified with the script below.

```python
from transformers import AutoProcessor

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What’s the content of this image?"},
            {"type": "image"},
        ],
    },
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "This picture shows a red stop sign."}],
    },
]

processor = AutoProcessor.from_pretrained("llava-hf/vip-llava-7b-hf")
convo = processor.apply_chat_template(messages)
print(convo)
# ###Human: <image>\nWhat’s the content of this image?###Assistant: This picture shows a red stop sign.
```
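
For readers unfamiliar with multimodal chat templates, here is a rough sketch of a Jinja template (written as a Python string, continuing from the snippet above) that reproduces the vip-llava style output shown. It is illustrative only; the exact template uploaded to the hub by this PR may differ.

```python
# Illustrative only: a Jinja template that produces the "###Human: ... ###Assistant: ..."
# format shown above. Not necessarily the exact template shipped by this PR.
vipllava_style_template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}###Human: {% else %}###Assistant: {% endif %}"
    # emit the image placeholder first, regardless of where it appears in 'content'
    "{% for item in message['content'] %}"
    "{% if item['type'] == 'image' %}<image>\n{% endif %}"
    "{% endfor %}"
    "{% for item in message['content'] %}"
    "{% if item['type'] == 'text' %}{{ item['text'] }}{% endif %}"
    "{% endfor %}"
    "{% endfor %}"
    "{% if add_generation_prompt %}###Assistant: {% endif %}"
)

# Passing a template explicitly should override whatever template the processor ships with.
convo = processor.apply_chat_template(messages, chat_template=vipllava_style_template)
```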

Collaborator

@amyeroberts left a comment

Thanks for adding!

Overall this looks good; some nits, and it just needs some more clarifying language.


Will create outputs like:

```
USER: <image>\nWhat is the content of this image? ASSISTANT: This picture shows a red stop sign
```
Collaborator

Why is the image token being placed before the text, when the text comes first in the content list? If this is the case, then it should be detailed in the bullet points describing the behaviour above.

Member Author

Llava models need the image to be placed before the text in all cases. I thought of writing the template so that it formats correctly even if users pass "image" in an arbitrary order.

I can make it follow the same order as the user's input, but then we need to update the docs/readme on the hub to explain how to use the template correctly. WDYT, is it better to use the user's order?


Will create outputs like:

```
USER: <image>\nWhat is the content of this image? ASSISTANT: This picture shows a red stop sign
```
Collaborator

Same question here re: the image token being placed first.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp and others added 3 commits June 28, 2024 16:41
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Contributor

@NielsRogge left a comment

Would it be possible to add a test_chat_template test to https://github.com/huggingface/transformers/blob/main/tests/models/llava/test_processor_llava.py and so on?

Just an assertion which looks like this:

```python
prompt = "What is in this image?"

manually_formatted_prompt = f"USER: <image>\n{prompt} ASSISTANT:"

messages = ...

formatted_prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

self.assertEqual(manually_formatted_prompt, formatted_prompt)
```

Because I've seen that the templates are quite tricky to get 100% right at the whitespace level.
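
Fleshing that suggestion out, here is a hedged sketch of what such a test could look like; the checkpoint, message payload, and expected string are illustrative and would need to match the template actually uploaded to the hub.

```python
import unittest

from transformers import AutoProcessor


class LlavaProcessorChatTemplateTest(unittest.TestCase):
    def test_chat_template(self):
        # illustrative checkpoint; the real test would use whatever checkpoint the test file targets
        processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")
        prompt = "What is in this image?"
        messages = [
            {
                "role": "user",
                "content": [
                    {"type": "image"},
                    {"type": "text", "text": prompt},
                ],
            }
        ]
        formatted_prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
        # expected string assumes the "USER: ... ASSISTANT:" format discussed in this PR
        self.assertEqual(formatted_prompt, f"USER: <image>\n{prompt} ASSISTANT:")
```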

zucchini-nlp and others added 4 commits June 28, 2024 17:41
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
@zucchini-nlp
Member Author

The tests will be failing until we add the templates on the hub. I will ping when we decide how to tackle the breaking change and it is ready for review.

@zucchini-nlp
Member Author

Removed the default chat templates (#31733 already removed all defaults) and added a hack to load chat templates from a separate json file called "chat_template.json". We can leave it as is for 2-3 major versions and then remove it, because I guess everyone will have a version with chat_template in the init by then.
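
For reference, a rough sketch of the kind of fallback described here; this is not the PR's actual implementation, but the file name and the "chat_template" key match the code quoted later in this review.

```python
import json
import os


def load_chat_template_fallback(model_dir: str):
    # If no chat_template is found in the processor config, look for a sibling
    # chat_template.json next to it and read the template from there.
    template_file = os.path.join(model_dir, "chat_template.json")
    if not os.path.isfile(template_file):
        return None
    with open(template_file, encoding="utf-8") as reader:
        return json.loads(reader.read())["chat_template"]
```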

Ready for review @amyeroberts :)

@zucchini-nlp zucchini-nlp requested a review from amyeroberts July 10, 2024 13:11
Collaborator

@amyeroberts left a comment

Thanks for working on adding this!

We need to be careful about how we handle backwards and future compatibility with regards to loading the chat template.

We'll also need to add general tests for the processor (perhaps in test_processor_common.py) which check loading when a chat_template.json is or isn't present.
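
Sketching what such a common test might check, assuming save_pretrained writes the template to chat_template.json as this PR describes; the helper name and assertions below are illustrative only.

```python
import os
import tempfile


def check_chat_template_round_trip(processor):
    # Illustrative helper: the template should survive a save/load cycle via
    # chat_template.json, and loading should still work when that file is absent.
    with tempfile.TemporaryDirectory() as tmpdir:
        processor.chat_template = "{{ messages }}"
        processor.save_pretrained(tmpdir)
        assert os.path.isfile(os.path.join(tmpdir, "chat_template.json"))

        reloaded = type(processor).from_pretrained(tmpdir)
        assert reloaded.chat_template == processor.chat_template

        # without the file, loading should not fail; there is simply no template
        os.remove(os.path.join(tmpdir, "chat_template.json"))
        reloaded = type(processor).from_pretrained(tmpdir)
        assert reloaded.chat_template is None
```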

Comment on lines 588 to 589

```python
# load chat template from a separate json if exists
# TODO @raushan: remove this hack after a few major releases
```
Collaborator

We can't do that, for two reasons:

  • Older versions of transformers will still break, as processors won't accept the chat template in the config. We can't assume people will have the newest versions installed.
  • People may have local copies of chat_template.json after this update. Even if we updated all public checkpoints on the hub to have the templates in the config, this would break things for anyone who is using a local checkpoint or even custom templates.

FWIW, I think it's a lot cleaner having the chat template in a separate file anyway. Templates can be verbose and can easily clutter up the config files. It's similar to keeping the vocab files separate for the tokenizers.

Member Author

Hmm, I thought we could deprecate chat_template.json once the old versions are too old to still be in anyone's env. I am okay with keeping chat_template.json, but wouldn't that be different from what we have for LLMs?

Another option I proposed was to add the template to the processor's tokenizer. If we don't want to have the template in the processor's config, this would be the better option IMO. WDYT?

Member Author

Older versions of transformers will still break as processors won't accept the chat template in the config.

Oh, also, continuing on this: we can discuss it internally on Slack later. We recently added an option for some processors to accept any kwargs and I was hoping to start integrating new kwargs for VLMs. Does this comment mean that we can't do it and will need another hack to load those kwargs?

Collaborator

Hmm, I thought we could deprecate chat_template.json once the old versions are too old to still be in anyone's env

The problem is: when would this happen? There are some people who are still installing transformers v3, so we can't assume everyone will be working from the latest versions, even a few months out. We can deprecate things within the code, which we have full control over. Unfortunately, as the configs sit outside the library, people can keep using them with any version from the one in which they were introduced up to now.

Another option I proposed was to add the template to the processor's tokenizer. If we don't want to have the template in the processor's config, this would be the better option IMO. WDYT?

Just to make sure we're talking about the same thing: this would mean adding it into the tokenizer's config for the checkpoint, e.g. the tokenizer config here for a llava checkpoint?

It's a solution, but I don't think it semantically makes sense: the tokenizer is for processing text, whereas this introduces information about images.

We recently added an option for some processors to accept any kwargs and I was hoping to start integrating new kwargs for VLMs. Does this comment mean that we can't do it and will need another hack to load those kwargs?

We definitely want this feature added for our processors, and it gives us flexibility for future compatibility. It won't however fix things for older versions. My understanding is that if the chat_template gets added into the config, then older versions would still break when trying to load the config file.

Member Author

It's a solution, but I don't think it semantically makes sense: the tokenizer is for processing text, whereas this introduces information about images.

Yes, I meant that checkpoint. Now I see why it's not in the tokenizer's config; the workaround of having a separate json for templates is indeed better then.

It won't however fix things for older versions. My understanding is that if the chat_template gets added into the config, then older versions would still break when trying to load the config file.

Yeah, this will be a problem in any case, as we won't have compatibility with older transformers versions given how breaking the new changes are. Dealing with hub-transformers compatibility is harder than it looked 😅 I guess for that VLM processor refactor feature we'll have to wait a while and roll the changes out slowly, to see users' reactions...

zucchini-nlp and others added 3 commits July 12, 2024 15:47
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@NielsRogge NielsRogge requested a review from Rocketknight1 July 12, 2024 12:10
@zucchini-nlp
Member Author

@amyeroberts I deleted the comment saying that it's a temporary hack, as we decided to leave the templates in their own json files. Ready for the final review.

@zucchini-nlp zucchini-nlp requested a review from amyeroberts July 16, 2024 05:11
Collaborator

@amyeroberts left a comment

Thanks for adding this!

Pretty much all nits, apart from one question about the case when pretrained_model_name_or_path is a file.

Comment on lines +638 to +639

```python
text = reader.read()
chat_template = json.loads(text)["chat_template"]
```
Collaborator

Is there a reason for doing this in two steps, i.e. getting the text and then calling json.loads, instead of using json.load on the reader directly in the context manager?
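
For reference, a sketch of the two patterns being compared; whether the one-step form can be used here depends on how the file was written, as discussed in the replies below.

```python
import json

# two-step, as in the quoted diff: read the raw text, then parse it
with open("chat_template.json", encoding="utf-8") as reader:
    text = reader.read()
    chat_template = json.loads(text)["chat_template"]

# one-step alternative the question refers to: parse the file object directly
with open("chat_template.json", encoding="utf-8") as reader:
    chat_template = json.load(reader)["chat_template"]
```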

Member Author

Ah, will simplify it. I was copying from the processor code.

Member Author

It can't be loaded with json.load because it was saved as a TextIOWrapper. Actually I don't know why we save it this way; I was copying from the processors. Maybe it has something to do with safe-saving 🤔

Collaborator

huh, interesting. Well, good to know :) Thanks for investigating

zucchini-nlp and others added 13 commits July 17, 2024 17:22
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@zucchini-nlp zucchini-nlp merged commit b873234 into huggingface:main Jul 19, 2024
23 checks passed
ArthurZucker pushed a commit that referenced this pull request Aug 22, 2024
ArthurZucker pushed a commit that referenced this pull request Aug 22, 2024
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024