
Fix number of patch check for different vision feature select strategy #32494

Conversation

@insujang (Contributor) commented Aug 7, 2024

What does this PR do?

Fixes #32395 (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

I think a new test for this change needs to be added (using parameterized to test different vision_feature_select_strategy and vision_feature_layer values), but I am not sure what the new test should look like. With this change there will be no errors during forward, but do we also need to compare the outputs against expected values? Which test class would be suitable for this test? @zucchini-nlp I would appreciate it if you could share your thoughts on this. Thanks!
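For background on what such a test would exercise: the two strategies differ in whether the vision backbone's CLS token is kept, so the per-crop patch count the model validates differs by one. A minimal sketch of that arithmetic (the helper name and the ViT-L/14-at-336px numbers are illustrative assumptions, not code from this PR):

```python
def expected_num_patches(image_size: int, patch_size: int, strategy: str) -> int:
    """Number of vision tokens per image crop for a ViT backbone,
    depending on how the select strategy treats the CLS token."""
    grid = (image_size // patch_size) ** 2  # plain image patches
    if strategy == "default":
        return grid        # CLS token is dropped
    if strategy == "full":
        return grid + 1    # CLS token is kept
    raise ValueError(f"unknown strategy: {strategy!r}")

# CLIP ViT-L/14 at 336 px, the backbone LLaVA-NeXT commonly uses:
print(expected_num_patches(336, 14, "default"))  # 576
print(expected_num_patches(336, 14, "full"))     # 577
```

A test covering both strategies therefore needs the expected count to follow the chosen strategy, which is what this fix makes the model's check do.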

Who can review?

@zucchini-nlp

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@zucchini-nlp (Member) left a comment

Cool, thanks for working on this!

Yes, we'd need tests and you can add them in

class LlavaNextForConditionalGenerationModelTest(ModelTesterMixin, GenerationTesterMixin, unittest.TestCase):

Basically we'll run a tiny model with dummy inputs and verify that changing select_strategy runs without errors. I don't think we need slow tests here; just make sure, after adding the test, to run

RUN_SLOW=1 pytest tests/models/llava_next/test_modeling_llava_next.py

and check that all tests pass. If something doesn't pass, just let me know; it might be a numerical-precision error from different hardware, as the PR doesn't change the model logic.

@zucchini-nlp (Member)

@insujang hey, is the PR in progress or do you need help with adding tests? I can do the test part if you are busy :)

@insujang (Contributor, Author)

Hi @zucchini-nlp, I am sorry but I am too busy to work on it :( I would sincerely appreciate it if you could do it. Thank you!

@zucchini-nlp (Member)

@insujang no worries, thanks a lot for the contribution. Will add a test soon and merge it to main 🤗

@zucchini-nlp (Member) left a comment

Perfect, LGTM! ❤️

I added a test since @insujang is busy, so this can now be reviewed.

@LysandreJik (Member) left a comment

Feel free to merge if you disagree with the comment below.

@@ -648,7 +648,7 @@ def _merge_input_ids_with_image_features(
         return final_embedding, final_attention_mask, position_ids, final_labels, final_input_ids

-    def pack_image_features(self, image_features, image_sizes, image_newline=None):
+    def pack_image_features(self, image_features, image_sizes, vision_feature_select_strategy, image_newline=None):

Should it have "default" as a default kwarg value so as to not change the signature and keep usages of pack_image_features outside of the model working?

(Not necessary if this method should be considered internal)
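The backward-compatible variant suggested above might look like this minimal sketch (the body is reduced to a pass-through purely to show the signature; it is not the actual implementation):

```python
def pack_image_features(image_features, image_sizes,
                        vision_feature_select_strategy="default",
                        image_newline=None):
    # Defaulting the new kwarg to "default" preserves the old behaviour
    # for any external caller still passing only the first two arguments.
    return vision_feature_select_strategy

# Pre-existing call sites keep working unchanged:
print(pack_image_features([], []))                                         # default
# New internal call sites pass the strategy explicitly:
print(pack_image_features([], [], vision_feature_select_strategy="full"))  # full
```

Whether the default is worth adding hinges on whether the method is public API or internal, which is exactly the question raised here.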

yes, should be internal as it doesn't make much sense outside of LLaVA

@zucchini-nlp zucchini-nlp merged commit bcf8946 into huggingface:main Sep 17, 2024
16 checks passed
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024

Fix number of patch check for different vision feature select strategy (huggingface#32494)

* Fix number of patch check for different vision feature select strategy

* add test

Co-authored-by: raushan <raushan@huggingface.co>

Successfully merging this pull request may close these issues.

Llava-Next with vision_feature_select_strategy == "full" returns error image size mismatch