Paligemma: fix generation with Gemma2 #36044
Conversation
we can just use kwargs, no?
I think making it explicit that the kwargs will be used only by the LM was better.
Fine with me! Thanks a lot!
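For illustration, a minimal sketch of the two signatures being weighed in this exchange; the names below are placeholders, not the actual PaliGemma code:

```python
# Option A: a generic catch-all ("we can just use kwargs").
# Callers can pass any argument the LM backbone understands.
def forward(self, input_ids, pixel_values, **kwargs):
    ...

# Option B: a dedicated name that documents the destination:
# these arguments go to the language model and nowhere else.
def forward(self, input_ids, pixel_values, **lm_kwargs):
    ...
```

Option B trades a slightly longer signature for making the kwargs' consumer explicit, which is the point raised above.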
Let's say that an integration test is most welcome as well!
Yeah, it was quite low priority for the patch so I decided to skip it for now :)
For transparency, this commit needs to be modified for the patch, applying only the changes for PaliGemma 2.
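For reference, a hedged sketch of what such an integration test might look like; the checkpoint id is an assumption, the blank image is a stand-in input, and no exact output string is asserted since none has been verified here:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

def test_paligemma2_generate():
    # Assumed PaliGemma 2 checkpoint (Gemma-2 text backbone).
    model_id = "google/paligemma2-3b-pt-224"
    processor = AutoProcessor.from_pretrained(model_id)
    model = PaliGemmaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )
    # A blank image keeps the test self-contained.
    image = Image.new("RGB", (224, 224), color="white")
    inputs = processor(text="caption en", images=image, return_tensors="pt")
    # The bug in #36029 surfaced during generate() with a Gemma-2
    # backbone, so completing generation at all exercises the fix.
    output = model.generate(**inputs, max_new_tokens=20, do_sample=False)
    text = processor.batch_decode(output, skip_special_tokens=True)[0]
    assert isinstance(text, str) and len(text) > 0
```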
* fix paligemma
* nit
* use `kwargs` in models that can load any LM
* update changes to only affect Paligemma
What does this PR do?
Fixes #36029 and adds tests for the model. IMO we need tests with a different LM backbone, because Gemma-2 is special.
This is a quick fix, but I think we should make this kind of fix on the LM work out-of-the-box, for example by accepting it as `kwargs`. Most LMs accept `loss_kwargs`, so we can make all multimodal models also accept kwargs that are simply passed further to the LM. WDYT?
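To make the proposal concrete, a minimal sketch of the pattern, assuming a hypothetical wrapper class and a hypothetical `merge_image_features` helper; this is not the real modeling code:

```python
import torch.nn as nn

class MultimodalWrapper(nn.Module):
    """Illustrative wrapper around an interchangeable LM backbone."""

    def __init__(self, vision_tower, language_model):
        super().__init__()
        self.vision_tower = vision_tower
        self.language_model = language_model

    def forward(self, input_ids, pixel_values, attention_mask=None, **kwargs):
        image_features = self.vision_tower(pixel_values)
        # Hypothetical helper that splices image embeddings into the
        # text embeddings at the image-token positions.
        inputs_embeds = self.merge_image_features(input_ids, image_features)
        # Anything the caller passed (e.g. loss kwargs such as
        # num_items_in_batch, or Gemma-2-specific arguments) reaches
        # the LM untouched, so the wrapper never has to enumerate
        # every backbone's signature.
        return self.language_model(
            inputs_embeds=inputs_embeds,
            attention_mask=attention_mask,
            **kwargs,
        )
```

The upside is that a new backbone argument never requires touching the wrapper; the cost is the less explicit signature discussed in the review above.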