BLIP: enable generation tests #34174
Conversation
I'd like very much to avoid this change -- extra logic for all tests to handle a niche corner case. Let's brainstorm alternatives!
@gante I tried to force... OMG, I found an option while writing this reply: Whisper and the other audio models are encoder-decoder, so we can make it work by getting the main input only in decoder-only models, just before the check happens, in the same indent block :)
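A rough sketch of that direction (not the PR's actual code; `get_prompt_length` and `inputs_dict` are hypothetical names, while `main_input_name` and `config.is_encoder_decoder` are existing transformers attributes):

```python
def get_prompt_length(model, inputs_dict):
    """Number of prompt tokens the generated sequence is expected to include."""
    if model.config.is_encoder_decoder:
        # encoder-decoder models return only newly generated tokens
        return 0
    main_input = inputs_dict[model.main_input_name]
    if model.main_input_name == "input_ids":
        # decoder-only LLMs prepend the prompt to the generated ids
        return main_input.shape[-1]
    # main input is pixel_values (BLIP-like): prompt length can't come from it
    return 0
```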
Ah, actually we might need/want to force `return_dict` to `True`, to avoid all the if/elses.
if it works, sounds good! (make sure to leave a comment)
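For the `return_dict` idea, a minimal illustration (not the actual test code; `model` and `inputs` stand in for the test's model and prepared inputs):

```python
# forcing dict outputs means downstream assertions never branch on tuples
outputs = model(**inputs, return_dict=True)
loss, logits = outputs.loss, outputs.logits  # no outputs[0]/outputs[1] if/elses
```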
@gante requesting re-review, since the …
Thanks for cleaning up
```diff
-    logits = outputs.logits if return_dict else outputs[1]
+    loss = outputs.loss
+    logits = outputs.logits
+    outputs = outputs.to_tuple() if not return_dict else outputs
```
`outputs` will have `loss` and `logits` twice there, since `to_tuple()` already includes them.
(this was probably already a bug)
yep, in general I don't like that we return it like this; it would be better to return unwrapped LM outputs. But we probably can't just delete it, for BC reasons.
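One hypothetical way to avoid the duplication, assuming `loss` and `logits` are the first two entries `to_tuple()` yields:

```python
loss = outputs.loss
logits = outputs.logits
if not return_dict:
    # to_tuple() already starts with loss and logits, so slice them off
    # before re-packing to avoid returning them twice
    outputs = (loss, logits) + outputs.to_tuple()[2:]
```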
```python
models_without_standard_cache = (
    "ctrl",
    "fsmt",
    "gptbigcode",
    "mega",
    "reformer",
    "jamba",
    "mamba",
    "xlnet",
    "zamba",
)
has_standard_cache = not any(
    model_name in config.__class__.__name__.lower() for model_name in models_without_standard_cache
)
if has_standard_cache:
```
When we overwrite the test, we don't need that, no?
Yeah, will remove the unnecessary part.
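For reference, the kind of per-model override that makes the allowlist redundant (class name and skip reason here are hypothetical):

```python
import unittest

class ReformerGenerationTest(unittest.TestCase):
    def test_past_key_values_format(self):
        # this class supplies its own handling, so the shared test's
        # models_without_standard_cache string matching is not needed
        self.skipTest("Reformer uses a non-standard cache; checked separately.")
```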
* blip2 tests
* instructblips
* copies
* fix slow tests
* fix
* uncomment this
* clean up after rebase
* should be model main input
* fix overwritten tests
* oops len should be multiple of frame number
* style
* fix some tests
What does this PR do?
Enables generation tests for BLIP models, except BLIP-1 (that turned out to be a bit harder). I changed the generation tests to use `modelTest.input_name`, as BLIP is the only model that uses pixel values as its main input, so checking the generated text's length would always fail. I tried to get rid of the custom `generate` for these models, but that opened a Pandora's box, so I think it's better not to waste time on an old model and to maintain it for a while, until the model gets deprecated. Still, I made some changes so we don't need to add an extra `bos` at the beginning, and now the decoder-based BLIP models return the full text as output. Encoder-decoder based models return only the generated text, which is consistent with what an LLM should do.
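As an illustration of the described output behavior (the checkpoint is just an example; BLIP-2 with an OPT decoder is one of the decoder-based models):

```python
import requests
from PIL import Image
from transformers import AutoProcessor, Blip2ForConditionalGeneration

processor = AutoProcessor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "Question: how many cats are there? Answer:"
inputs = processor(images=image, text=prompt, return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=20)
# decoder-based BLIP models now include the prompt in the returned ids,
# so the decoded string starts with the question text
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```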