CI: avoid human error, automatically infer generative models #33212
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Super cool! If VLMs start failing on generation tests, that's okay. I have a local draft for that, which needs to be reviewed after a few other PRs are merged.
(force-pushed from 489a2d6 to 1e8e794)
Many side-fixes later, ready for review 🙌
tests/utils/test_modeling_tf_core.py (Outdated)
@@ -67,7 +67,6 @@
 class TFCoreModelTesterMixin:
     model_tester = None
     all_model_classes = ()
-    all_generative_model_classes = ()
Why don't we need `def all_generative_model_classes` for TF?
Incorrectly moved.
(I tried to apply the same pattern to TF, but it has too many broken models, so I reverted the changes. This one was incorrectly reverted; I'm going to chase the TF diff down to 0.)
Left 3 nit questions, but yeah, thank you!
I see a few files have style changes: I always think they should be fixed in a separate PR, but you can keep them here if you wish.
@ydshieh all comments addressed 🤗
Thank you again.
Just to make sure the change in `tests/test_modeling_tf_common.py` is what you expect:
- removing `all_generative_model_classes = ()`: OK
- not adding `def all_generative_model_classes(self)`: if this is intended, OK, for you to check

Ah, I see you put it back :-), all good then
Thanks, this hopefully helps us prevent cases where generation tests are not added.
I'm not sure I understand why generation is turned off for the audio models, if it was enabled before and CI was green? Same question for Blip2.
# Doesn't run generation tests. TODO: fix generation tests for Blip2ForConditionalGeneration
all_generative_model_classes = ()
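For context, under this PR's scheme a tester opts out of generation tests by shadowing the inferred attribute with an empty tuple, exactly as the snippet above does. Below is a simplified, hypothetical sketch of how that sits inside a tester class; real testers in transformers inherit more mixins and define many more attributes, and the `tests.generation.test_utils` import assumes you run from the repository root:

```python
# Simplified, hypothetical sketch of the opt-out pattern in the diff above.
# GenerationTesterMixin lives in tests/generation/test_utils.py in the
# transformers repo, so this import assumes the repository root as CWD.
import unittest

from transformers import Blip2ForConditionalGeneration, Blip2Model

from tests.generation.test_utils import GenerationTesterMixin


class Blip2ModelTest(GenerationTesterMixin, unittest.TestCase):
    all_model_classes = (Blip2Model, Blip2ForConditionalGeneration)

    # Doesn't run generation tests. TODO: fix generation tests for
    # Blip2ForConditionalGeneration
    all_generative_model_classes = ()
```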
I think this was working on main for `Blip2ForConditionalGeneration`; are there many failures?
If we remove this line (= if we don't skip the tests), `py.test tests/models/blip_2/test_modeling_blip_2.py` rebased on main results in 21 failures :P
I've also double-checked the other models with skips on Monday. Most of them have unique model properties that do not work well with `generate`.
BTW, note that there are two testers (`Blip2ForConditionalGenerationDecoderOnlyTest` and `Blip2ModelTest`); the skips are only on the latter. I don't know why the latter needs to skip, but that's beyond the scope of this PR :P
They were also being skipped before.
On main, `Blip2ModelTest` doesn't have `all_generative_model_classes`, which means it doesn't run generate tests, and this PR doesn't skip any extra tests for this test class (I don't know why, however).
I see, thanks! I believe something similar applies to the audio models, since for me the skip comments weren't very clear.
Yeah, that's why I frequently ask for more (detailed) comments in many PRs I review 😆
@gante you can ping me to merge if the CI stays red for reasons unrelated to your PR :-)
@ydshieh yes, please merge 🙏 (was trying to be autonomous :D)
What does this PR do?
This PR:
- Replaces the manual gating of `generate` tests with an automatic definition -- if `model_class.can_generate()`, then it runs the tests from `GenerationTesterMixin`. No more human mistakes, which happened frequently in the past months 🐛 🔫
- Overrides `all_generative_model_classes` where skips are still needed, and explains why certain classes are being skipped (bad apples detected with `py.test tests/models/ -k test_greedy_generate -vv`) 💔

In a follow-up PR:
- We need to manually define the model's main input name (e.g., `pixel_values`) in the model tester. Make it use `model.main_input_name` instead, to avoid human error. Done ✅
- `generate` tests will only run if `GenerationTesterMixin` is inherited. We can easily forget to add the mixin, resulting in a false positive. Add an automated check: if any of the model classes can generate, then `GenerationTesterMixin` must be inherited in the tester (see the sketch below).
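To make the two mechanisms above concrete, here is a minimal sketch: the automatic inference of generative classes, plus the follow-up automated check. This is not the PR's actual code; it assumes only that model classes expose a `can_generate()` classmethod (as transformers' `PreTrainedModel` does) and that testers list their classes in `all_model_classes`. The `assert_mixin_present` helper name is hypothetical.

```python
# Minimal sketch, not the PR's actual implementation. Assumes model classes
# expose a `can_generate()` classmethod, as transformers' PreTrainedModel does.


class GenerationTesterMixin:
    # Testers list every class under test here.
    all_model_classes = ()

    @property
    def all_generative_model_classes(self):
        # Infer the generative subset automatically instead of maintaining a
        # hand-written list (the human error the PR title refers to). A tester
        # can still shadow this property with a plain class attribute, e.g.
        # `all_generative_model_classes = ()`, to skip generation tests.
        return tuple(c for c in self.all_model_classes if c.can_generate())


def assert_mixin_present(tester_cls):
    # Hypothetical helper for the follow-up idea: fail loudly when a tester
    # covers generative models but forgot to inherit GenerationTesterMixin.
    covers_generative = any(c.can_generate() for c in tester_cls.all_model_classes)
    if covers_generative and not issubclass(tester_cls, GenerationTesterMixin):
        raise AssertionError(
            f"{tester_cls.__name__} covers generative models but does not "
            "inherit GenerationTesterMixin"
        )
```

The shadowing works because Python's attribute lookup finds a subclass's plain class attribute before the base class's property, which is what the explicit `all_generative_model_classes = ()` skips discussed in this thread rely on.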