Mark `test_eager_matches_sdpa_generate` flaky for some models #29479
Conversation
These should not be flaky; I think it means there is an issue in the SDPA implementation vs eager! Or do they pass within 3 attempts?
Apart from qwen2 (which is bugged, and which I did not work on), the other tests are simply flaky. In my experience (for inference) this has had no impact on e.g. generation. See pytorch/pytorch#103462 for the numerical differences (although that issue concerns gradients).
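For context, here is a minimal standalone sketch (not from this PR) of the kind of numerical difference being discussed: an explicit eager attention computation compared against `torch.nn.functional.scaled_dot_product_attention`. The tensor shapes are illustrative assumptions.

```python
# Sketch: eager attention vs. SDPA can differ by float rounding error,
# which is enough to make strict equality checks fail intermittently.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = torch.randn(1, 8, 128, 64)  # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

# Eager path: softmax(QK^T / sqrt(d)) V computed explicitly.
scale = q.shape[-1] ** -0.5
attn = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
eager_out = attn @ v

# SDPA path: may dispatch to a fused kernel depending on backend/hardware.
sdpa_out = F.scaled_dot_product_attention(q, k, v)

# Typically a tiny but nonzero difference, e.g. on the order of 1e-6 in fp32.
print((eager_out - sdpa_out).abs().max())
```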
Alright, thanks then!
I will remove the one for
For qwen2 I think we should find which one of eager/sdpa is correct and maybe disable the one that is not until fixed :/
I can skip it for now.
What does this PR do?
cc @fxmarty
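As a rough illustration of what marking a test flaky looks like, here is a sketch using the `is_flaky` decorator from `transformers.testing_utils`. The class name, test body, and the exact keyword arguments passed to the decorator are assumptions for illustration, not the actual diff of this PR.

```python
# Sketch only: retry an sdpa-vs-eager equivalence test a few times so that
# occasional float-rounding mismatches do not fail CI. The decorator
# arguments below are assumptions, not taken from this PR's diff.
from transformers.testing_utils import is_flaky


class SomeModelTest:
    @is_flaky(max_attempts=3, description="tiny numerical differences between sdpa and eager attention")
    def test_eager_matches_sdpa_generate(self):
        # Compare model.generate() outputs obtained with
        # attn_implementation="eager" vs attn_implementation="sdpa".
        ...
```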