-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass missing arguments to SeamlessM4Tv2ConformerEncoderLayer.forward()
when gradient checkpointing is enabled
#31945
Pass missing arguments to SeamlessM4Tv2ConformerEncoderLayer.forward()
when gradient checkpointing is enabled
#31945
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ! Thanks for fixing this!
Could you also verify that this doesn't happen in the v1 of SeamlessM4T
?
@ylacombe turns out the same problem existed in v1 too 🤓 Just pushed a new commit fixing that |
Great! |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@anferico Thanks for adding this fix! This should be caught in tests. Could you check the test suit for this model to make sure the forward pass with gradient checkpoint is tested? |
@amyeroberts The gradient checkpointing-related tests are currently being skipped for both SeamlessM4T v1 and v2. Aside from that, this issue kinda calls for more robust tests, because the current test suite for gradient checkpointing only checks if gradients are computed for each parameter, but it could never have spotted the edge case where the |
@anferico OK, sounds like the work is a bit more involved. Could you open an issue detailing that the GC tests should be updated? This way we keep track of the work and we can merge this without being blocked |
@amyeroberts done! #32063 |
@anferico Awesome - thank you! |
What does this PR do?
Fixes a bug for which if gradient checkpointing is enabled,
SeamlessM4Tv2ConformerEncoderLayer.forward()
is called with some missing arguments.Fixes #31028
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@ylacombe