Fix Python DataParallel RNN in no_grad mode #21197
Conversation
This is blocking facebookresearch/mmf/issues/76.
@mrshenli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Retry of #21197. The previous attempt failed because it used some Python 3-only syntax. @ezyang, do we still have multi-GPU py2 tests? I am curious why the CI tests did not catch this error.

Pull Request resolved: #21262
Differential Revision: D15598941
Pulled By: mrshenli
fbshipit-source-id: 95f416589448c443685d6d236d205b011998a715
Is this fix part of 1.2?
@BramVanroy yes, this made it into 1.2.
Could you also modify the replicate part in […]? The same problem still happens when I call […].
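For context on what guarding the replicate path would look like, here is a minimal sketch of the idea, not the actual PyTorch source: `broadcast_params` is a hypothetical name, and `Broadcast` lives in the internal module `torch.nn.parallel._functions`. When grad is disabled, it skips the autograd `Broadcast` function and copies tensors with the raw `comm.broadcast_coalesced`, so the replicas are plain tensors that still allow in-place `set_()`.

```python
import torch
from torch.cuda import comm
from torch.nn.parallel._functions import Broadcast

def broadcast_params(tensors, devices):
    # Hypothetical helper sketching the fix: when grad is disabled,
    # bypass the autograd Broadcast function entirely.
    if not torch.is_grad_enabled():
        # Raw copy with no autograd graph; returns one list of
        # copies per destination device.
        return comm.broadcast_coalesced(tensors, devices)
    # Grad enabled: go through the autograd-aware Broadcast function.
    # Broadcast.apply returns a flat tuple of len(tensors) * len(devices)
    # outputs, grouped by device, so regroup them per device.
    flat = Broadcast.apply(devices, *tensors)
    n = len(tensors)
    return [list(flat[i:i + n]) for i in range(0, len(flat), n)]
```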
Fixes #21108

When grad is disabled, Python autograd function outputs are wrapped as detached aliases, which prevents calling `Tensor.set_()` on them after the recent changes to Tensors and Variables. This breaks users who call `rnn.flatten_parameters()` in the forward pass, because that function calls `set_()`. The proposed solution is to avoid using the autograd `Broadcast` when in no_grad mode.
@apsdehal
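For reference, a minimal repro of the failure mode described above (the module, layer sizes, and input shapes are arbitrary; it assumes a machine with at least two GPUs so `DataParallel` actually replicates):

```python
import torch
import torch.nn as nn

class RNNModel(nn.Module):
    def __init__(self):
        super(RNNModel, self).__init__()
        self.rnn = nn.LSTM(10, 20, batch_first=True)

    def forward(self, x):
        # flatten_parameters() calls set_() on the weight tensors;
        # under no_grad, the replicas produced by the autograd
        # Broadcast were detached aliases, so set_() raised an
        # error before this fix.
        self.rnn.flatten_parameters()
        output, _ = self.rnn(x)
        return output

model = nn.DataParallel(RNNModel().cuda())
with torch.no_grad():
    out = model(torch.randn(4, 5, 10).cuda())  # failed before this fix
print(out.shape)  # torch.Size([4, 5, 20])
```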