Set find_unused_parameters=False as the default #16611
What does this PR do?
Fixes #7330
Fixes #14486
Closes #12445
Part of #12398
The default value for DDP's `find_unused_parameters` in PyTorch is `False`. So far, Lightning has always flipped that to `True` to minimize the confusion of users who would otherwise see an error from PyTorch. This error would often occur when users dealt with multiple optimizers (GANs, etc.). After #16539, this issue is less significant. However, the drawback of setting `find_unused_parameters=True` is that you compromise on performance. PyTorch emits a warning about it, but that warning points at the PyTorch DDP API, which the Lightning user doesn't have access to, so it is not very helpful.
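To make the multi-optimizer scenario concrete, below is a minimal sketch (a hypothetical toy GAN, not code from this PR) in which the generator's parameters receive no gradient during the discriminator step. With `find_unused_parameters=False`, DDP raises the error mentioned above; with `find_unused_parameters=True`, it tolerates the unused parameters at a performance cost.

```python
import torch
import torch.nn as nn
import lightning.pytorch as pl


class TinyGAN(pl.LightningModule):
    """Hypothetical toy example: two optimizers, so not every parameter
    participates in every backward pass."""

    def __init__(self):
        super().__init__()
        self.automatic_optimization = False  # two optimizers, stepped manually
        self.generator = nn.Linear(8, 8)
        self.discriminator = nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        opt_g, opt_d = self.optimizers()

        # Discriminator step: the generator output is detached, so the
        # generator's parameters are "unused" in this backward pass.
        d_loss = self.discriminator(self.generator(batch).detach()).mean()
        opt_d.zero_grad()
        self.manual_backward(d_loss)
        opt_d.step()

        # Generator step: gradients flow through both sub-networks here.
        g_loss = -self.discriminator(self.generator(batch)).mean()
        opt_g.zero_grad()
        self.manual_backward(g_loss)
        opt_g.step()

    def configure_optimizers(self):
        return (
            torch.optim.SGD(self.generator.parameters(), lr=0.01),
            torch.optim.SGD(self.discriminator.parameters(), lr=0.01),
        )
```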
Since Trainer 2.0 favors performance and parity with PyTorch, we will switch `find_unused_parameters` to `False` to align with PyTorch's defaults. Discussion from the past: #6219

Does your PR introduce any breaking changes? If yes, please list them.
In summary, these are the possible configurations a user can run into:
Case 1: The user has unused parameters in their forward-backward. They now need to set `find_unused_parameters=True` explicitly via the DDP strategy (see the sketch after this list).
Case 2: The user has no unused parameters in their forward-backward. Previously, `find_unused_parameters=True` was the default and caused an unexpected performance hit; now they get the faster PyTorch default.

A breaking change occurs for users of Case 1 when switching to >= 2.0.
The rest of the users see one less warning and a speedup in their training.
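For Case 1, here is a minimal sketch of how a user can re-enable the old behavior after upgrading, assuming the Lightning 2.0 entry points (`DDPStrategy` and the `"ddp_find_unused_parameters_true"` strategy alias); `MyModel` is a placeholder:

```python
import lightning.pytorch as pl
from lightning.pytorch.strategies import DDPStrategy

# Option A: pass the flag through to torch's DistributedDataParallel
# via the DDP strategy object.
trainer = pl.Trainer(
    accelerator="gpu",
    devices=2,
    strategy=DDPStrategy(find_unused_parameters=True),
)

# Option B: use the registered strategy alias.
trainer = pl.Trainer(
    accelerator="gpu",
    devices=2,
    strategy="ddp_find_unused_parameters_true",
)

# trainer.fit(MyModel())  # MyModel is a placeholder LightningModule
```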
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines.
Did you have fun?
I made sure I had fun coding 🙃
cc @Borda @justusschock @awaelchli