
FedOpt with batchnorm #1851

Merged

merged 4 commits on Jul 12, 2023
Conversation

holgerroth (Collaborator)

Fixes #1718.

Description

Support using FedOpt with models containing BatchNorm layers. BatchNorm running statistics (running_mean, running_var) are registered as buffers rather than parameters, so they are not included in self.model.named_parameters() and are therefore not updated by the server optimizer. With this PR, those entries are updated using FedAvg instead.
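A minimal pure-Python sketch of the idea (function and variable names are hypothetical, not the actual NVFlare API): keys covered by the server optimizer get an SGD-style step on the aggregated model diff, while remaining keys, such as BatchNorm running statistics, are updated FedAvg-style by applying the averaged diff directly.

```python
def fedopt_update(global_model, aggregated_diff, optimizer_param_keys, lr=1.0):
    """Hypothetical sketch of a server-side FedOpt update that also handles
    non-parameter entries such as BatchNorm running statistics.

    global_model: dict mapping key -> value (scalars here, for illustration)
    aggregated_diff: dict mapping key -> averaged client diff (client - global)
    optimizer_param_keys: keys that appear in model.named_parameters()
    """
    new_model = {}
    for key, value in global_model.items():
        diff = aggregated_diff[key]
        if key in optimizer_param_keys:
            # Server optimizer step (plain SGD shown; FedOpt may use
            # momentum or an adaptive optimizer here).
            new_model[key] = value + lr * diff
        else:
            # FedAvg-style update for buffers (e.g. BN running_mean/var):
            # apply the averaged diff directly, bypassing the optimizer.
            new_model[key] = value + diff
    return new_model
```

With lr below 1, only the optimizer-managed keys are damped; the BN buffers still move all the way to the client average, which is the behavior this PR adds.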

By setting lr=1 and momentum=0 with no learning rate scheduler, FedOpt achieves the same global model performance as FedAvg using torchvision's mobilenet_v2, as expected.
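The equivalence holds because with lr=1 and momentum=0 the server SGD step, global + lr * diff, reduces to global + diff, which is exactly the plain average of the client models. A tiny numeric check (illustrative values only):

```python
# Two clients send locally trained weights; the server holds w_global.
w_global = 2.0
client_weights = [2.6, 3.0]

# FedAvg: new global weight is the (unweighted) mean of client weights.
fedavg = sum(client_weights) / len(client_weights)

# FedOpt with SGD(lr=1, momentum=0): step along the averaged model diff.
avg_diff = sum(w - w_global for w in client_weights) / len(client_weights)
fedopt = w_global + 1.0 * avg_diff

assert abs(fedavg - fedopt) < 1e-9  # identical updates
```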

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Quick tests passed locally by running ./runtest.sh.
  • In-line docstrings updated.
  • Documentation updated.

@holgerroth force-pushed the fedopt_with_batchnorm branch from 04edcaf to 4465fde on July 12, 2023 02:18
@holgerroth (Collaborator, Author)

/build

@YuanTingHsieh (Collaborator)

@holgerroth, thank you for the fix.

One thing here: since your implementation now differs from the original paper (because of this workaround to support batch norm), could we add this information to the class's docstring to make it clear?

Thanks!

@holgerroth (Collaborator, Author)


Good point. Added a description of the new behavior.
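The added note might read along these lines (class name and wording are hypothetical; see the PR diff for the actual text):

```python
class FedOptShareableGenerator:  # hypothetical name, for illustration only
    """Server-side FedOpt model update.

    Note: this implementation deviates from the original FedOpt paper
    (Reddi et al., "Adaptive Federated Optimization") in that model entries
    not returned by model.named_parameters(), such as BatchNorm running
    statistics, are not updated by the server optimizer but via FedAvg
    instead.
    """
```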



@holgerroth holgerroth enabled auto-merge (squash) July 12, 2023 21:59

@guopengf (Contributor) left a comment

LGTM

@holgerroth holgerroth merged commit 75536cd into NVIDIA:dev Jul 12, 2023
@holgerroth holgerroth deleted the fedopt_with_batchnorm branch July 12, 2023 22:17
holgerroth added a commit to holgerroth/NVFlare that referenced this pull request Dec 4, 2023
* support FedOpt with batch norm layers

* support filtered model diff
Successfully merging this pull request may close these issues.

[BUG] FedOpt algorithm not working as expected in cifar10 example