Update mosaic_fsdp_utils.py #3185
Porting mosaicml/llm-foundry#1104 to composer.
LGTM! Assuming you tested?
Not yet.
(I set really short warmups, which explains some of the "loss spikes" that come back down.) tl;dr: LGTM.
This reverts commit 2a262b4.
What does this PR do?
Updates how a new process group is initialized. This approach was broken in some torch 1.XX release, but it has been working well in torch 2+.
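The diff itself is not shown in this conversation, so the following is only a rough sketch of the bookkeeping that process group initialization usually involves, not the code from this PR. It assumes contiguous, equally sized groups; `enumerate_subgroups` and `find_own_group` are hypothetical helper names. The sketch runs without torch; the comment marks where `torch.distributed.new_group` would be called in a real job.

```python
from typing import List


def enumerate_subgroups(world_size: int, group_size: int) -> List[List[int]]:
    """Split ranks 0..world_size-1 into contiguous groups of group_size.

    Hypothetical helper: the contiguous layout is an assumption made for
    illustration, not something taken from the PR.
    """
    if group_size <= 0 or world_size % group_size != 0:
        raise ValueError("world_size must be a multiple of a positive group_size")
    return [
        list(range(start, start + group_size))
        for start in range(0, world_size, group_size)
    ]


def find_own_group(rank: int, groups: List[List[int]]) -> int:
    """Return the index of the group that contains `rank`."""
    # In a real distributed job, every rank would loop over *all* groups and
    # call torch.distributed.new_group(ranks=ranks) for each one, in the same
    # order on every rank, keeping only the handle for its own group.
    for index, ranks in enumerate(groups):
        if rank in ranks:
            return index
    raise ValueError(f"rank {rank} is not in any group")


if __name__ == "__main__":
    groups = enumerate_subgroups(world_size=8, group_size=4)
    print(groups)
    print(find_own_group(5, groups))
```

The point of enumerating every group on every rank is that group creation is a collective operation: ranks that are not members of a given group still have to take part in creating it.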
What issue(s) does this change relate to?
Before submitting
Did you run pre-commit on your change? (see the pre-commit section of prerequisites)