[Train] Support FSDP Strategy for LightningTrainer #34148
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
Thanks! Left some questions.
Any plans to add an example, such as porting over this GPT fine-tuning example that uses PTL + FSDP (https://github.com/SeanNaren/minGPT/tree/fairscale) and showing how to run it on multiple nodes? This would be a great flagship example for this integration, and it aligns well with our LLM initiative.
Good idea. What do you think @matthewdeng ?
Yeah, we should add an example for 2.5 😃
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com> Signed-off-by: elliottower <elliot@elliottower.com>
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com> Signed-off-by: Jack He <jackhe2345@gmail.com>
Why are these changes needed?
Previously, LightningTrainer supported only the DDP strategy for distributed training. This PR adds support for the FSDP strategy.
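For context, here is a rough back-of-the-envelope sketch of why the FSDP strategy is useful next to DDP. It is not code from this PR. The helper `model_state_bytes_per_rank` is hypothetical, and the 16 bytes per parameter figure is an assumption (fp32 weights, fp32 gradients, and two fp32 Adam states). The estimate ignores activations and the temporary unsharded parameters that FSDP gathers during forward and backward.

```python
def model_state_bytes_per_rank(num_params, world_size, strategy, bytes_per_param=16):
    """Estimate model-state memory held by each rank (hypothetical helper).

    Assumption: 16 bytes per parameter = 4 (fp32 weight) + 4 (fp32 gradient)
    + 8 (two fp32 Adam states). Activations and transient buffers are ignored.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    total = num_params * bytes_per_param
    if strategy == "ddp":
        # DDP replicates the full model state on every rank.
        return total
    if strategy == "fsdp":
        # Fully sharded: each rank keeps roughly 1/world_size of the state.
        # Ceiling division so an uneven split is not under-counted.
        return -(-total // world_size)
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    n = 1_000_000_000  # a 1B-parameter model, chosen only for illustration
    for name in ("ddp", "fsdp"):
        gb = model_state_bytes_per_rank(n, world_size=8, strategy=name) / 1e9
        print(f"{name}: ~{gb:.0f} GB of model state per rank")
```

Under these assumptions, sharding lets a model whose state would not fit on one device under DDP be trained across several devices.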
Related issue number
Checks
… git commit -s) in this PR.
… scripts/format.sh to lint the changes in this PR.
… method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.