[docs] FSDP-QLoRA #1148

Merged (2 commits) on Apr 5, 2024

stevhliu
Contributor

This PR describes how bnb supports FSDP-QLoRA (mainly through the selectable quantization storage parameter) and provides code examples for setting up training with Transformers/PEFT/TRL. The docs are fairly lightweight since the topic is covered in more depth and detail in Answer.AI's technical blog post.
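
To make the "selectable quantization storage" idea concrete, here is a minimal pure-Python sketch. It is a conceptual illustration under stated assumptions, not bitsandbytes' implementation: the helper names (`pack_4bit`, `unpack_4bit`, `storage_elements`) and the `STORAGE_BITS` table are hypothetical. It shows that the packed 4-bit payload stays the same while the number of container elements depends on the storage dtype chosen to hold it.

```python
import math

# Bits per element for a few candidate storage dtypes (illustrative table,
# not taken from bitsandbytes).
STORAGE_BITS = {"uint8": 8, "bfloat16": 16, "float32": 32}


def pack_4bit(codes):
    """Pack 4-bit codes (ints in 0..15) two per byte, high nibble first."""
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("codes must fit in 4 bits")
    padded = list(codes) + [0] * (len(codes) % 2)  # pad to an even count
    return bytes((hi << 4) | lo for hi, lo in zip(padded[0::2], padded[1::2]))


def unpack_4bit(packed, count):
    """Recover the first `count` 4-bit codes from packed bytes."""
    out = []
    for b in packed:
        out.extend((b >> 4, b & 0x0F))
    return out[:count]


def storage_elements(num_weights, storage_dtype):
    """Elements of `storage_dtype` needed to hold `num_weights` 4-bit codes."""
    return math.ceil(num_weights * 4 / STORAGE_BITS[storage_dtype])
```

The round trip `unpack_4bit(pack_4bit(codes), len(codes))` returns the original codes, and `storage_elements` shows the same payload occupying fewer, wider elements as the storage dtype grows. The point of making the container dtype selectable is that the quantized weights can then share a dtype with the other parameters being sharded.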

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Titus-von-Koeller
Collaborator

Ok, just finished proof-reading. Looks super good, no corrections needed!

For a moment, I was thinking it might be good to mention FSDP's `FlatParameter`s and how they need a uniform dtype across the tensors being wrapped for sharding. The more uniform the dtypes of your weights are, the larger the groups of parameters in the module tree that you can wrap in a single `FlatParameter`, which is key to optimizing the sharding process.
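
The uniform-dtype point can be sketched with a toy model. This is only an illustration under simplifying assumptions: real FSDP wraps per module rather than by consecutive runs, and the function `flat_groups` and the parameter names and sizes below are hypothetical. It splits an ordered parameter list into runs that share a dtype, standing in for the groups that could each be flattened together.

```python
from itertools import groupby


def flat_groups(params):
    """Split an ordered list of (name, dtype, numel) into runs of equal dtype.

    Each run stands in for parameters that could share one flat buffer,
    since a flat buffer needs a single dtype.
    """
    groups = []
    for dtype, run in groupby(params, key=lambda p: p[1]):
        run = list(run)
        names = [name for name, _, _ in run]
        total = sum(numel for _, _, numel in run)
        groups.append((dtype, names, total))
    return groups


# Quantized base weights stored as uint8 next to bfloat16 adapter weights.
mixed = [
    ("q_proj.weight", "uint8", 8),
    ("lora_A.weight", "bfloat16", 4),
    ("lora_B.weight", "bfloat16", 4),
    ("k_proj.weight", "uint8", 8),
]

# The same parameters when the quantized weights use bfloat16 as storage.
uniform = [(name, "bfloat16", numel) for name, _, numel in mixed]
```

With mixed dtypes the list breaks into three separate groups, while the uniform list collapses into a single group covering all 24 elements.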

On second thought, this doesn't really add anything for the average user, and if they are interested, the references provided explain it in ample detail.

Thanks a lot for this very good work 🤗 ! Really thorough and at the right level of detail for the circumstances.

@Titus-von-Koeller Titus-von-Koeller merged commit 0c64a0d into bitsandbytes-foundation:main Apr 5, 2024
2 checks passed
@stevhliu stevhliu deleted the fsdp-qlora branch April 8, 2024 17:37