FEAT / Optim: Add GaLore optimizer #29588
Merged: younesbelkada merged 44 commits into huggingface:main from younesbelkada:add-galore-optimizer on Mar 19, 2024
Commits
b31ce79 add galore v1 (younesbelkada)
58169f1 add import (younesbelkada)
9032635 add tests and doc (younesbelkada)
136f104 fix doctest (younesbelkada)
a5483b3 forward contrib credits from discussions
887d3ad forward contrib credits from discussions
d6f119f Apply suggestions from code review (younesbelkada)
3fae229 Merge remote-tracking branch 'upstream/main' into HEAD (younesbelkada)
c8c50f8 fix failing tests' (younesbelkada)
2bdda68 Merge remote-tracking branch 'upstream/main' into add-galore-optimizer (younesbelkada)
630bd13 switch to `optim_target_modules` and clarify docs (younesbelkada)
a871b75 more clarification (younesbelkada)
cb6cd7e Merge remote-tracking branch 'upstream/main' into add-galore-optimizer (younesbelkada)
51b7b29 enhance lookup logic (younesbelkada)
3da3b90 update a test to add peak memory (younesbelkada)
9115c94 add regex, all-linear and single string support (younesbelkada)
0b4ba83 add layer-wise optimization through DummyOptimizers and LRSchedulers (younesbelkada)
3e5930e forward contrib credits from discussions and original idea (hiyouga)
a16d3a8 add a section about DDP not supported in layerwise (younesbelkada)
29e7e94 Update src/transformers/trainer.py (younesbelkada)
18ea144 fix self (younesbelkada)
7800bf1 check only if layer_wise (younesbelkada)
e022bdd Update src/transformers/training_args.py (younesbelkada)
830c68d oops (younesbelkada)
b640e98 make use of intervals (younesbelkada)
14a89b2 clarify comment (younesbelkada)
6f7102d add matching tests (younesbelkada)
c11cb63 GaLoRe -> GaLore (younesbelkada)
3678201 move to `get_scheduler` (younesbelkada)
fdc4b2a add note on docs (younesbelkada)
e7ce9b7 add a warning (younesbelkada)
91d6436 adapt a bit the docs (younesbelkada)
b9e338a update docstring (younesbelkada)
6ff3762 support original API (younesbelkada)
0d0440a Update docs/source/en/trainer.md (younesbelkada)
832f2be slightly refactor (younesbelkada)
898a3c5 Update docs/source/en/trainer.md (younesbelkada)
ed3ad4a Update src/transformers/training_args.py (younesbelkada)
57e7096 fix args parsing and add tests (younesbelkada)
64ccfa6 remove warning for regex (younesbelkada)
4413f07 Merge remote-tracking branch 'upstream/main' into add-galore-optimizer (younesbelkada)
73dcabb fix type hint (younesbelkada)
1987b7a add note about extra args (younesbelkada)
db2bf21 make `is_regex` return optional (younesbelkada)
GaLore has released an official package:
pip install galore-torch
https://github.com/jiaweizzhao/GaLore?tab=readme-ov-file#install-galore-optimizer
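Since the optimizer now comes from an installable package, the integration could guard on it before use. A minimal sketch of such a check, using only the standard library: the helper names `is_package_available` and `require_galore_torch` are hypothetical and not taken from this PR, and the sketch assumes the package installed by `pip install galore-torch` is imported as `galore_torch`.

```python
import importlib.util


def is_package_available(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    # find_spec returns None for a missing top-level module instead of raising.
    return importlib.util.find_spec(module_name) is not None


def require_galore_torch() -> None:
    """Raise a helpful ImportError when the GaLore package is missing."""
    # Assumption: the `galore-torch` distribution exposes the `galore_torch` module.
    if not is_package_available("galore_torch"):
        raise ImportError(
            "GaLore optimizers need the `galore_torch` module; "
            "install it with `pip install galore-torch`."
        )
```

Calling `require_galore_torch()` before constructing the optimizer would turn a missing dependency into an error message that points at the install command above.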