
[ADD] Class for treating flattened embedding weights with a pre-conditioner #36

Merged: 4 commits merged into main from flatten-embedding-weights on Nov 4, 2024

Conversation

f-dangel (Owner):

As requested by @yorkerlin, this PR adds functionality to treat the flattened weight matrices of embedding layers.
With this option, the 2d weight W is reshaped into the 1d vector vec(W), so the pre-conditioner consists of a single Kronecker factor (usually equipped with diagonal structure). This makes it possible to train embedding-layer weights with inverse- and root-free RMSProp.
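
For illustration, here is a minimal sketch of the idea in plain PyTorch. The dummy loss, the hyperparameters `beta2`, `eps`, and `lr`, and the schematic root-free update shown here are illustrative assumptions, not this library's actual implementation:

```python
import torch

num_embeddings, embedding_dim = 1000, 64
W = torch.randn(num_embeddings, embedding_dim, requires_grad=True)

# Dummy loss to obtain a gradient for W.
loss = (W**2).sum()
(grad,) = torch.autograd.grad(loss, W)

# Flatten: vec(W) has shape (num_embeddings * embedding_dim,), so a single
# Kronecker factor covers the whole parameter instead of one factor per axis.
grad_vec = grad.flatten()

# Diagonal pre-conditioner (RMSProp-style second-moment accumulator).
beta2, eps, lr = 0.999, 1e-8, 0.01  # illustrative values
second_moment = torch.zeros_like(grad_vec)
second_moment.mul_(beta2).addcmul_(grad_vec, grad_vec, value=1 - beta2)

# Root-free flavor: divide by the accumulator itself rather than its square
# root (schematic only; the library's actual update rule differs in detail).
update = (grad_vec / (second_moment + eps)).view_as(W)
with torch.no_grad():
    W -= lr * update
```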

f-dangel requested a review from runame on October 24, 2024.
@runame (Collaborator) left a comment:

LGTM!

It would be good to add a unit test for FlattenEmbedding.
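
For illustration, such a test might check that flattening round-trips the weight without changing its entries. This is a hypothetical sketch; the tests added in this PR may exercise the actual FlattenEmbedding API instead:

```python
import torch


def test_flatten_embedding_round_trip():
    """Flattening a 2d embedding weight and reshaping it back is lossless."""
    W = torch.randn(10, 4)
    vec_W = W.flatten()
    assert vec_W.shape == (W.numel(),)
    assert torch.equal(vec_W.view_as(W), W)
```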

f-dangel (Owner, Author) commented Nov 4, 2024:

Added tests, thanks!

@runame (Collaborator) left a comment:

Great, thanks!

f-dangel merged commit 3803ed7 into main on Nov 4, 2024; all 14 checks passed.
The flatten-embedding-weights branch was subsequently deleted.