Basic training script for LLaMA #7

Merged
merged 17 commits into from
Mar 25, 2023


awaelchli
Contributor

@awaelchli awaelchli commented Mar 24, 2023

Adds a basic training script for LLaMA 7B on the shakespeare dataset.

@awaelchli awaelchli marked this pull request as ready for review March 24, 2023 10:58
@lantiga
Collaborator

lantiga commented Mar 24, 2023

While I'm addressing Carlos' comments and making improvements, I'll also remove the GPL implementation.
Can you maybe update the training code to use the model in the nano directory?

@awaelchli
Contributor Author

@lantiga Updated! I ran the two side by side and get these loss values:

Nano:
iter 0: loss 10.3979, time: 11945.28ms
iter 1: loss 10.5749, time: 5160.31ms
iter 2: loss 8.6790, time: 4872.23ms
iter 3: loss 6.8088, time: 5174.96ms
iter 4: loss 6.8616, time: 5044.30ms


Old:
iter 0: loss 10.3839, time: 13507.87ms
iter 1: loss 10.9712, time: 5206.71ms
iter 2: loss 8.3166, time: 4857.47ms
iter 3: loss 6.7234, time: 5104.52ms
iter 4: loss 6.8219, time: 5036.32ms

There is a small difference between the two runs, and I'm not sure yet where it comes from.
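
To put a number on the gap between the two runs, the log lines above can be parsed and compared per iteration. This is only a minimal sketch: the helper names `parse_losses` and `loss_diffs` are hypothetical and not part of the training script, the logs are pasted in as strings, and the result says nothing about the cause of the difference.

```python
import re

# Log lines copied verbatim from the two runs quoted above.
NANO_LOG = """\
iter 0: loss 10.3979, time: 11945.28ms
iter 1: loss 10.5749, time: 5160.31ms
iter 2: loss 8.6790, time: 4872.23ms
iter 3: loss 6.8088, time: 5174.96ms
iter 4: loss 6.8616, time: 5044.30ms
"""

OLD_LOG = """\
iter 0: loss 10.3839, time: 13507.87ms
iter 1: loss 10.9712, time: 5206.71ms
iter 2: loss 8.3166, time: 4857.47ms
iter 3: loss 6.7234, time: 5104.52ms
iter 4: loss 6.8219, time: 5036.32ms
"""

# Matches lines of the form "iter N: loss X, time: Yms".
LINE_RE = re.compile(r"iter (\d+): loss ([\d.]+), time: ([\d.]+)ms")


def parse_losses(log: str) -> dict[int, float]:
    """Map iteration index to loss for every matching line (hypothetical helper)."""
    return {int(m.group(1)): float(m.group(2)) for m in LINE_RE.finditer(log)}


def loss_diffs(a: dict[int, float], b: dict[int, float]) -> dict[int, float]:
    """Per-iteration difference a - b over the iterations both runs share."""
    return {i: a[i] - b[i] for i in sorted(a.keys() & b.keys())}


nano = parse_losses(NANO_LOG)
old = parse_losses(OLD_LOG)
diffs = loss_diffs(nano, old)

for i, d in diffs.items():
    print(f"iter {i}: nano {nano[i]:.4f}  old {old[i]:.4f}  diff {d:+.4f}")

# Iteration with the largest absolute gap between the two runs.
worst = max(diffs, key=lambda i: abs(diffs[i]))
print(f"largest gap at iter {worst}: {diffs[worst]:+.4f}")
```

On the five iterations shown, the gap is smallest at iter 0 and largest at iter 1.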

awaelchli and others added 5 commits March 24, 2023 10:15
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Collaborator

@lantiga lantiga left a comment

Looks great! Merging

@lantiga lantiga merged commit ab974fb into main Mar 25, 2023
@lantiga lantiga deleted the train branch March 25, 2023 13:21
timothylimyl referenced this pull request in timothylimyl/lit-llama-qa May 21, 2023
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
3 participants