Basic training script for LLaMA #7
Conversation
As I'm working on addressing Carlos' comments and making improvements, I'll also remove the GPL implementation.
@lantiga Updated! I ran the two side by side and got these loss values:
Nano:
iter 0: loss 10.3979, time: 11945.28ms
iter 1: loss 10.5749, time: 5160.31ms
iter 2: loss 8.6790, time: 4872.23ms
iter 3: loss 6.8088, time: 5174.96ms
iter 4: loss 6.8616, time: 5044.30ms
Old:
iter 0: loss 10.3839, time: 13507.87ms
iter 1: loss 10.9712, time: 5206.71ms
iter 2: loss 8.3166, time: 4857.47ms
iter 3: loss 6.7234, time: 5104.52ms
iter 4: loss 6.8219, time: 5036.32ms
There is a small difference between the two runs; I'm not sure yet where it's coming from.
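One way to narrow down where such a small difference comes from is to pin all randomness and compare both implementations on identical inputs. Here is a minimal sketch of that kind of check, assuming both implementations are plain PyTorch `nn.Module`s; `TinyModel` is a hypothetical stand-in so the snippet runs on its own, not a class from this repo:

```python
import torch
import torch.nn as nn

VOCAB = 32000  # assumed vocab size, for illustration only

class TinyModel(nn.Module):
    """Hypothetical stand-in; swap in the real Nano and Old modules."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, 64)
        self.head = nn.Linear(64, VOCAB)

    def forward(self, idx):
        return self.head(self.emb(idx))

def first_loss(model_cls, seed=1337):
    torch.manual_seed(seed)  # identical init, batch, and dropout masks per run
    model = model_cls()
    x = torch.randint(0, VOCAB, (4, 128))
    y = torch.randint(0, VOCAB, (4, 128))
    logits = model(x)
    return nn.functional.cross_entropy(logits.view(-1, VOCAB), y.view(-1)).item()

# With identical seeds the two numbers should agree to float precision; a gap
# that only appears after a few iterations points at the optimizer, dropout,
# or data order rather than the forward pass itself.
print(first_loss(TinyModel), first_loss(TinyModel))
```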
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Looks great! Merging
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Adds a basic training script for LLaMA 7B on the Shakespeare dataset.
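For context, such a script typically boils down to a sample-a-batch / forward / backward / step loop. Below is a minimal, self-contained sketch of that shape; the tiny `nn.Sequential` stand-in, batch shapes, random token data, and hyperparameters are illustrative assumptions, not the PR's actual model or values:

```python
import torch
import torch.nn as nn

block_size, batch_size, vocab = 128, 4, 32000

# Stand-in for the repo's LLaMA module so the sketch runs on its own.
model = nn.Sequential(nn.Embedding(vocab, 64), nn.Linear(64, vocab))
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# The Shakespeare data would normally be pre-tokenized to a flat array of
# token ids; random ids are used here so the example is self-contained.
data = torch.randint(0, vocab, (100_000,))

def get_batch():
    # Sample random windows; targets are inputs shifted by one token.
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])
    y = torch.stack([data[i + 1 : i + 1 + block_size] for i in ix])
    return x, y

for it in range(5):
    x, y = get_batch()
    logits = model(x)
    loss = nn.functional.cross_entropy(logits.view(-1, vocab), y.view(-1))
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
    print(f"iter {it}: loss {loss.item():.4f}")
```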