Finetune lora max_seq_length error #1461

Closed
SergioG-M opened this issue Jun 5, 2024 · 4 comments
Labels
bug Something isn't working

Comments


SergioG-M commented Jun 5, 2024

I am getting an error when running litgpt finetune_lora.

At the beginning of training, max_seq_length is set to 466 because that is the longest sequence in my training set:

"The longest sequence length in the train data is 466, the model's maximum sequence length is 466 and context length is 2048"

However, when training finishes and the final validation is performed in

val_loss = validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=len(val_dataloader)))

I get an error:

"Cannot forward sequence of length 473, max seq length is only 466"

There is at least one sample in the validation set that is longer than the longest one in the training set. Does anyone know how to fix this?

This is the traceback I get

  File "/usr/local/lib/python3.10/dist-packages/litgpt/finetune/lora.py", line 215, in main
    val_loss = validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=len(val_dataloader)))
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/litgpt/finetune/lora.py", line 353, in validate
    logits = model(input_ids)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lightning/fabric/wrappers.py", line 139, in forward
    output = self._forward_module(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/litgpt/lora.py", line 527, in forward
    raise ValueError(f"Cannot forward sequence of length {T}, max seq length is only {self.max_seq_length}.")
ValueError: Cannot forward sequence of length 473, max seq length is only 466.
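
To make the failure mode concrete, here is a rough sketch of the kind of length guard that produces this message. Only the error text is taken from the traceback; the function name `check_forward_length` and the exact comparison are assumptions for illustration, not the library's actual code.

```python
def check_forward_length(seq_len: int, max_seq_length: int) -> None:
    """Raise if a batch is longer than the configured limit."""
    # Assumption: the guard is a plain comparison. Only the message
    # format below comes from the traceback in this issue.
    if seq_len > max_seq_length:
        raise ValueError(
            f"Cannot forward sequence of length {seq_len}, "
            f"max seq length is only {max_seq_length}."
        )


# The longest training sample fits exactly, so training runs fine.
check_forward_length(466, 466)

# A longer validation sample trips the guard after training has finished.
try:
    check_forward_length(473, 466)
except ValueError as err:
    message = str(err)
    print(message)
```

Because the limit is derived from the training split alone, the guard only fires once a longer validation sample reaches the model.
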
Contributor

rasbt commented Jun 5, 2024

Thanks for sharing. Yeah, this shouldn't happen, and the max sequence length calculation should happen on both the training and validation data not just the training data. Will have to look into this and update.

In the meantime, you could rerun the training with --train.max_seq_length 512 or so to make sure this doesn't happen in your case.
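
As a sketch of the idea described above (computing the longest sequence over both splits rather than the training split only), something along these lines could work. The helper `longest_seq_length` and the toy data are hypothetical and only mirror the numbers reported in this issue; this is not the actual fix.

```python
from typing import Iterable, Sequence


def longest_seq_length(*splits: Iterable[Sequence[int]]) -> int:
    """Return the length of the longest tokenized sample across all splits."""
    return max((len(sample) for split in splits for sample in split), default=0)


# Toy token-id lists whose lengths mirror the issue:
# longest training sample is 466, longest validation sample is 473.
train_data = [[0] * 120, [0] * 466]
val_data = [[0] * 473, [0] * 88]

train_only = longest_seq_length(train_data)
train_and_val = longest_seq_length(train_data, val_data)
print(train_only, train_and_val)
```

With the validation split included, the computed limit covers the 473-token sample that triggers the error.
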

rasbt added the bug label (Something isn't working) on Jun 5, 2024
Author

SergioG-M commented Jun 5, 2024

> Thanks for sharing. Yeah, this shouldn't happen, and the max sequence length calculation should happen on both the training and validation data not just the training data. Will have to look into this and update.
>
> In the meantime, you could rerun the training with --train.max_seq_length 512 or so to make sure this doesn't happen in your case.

Thanks!

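Actually, I think that setting train.max_seq_length is not enough. The problem comes from this line:

model.max_seq_length = min(longest_seq_length, train.max_seq_length or float("inf"))

Because it takes the minimum of the two values, the model's limit stays at 466 even when train.max_seq_length is larger. So I just changed that line in my case.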
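
To see why a larger train.max_seq_length does not help, the expression from that line can be evaluated on its own with the numbers from this issue. The wrapper function `effective_max_seq_length` is a hypothetical name used only for this illustration.

```python
from typing import Optional, Union


def effective_max_seq_length(
    longest_seq_length: int, train_max_seq_length: Optional[int]
) -> Union[int, float]:
    # Same expression as the quoted line, with the inputs made explicit.
    return min(longest_seq_length, train_max_seq_length or float("inf"))


limit_default = effective_max_seq_length(466, None)  # no flag given
limit_512 = effective_max_seq_length(466, 512)       # suggested workaround
limit_256 = effective_max_seq_length(466, 256)       # flag only ever lowers the limit

print(limit_default, limit_512, limit_256)

# The 473-token validation sample still exceeds the limit with the flag set to 512.
still_fails = 473 > limit_512
```

In other words, with this expression the flag can reduce the limit below the longest training sequence but cannot raise it above that value.
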

Contributor

rasbt commented Jun 5, 2024

Thanks, fixing it in #1462

Contributor

rasbt commented Jun 5, 2024

Should be fixed now.

rasbt closed this as completed Jun 5, 2024