Disable non-functioning torch.compile #17
# compilation fails as it does not support torch.complex64 for RoPE
# compile = False
Ah this sucks :((
We could explore this in the future: add a flag to our nano model that lets us switch between the complex implementation and the previous real one. We would enable the flag for comparison and for inference with the Meta checkpoints, but use the real implementation when training from scratch.
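A minimal sketch of what such a switch could look like, written with plain Python numbers so it runs without torch. The names `apply_rope`, `rope_angles`, and `use_complex` are hypothetical and not taken from this repo; the pairing of consecutive features and the base of 10000 are assumptions based on the usual RoPE formulation. It only illustrates that the complex path and the real cos/sin path compute the same rotation, so a real-valued path could in principle stand in wherever complex dtypes are a problem.

```python
import cmath
import math


def rope_angles(position, head_dim, base=10000.0):
    """Rotation angle for each consecutive feature pair at one position (assumed standard RoPE schedule)."""
    return [position * base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def apply_rope(x, position, use_complex=False):
    """Rotate consecutive feature pairs of x. `use_complex` is a hypothetical flag."""
    angles = rope_angles(position, len(x))
    out = []
    for i, theta in enumerate(angles):
        a, b = x[2 * i], x[2 * i + 1]
        if use_complex:
            # complex path: treat (a, b) as a + ib and multiply by e^{i*theta}
            z = complex(a, b) * cmath.exp(1j * theta)
            out += [z.real, z.imag]
        else:
            # real path: the same 2-D rotation written with cos/sin only
            c, s = math.cos(theta), math.sin(theta)
            out += [a * c - b * s, a * s + b * c]
    return out


x = [0.5, -1.0, 2.0, 0.25]
real_out = apply_rope(x, position=3, use_complex=False)
complex_out = apply_rope(x, position=3, use_complex=True)
# largest elementwise gap between the two paths; expected to be at floating-point noise level
max_diff = max(abs(r - c) for r, c in zip(real_out, complex_out))
```

A check like `max_diff` is the kind of comparison the flag would be used for: run both paths on the same input and confirm they agree before relying on the real one for training.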
We still want to be able to compile for training, as that's where the speedup becomes more valuable (less $$$ spent).
Maybe we should look into getting the non-complex implementation to work as expected.
That's what I'm explaining above. We would compile the non-complex version for training, but we would use the complex one for inference, where we load the Meta checkpoint.
> Maybe we should look into getting the non-complex implementation to work as expected.
I spent the whole weekend on this so yeah, it's not like we didn't try already xD
@@ -5,7 +5,7 @@
import lightning as L

@torch.inference_mode()
@torch.no_grad()