
Disable non-functioning torch.compile #17

Merged: 4 commits, Mar 27, 2023
Conversation

carmocca (Contributor)

No description provided.

@carmocca carmocca marked this pull request as ready for review March 27, 2023 13:53
Comment on lines +19 to +20
# compilation fails as it does not support torch.complex64 for RoPE
# compile = False
Contributor
Ah, this sucks :((

We could explore this in the future: add a flag to our nano model that lets us switch between the complex implementation and the previous real one. We would use the flag for comparison and for inference with the Meta checkpoints, but use the real implementation when training from scratch.
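A minimal sketch of what such a flag could look like. This is not code from the PR; the names `build_rope_cache` and `apply_rope` and the exact shapes are assumptions for illustration. The complex path multiplies interleaved feature pairs, viewed as complex numbers, by e^{i*theta} in `torch.complex64` (the dtype `torch.compile` rejected at the time); the real path writes the identical rotation with `cos`/`sin` only:

```python
import torch

def build_rope_cache(seq_len: int, n_elem: int, base: int = 10000):
    # Standard RoPE frequencies: theta_i = base^(-2i / n_elem).
    theta = 1.0 / (base ** (torch.arange(0, n_elem, 2).float() / n_elem))
    idx_theta = torch.outer(torch.arange(seq_len).float(), theta)
    # Each cache tensor has shape (seq_len, n_elem // 2).
    return torch.cos(idx_theta), torch.sin(idx_theta)

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor,
               use_complex: bool = True) -> torch.Tensor:
    """Rotate interleaved feature pairs of x, shape (seq_len, n_elem)."""
    if use_complex:
        # Complex path: (x1 + i*x2) * (cos + i*sin). This is the variant
        # that uses torch.complex64 and fails under torch.compile.
        xc = torch.view_as_complex(x.float().reshape(x.shape[0], -1, 2))
        freqs_cis = torch.complex(cos, sin)
        return torch.view_as_real(xc * freqs_cis).flatten(-2).type_as(x)
    # Real path: the same rotation spelled out with cos/sin,
    # which avoids complex dtypes and is compile-friendly.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return out.flatten(-2).type_as(x)
```

Both branches compute the same rotation, so a checkpoint produced with one should load against the other; the flag would only select which kernel runs.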

Contributor Author

We still want to be able to compile for training, as that's where the speedup becomes more valuable (less $$$ spent).

Maybe we should look into getting the non-complex implementation to work as expected.

Contributor

That's what I'm explaining above. We would compile the non-complex version for training, but we would use the other for inference where we load the meta checkpoint.

> Maybe we should look into getting the non-complex implementation to work as expected.

I spent the whole weekend on this so yeah, it's not like we didn't try already xD

@@ -5,7 +5,7 @@
import lightning as L


-@torch.inference_mode()
+@torch.no_grad()
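The hunk above replaces `@torch.inference_mode()` with `@torch.no_grad()`. The thread does not state the motivation, but a plausible reading (an assumption here) is that inference mode is the stricter of the two and has interacted poorly with compilation: its outputs are permanently barred from autograd. A small sketch of the practical difference, with illustrative function names:

```python
import torch

@torch.no_grad()
def double_no_grad(x: torch.Tensor) -> torch.Tensor:
    # Gradient tracking is disabled inside, but the output is an
    # ordinary tensor that autograd may still consume later.
    return x * 2

@torch.inference_mode()
def double_inference(x: torch.Tensor) -> torch.Tensor:
    # Stricter: the output is an "inference tensor" that can never
    # re-enter autograd, in exchange for lower overhead.
    return x * 2

x = torch.ones(2, requires_grad=True)
y = double_no_grad(x)
z = double_inference(x)
assert not y.requires_grad and not z.requires_grad
assert z.is_inference() and not y.is_inference()
```

For a generation script the numerical results are identical either way; `no_grad` simply keeps the outputs usable in more downstream contexts.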

@carmocca carmocca merged commit 587824c into main Mar 27, 2023
@carmocca carmocca deleted the carmocca/compile-disable branch March 27, 2023 14:02
gkroiz pushed a commit to gkroiz/lit-llama that referenced this pull request May 9, 2023
timothylimyl referenced this pull request in timothylimyl/lit-llama-qa May 21, 2023
@carmocca carmocca self-assigned this Nov 1, 2023