Remove model.AttnBlock and replace with attention.SpatialSelfAttention. #519
Conversation
You might want to make a separate pull request for the variable cleanup and the switch from AttnBlock to SpatialSelfAttention, since they are unrelated changes.
Good point, del cleanup done in #520.
@Any-Winter-4079 Hey, I'm doing a cleanup prompted by experimenting to get dream running on just an Intel CPU. Turns out model.AttnBlock is a duplicate of attention.SpatialSelfAttention and can just be removed. Bringing it up since you've made changes to it and may have comments/want to review.
Any code that can be deleted is a great step forward. I'll hold off on reviewing this until you've cleared draft status.
I've approved this and will merge it; just waiting on a review from someone with appropriate expertise.
Interestingly, as long as --full_precision is specified, the current development branch seems to run on CPU-only Intel hardware.
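For context, a plausible reason the flag is needed: many PyTorch CPU builds don't implement half-precision kernels for ops like matmul, so an fp16 checkpoint errors out unless the weights are kept in float32. A minimal sketch, assuming such a build (newer builds may simply run the fp16 path, only slowly):

```python
import torch

x = torch.randn(4, 4)

# float32 matmul works on any backend
y32 = x @ x

# On many CPU-only PyTorch builds, half-precision matmul is not
# implemented and raises a RuntimeError, which is why --full_precision
# (i.e. keeping the model in float32) is needed for CPU-only runs.
try:
    y16 = x.half() @ x.half()
    print("fp16 matmul supported on this build")
except RuntimeError as err:
    print(f"fp16 matmul unsupported on this CPU build: {err}")
```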
(force-pushed from 2acea38 to 42441a0)
Yes, the final fix that enabled that was #518.
Need to do a few more runs, but I'm seeing some fishy results here: the second and third iterations run about 80% slower.
Does it do the same at development head (after applying the >=8 fix locally)?
I knew this rang a bell; we've seen it before with changes to Attn. See and read down the conversation.
The version checked out from the development branch runs normally.
Looking at the code, you seem to have replaced the memory-optimised model.AttnBlock with attention.SpatialSelfAttention.
Thanks for testing. |
Remove model.AttnBlock and replace with attention.SpatialSelfAttention
model.AttnBlock and attention.SpatialSelfAttention perform the exact same computation, just in different ways. Delete the former, as model already imports from attention. Apply the same softmax dtype optimization as in invoke-ai#495. Note that, from testing, tensors in SpatialSelfAttention are significantly smaller, so they don't need the involved memory-splitting optimization used in attention.CrossAttention. Tested that max VRAM is unchanged.
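For reference, a minimal sketch of the computation both blocks share: scaled dot-product self-attention over the H*W spatial positions of a (B, C, H, W) feature map, a 1x1 output projection, and a residual add. This is an illustration, not the repository's code; the norm/to_q/to_k/to_v/proj_out names are placeholders for the block's GroupNorm and 1x1 convolutions.

```python
import torch
import torch.nn.functional as F

def spatial_self_attention(x, norm, to_q, to_k, to_v, proj_out):
    # x: (B, C, H, W). norm is a GroupNorm; to_q/to_k/to_v/proj_out are
    # 1x1 convolutions. Layer names are illustrative.
    h_ = norm(x)
    q, k, v = to_q(h_), to_k(h_), to_v(h_)
    b, c, h, w = q.shape
    q = q.reshape(b, c, h * w).permute(0, 2, 1)   # (B, HW, C) queries
    k = k.reshape(b, c, h * w)                    # (B, C, HW) keys
    v = v.reshape(b, c, h * w)                    # (B, C, HW) values
    attn = torch.bmm(q, k) * (c ** -0.5)          # (B, HW, HW) similarities
    attn = F.softmax(attn, dim=-1)                # softmax over key positions
    out = torch.bmm(v, attn.permute(0, 2, 1))     # (B, C, HW) weighted values
    out = out.reshape(b, c, h, w)
    return x + proj_out(out)                      # project and add residual
```

Whether the (B, HW, HW) matrix is built with bmm over reshaped tensors (AttnBlock-style) or with einsum/rearrange (SpatialSelfAttention-style), the result is the same, which is why only one implementation needs to be kept.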
Still seeing a big slowdown in extra iters.
[timing results: current development]
[timing results: with this PR]
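A sketch of how one might reproduce this per-iteration comparison; generate_image is a hypothetical stand-in for whatever entry point is being timed, not the repository's API.

```python
import time

def time_iterations(generate_image, prompt: str, n_iter: int = 3) -> None:
    # Time each successive sample to expose the second/third-iteration
    # slowdown discussed above. generate_image is a placeholder callable.
    for i in range(n_iter):
        start = time.perf_counter()
        generate_image(prompt)
        print(f"iter {i + 1}: {time.perf_counter() - start:.2f}s")
```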
Wow, you were quick to test after the rebase. Thank you. I wasn't even sure I was going to pursue this.
You timed it well, it gave me something to do while waiting for my supper to cook :-) I was wondering if there was any benefit from going the other way and using the AttnBlock code in SpatialSelfAttention.
Marginally faster, but it could be variance. Any idea if SpatialSelfAttention is called directly?
SpatialSelfAttention is unused for txt2img. Probably for other flows too but haven't checked. |
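One way to check this on a loaded pipeline (a sketch; the ldm.modules.attention import path follows the upstream stable-diffusion layout and may differ):

```python
from ldm.modules.attention import SpatialSelfAttention

def count_spatial_self_attention(model) -> int:
    # Count SpatialSelfAttention modules actually instantiated by a loaded
    # model; a count of 0 for the txt2img pipeline would confirm the class
    # is never exercised on that path.
    return sum(isinstance(m, SpatialSelfAttention) for m in model.modules())
```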
Okay, thanks for the reply.
model.AttnBlock and attention.SpatialSelfAttention perform the exact same computation, just in different ways, so delete the former. All SpatialSelfAttention optimizations now apply for free.
Also remove pointless del statements.
Please approve #518 first (the other way would lead to conflicts).
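For illustration (hypothetical code, not taken from the diff), the kind of del this cleanup targets is one with no allocations after it, so deleting the local cannot lower peak memory, and the name would be freed on return anyway:

```python
import torch

def proj(x: torch.Tensor) -> torch.Tensor:
    h = x * 2.0
    out = h + 1.0
    # Pointless: nothing else is allocated before the function returns,
    # so this del does not reduce peak memory usage.
    del h
    return out
```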