xformers attention #1851
Conversation
I'm having trouble finding information on this, but does this inadvertently kill Linux AMD support as a new default? I'm not certain xformers can be compiled for ROCm.
Yeah, it likely would. We could add another check to the if statement for ROCm. Not sure PyTorch has that; I'll look at it though.
Yeah, totally agree. We are working on something here for Linux tho, but no plans at the moment for Windows. cc @bottler
That's not something we are supporting, indeed.
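To sketch the ROCm check discussed above: PyTorch's ROCm builds report a HIP version where CUDA builds report a CUDA version, so the gate can be expressed as a small predicate. This is an illustrative helper, not the PR's actual code; in practice the two arguments would come from `torch.version.cuda` and `torch.version.hip`.

```python
def should_enable_xformers(cuda_version, hip_version):
    """Enable xformers only on CUDA builds; ROCm (HIP) builds have no
    xformers support, so the optimization should stay off there."""
    return cuda_version is not None and hip_version is None

# CUDA build: enable; ROCm build: keep the existing attention path.
print(should_enable_xformers("11.7", None))  # True
print(should_enable_xformers(None, "5.2"))   # False
```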
Just added a few comments to simplify the code - this looks great otherwise :)
Something I've seen in some Colabs is downloading a pre-compiled version of xformers. Is this a viable way to distribute xformers here too?
It is - we can't use those exact ones as they were built for Linux - but it still has to be built & distributed, something I have no experience in.
Has anyone tried running xFormers through hipify yet? Google gave me nothing, and I don't have CUDA set up to try myself right now.
Here are Windows xformers wheels for Python 3.9: https://github.com/neonsecret/xformers/releases/tag/v0.14 To get a wheel just do
I wonder if xformers could be combined with AITemplate #1625 for a 15% * 200% * 250% speed boost
@Thomas-MMJ I think separate wheels are needed for different GPU archs. Official builds of xformers produce separate wheels. Example: https://app.circleci.com/pipelines/github/facebookresearch/xformers/2900/workflows/5c5de2be-9557-4684-9d10-34cd3835663e I could provide the compute (build it locally) if somebody's willing to set up the workflow.
Could somebody please test this?
This exact same code, which produced broken images yesterday, now works for some reason... still no clue why it failed yesterday or why it suddenly works now.
Works for me
I can now reach 22it/s with the newest version and batch size 8 on a 3080 12GB. We just need to find a way to distribute packages to people and we can ship this to everyone.
Looks like conda will be added to their continuous integration: https://github.com/facebookresearch/xformers/pull/466/files
22it/s?!? With batch size 8?! You did not misspell, right? You didn't mean batch count 8 or 2.2it/s?
So that is without the init? I thought it was the lack of init that was the issue yesterday. (Of course that was pure speculation...)
With a 3080, they definitely meant one or the other, not both at the same time.
@SafentisFox @Thomas-MMJ @AUTOMATIC1111 |
@htkg We limited it to Ampere as my wheels only work with Ampere. Hopefully Meta will distribute wheels for Windows, and we can remove a lot (nearly all, actually) of these checks. |
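The Ampere restriction mentioned above can be sketched as a capability check. This is an illustrative helper, not the PR's actual code: Ampere GPUs report compute capability major version 8 (e.g. (8, 6) for a 3080), and in a real check the tuple would come from `torch.cuda.get_device_capability()`.

```python
def is_ampere(capability):
    """Return True for Ampere-class GPUs, which report compute
    capability (8, 0) or (8, 6)."""
    major, _minor = capability
    return major == 8

print(is_ampere((8, 6)))  # 3080: True
print(is_ampere((7, 5)))  # Turing (e.g. 2060): False
```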
@chekaaa ramp up the batch size for larger gains |
What do I need to do to get this to work? Just add
@leohumnew yes |
Doesn't appear to auto-install xformers. Console log: venv "F:\StableD\stable-diffusion-automatic1111\venv\Scripts\Python.exe"
Same here; #1851 (comment) got it working for me
Sounds like you've got Python 3.9 installed - that whl won't work for me - but this one did: And it's crazy fast. @C43H66N12O12S2 thank you for your work on this PR. Keep flying high :) |
Is there any decrease in quality with this? Or should it be equivalent to without, but just a bit faster? |
So far my testing shows the same quality (have only tested a handful of samplers w/ it)
I'm running a 3090 and noticing about a 20-30% speed-up, good stuff. I have however noticed a strange issue: repeat generations with the same params can give different results. I've created an issue about it. I wonder if it's just me or if others have noticed the same thing.
Is there a guide on how to DIY the wheels for this on your local computer? I'm running an outdated Maxwell GPU but I still want to try this out anyway. |
I've built it for T4 and for P100, but only in a Colab environment. It's literally a pip install command once you have the build environment set up. I expect it's not too different on Windows. Hope you have a lot of time though. Took about 45 minutes to compile, and the first time it failed lol.
Trying to get a prebuilt xformers running on Windows x64 / Python 3.9 with a 3070. Any fixes for this?
Actually I just found a guide for windows on reddit. It's actually a little bit more involved than that, but it sounds doable. |
@htkg @duckness @githubartman @ilcane87 could you please test this wheel? https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/c/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl |
It works for me (1070)
@C43H66N12O12S2
@C43H66N12O12S2 I've built the wheels myself, so I can't test it, I think... P.S. Building the wheels took 15m on an RTX 2060 Super; got a 30-40% boost, which is awesome.
Will try after I get to my PC, if nobody tests on an RTX 2060 before then.
Closing the loop on this... ended up following the instructions to build xformers locally... was trying to avoid the 3GB of CUDA / 7GB of VC++ dev libraries, but oh well. Worked first time after |
To create a wheel, do
and there will be a wheel put in your dist folder. You can share it and/or keep it around to reinstall later. |
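A minimal sketch of the wheel-building step described above, assuming an xformers source checkout and the build prerequisites (CUDA toolkit, a matching C++ compiler) are already in place; the directory name and flags are illustrative, not taken from the PR:

```shell
# Build a wheel from the xformers source tree without building its
# dependencies; the -w flag puts the finished .whl into ./dist.
cd xformers
python -m pip wheel --no-deps -w dist .

# The wheel in ./dist can then be shared or reinstalled later:
# python -m pip install dist/xformers-*.whl
```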
I just wanted to say that the new updates solved my problems, and I am really grateful for that. It was a frustrating experience... thanks to the devs who made it possible. If I had only waited long enough, I wouldn't have had to be in a battle with the cmd all day long...
There are now official conda Linux xformers builds
Working with Python 3.10.8; was not working with 3.9.5, which gave a ModuleNotFoundError: No module named 'xformers' error.
The xformers wheel you download has to match your python version (3.8/3.9/3.10) and your cuda version (11.6/11.7/11.8) - if either mismatches it won't work. |
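The Python-version half of that mismatch can be checked mechanically, since the interpreter tag is embedded in the wheel filename. This is an illustrative helper (not part of the PR): it derives the CPython tag (e.g. cp310 for Python 3.10) and looks for it among the filename's dash-separated fields.

```python
import sys

def wheel_matches_interpreter(wheel_name, version_info=None):
    """Check whether a wheel filename's CPython tag (e.g. 'cp310')
    matches the given (major, minor) Python version."""
    if version_info is None:
        version_info = sys.version_info
    tag = "cp%d%d" % (version_info[0], version_info[1])
    return tag in wheel_name.split("-")

# A cp310 wheel imports under Python 3.10 but not 3.9:
print(wheel_matches_interpreter(
    "xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl", (3, 10)))  # True
print(wheel_matches_interpreter(
    "xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl", (3, 9)))   # False
```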
any success running xformers under WSL? |
Yeah, xformers works great for me under WSL. If you don't want to build from source, you can use the official ones for some Python, CUDA, and PyTorch combinations,
Originally posted by @danthe3rd in huggingface/diffusers#532 (comment) Note that to use deepspeed pinning (used for dreambooth) under WSL you need Windows 22H2 (released a week ago) and an updated wsl (wsl --update); otherwise it is limited to pinning 2 GB of RAM (for dreambooth it wants to pin 16GB).
This PR adds xformers-optimized cross-attention, a flag to disable it and use split attention instead, and a _maybe_init function that - for some reason - seems to be necessary for xformers to work in this instance. It also enables functorch in xformers, which further increased performance on my machine.
We still need a way for easy distribution of xformers. Otherwise, this PR is good to go (barring bugs I've not been able to perceive)
cc. @Doggettx @Thomas-MMJ @ArrowM @consciencia
PS. Much thanks to @fmassa @danthe3rd @yocabon and many others for their generous efforts to bring xformers to Windows.
I've seen a 15% improvement with batch size 1, 100 steps, 512x512 and euler_a. xFormers allows me to output 2048x2048, whereas I would previously OOM.
closes #576
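The dispatch the PR description outlines (xformers cross-attention by default, split attention behind a disable flag) can be sketched as a small selector. The names here are hypothetical, not the webui's actual code; the import of `xformers.ops` is real, but the function degrades gracefully when the package is absent.

```python
def pick_attention(disable_xformers=False):
    """Return which cross-attention path to use: xformers'
    memory-efficient attention when available and not disabled,
    otherwise the existing split-attention fallback."""
    if not disable_xformers:
        try:
            import xformers.ops  # noqa: F401  # may be absent on this machine
            return "xformers"
        except ImportError:
            return "split"
    return "split"

# The flag forces the fallback even when xformers is installed:
print(pick_attention(disable_xformers=True))  # split
```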