BUG: OOMs with c78b496 #358
Thanks for doing the bisect. It's a pain with the wait for model initialization. I have a theory about where the OOM is coming from and will look into it before bed tonight.
Can confirm. There's a memory overload somewhere. My max res is generally 512x768 but can only do 512x704 now.
I'm pretty sure that it's the variation code that just went in. Peak VRAM usage has jumped up. I'm trying to isolate the problem to understand it. Teaches me a lesson about announcing a release so soon after a major update.
Surprisingly, it started working at the max res again without me doing anything. I'm trying to isolate the problem too, but nothing stands out at first glance.
Meh. Live and learn; managing software projects is hard 😀 We should probably institute some kind of testing sign-off, like ask people to volunteer to test, and require at least one ACK per platform.
I've been testing using the "Max VRAM" line in the usage stats and the memory regression is definitely occurring at 4fe2657, which is where the variant code was folded in. For a 512x768 image, it took 7.10G of VRAM before the commit, and 7.15G afterward. Not much, but enough to push an 8G card over the edge. Probably the VRAM is being used for system graphics as well, which is why it seemed to cure itself after a while. I do not understand why the code is causing extra RAM usage when it is not being run, but I'll get it figured out.
@bakkot There is a small but significant memory usage regression that appeared when "bakkot-seed-fuzz" was merged into development. The specific commit is 2d65b03. Before the commit, the prompt "banana sushi" -W512 -H768 used 7.10G peak VRAM. After the commit the same prompt used 7.16G. It's not a large difference, but it's enough to exhaust memory on 8G GPUs running 512x768 images. Presumably the system needs some VRAM too for its display. I've been hunting and there's nothing obvious happening. Indeed, none of the variation code gets run unless the -v or -F options are specified. So it's mysterious, but the only other set of changes that went in at that time were web server-related ones, which shouldn't have an effect. When you have a chance, could you see if you can find what I'm missing? Thx
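For what it's worth, one common way code can cost memory even when it is never invoked is eager allocation at construction time. The sketch below is purely a hypothetical illustration of that pattern (it is not the actual project code, and `EagerVariationHelper`/`LazyVariationHelper` are invented names); a buffer allocated in `__init__` is paid for whether or not the feature's methods ever run, while a lazily allocated one costs nothing until first use.

```python
# Hypothetical illustration: eager vs. lazy buffer allocation.
# Not the actual dream.py code -- just the general pattern that can
# make a dormant feature consume memory.
import numpy as np


class EagerVariationHelper:
    def __init__(self):
        # Allocated as soon as the object is constructed, even if
        # the variation feature is never used (~4 MB here).
        self.noise_buffer = np.zeros((512, 512, 4), dtype=np.float32)


class LazyVariationHelper:
    def __init__(self):
        # Nothing allocated yet; cost is deferred to first use.
        self._noise_buffer = None

    @property
    def noise_buffer(self):
        if self._noise_buffer is None:
            self._noise_buffer = np.zeros((512, 512, 4), dtype=np.float32)
        return self._noise_buffer
```

If something like the eager variant were constructed during model setup, it would show up in peak memory stats even for runs that never pass -v or -F.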
Hey guess what? The discussion in #364 pointed me at an attention.py optimization that reduces the peak VRAM usage of my test prompt from 4.4G to 3.6G. When generating a 512x768 image, it uses 5.30G whereas previously it was using 7.16G. I've pulled it into refactor-simplet2i if you want to stress test it.
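The general idea behind this kind of attention optimization is slicing: instead of materializing the full (N_q × N_k) score matrix at once, the query rows are processed in chunks so the peak temporary scales with the slice size rather than the full sequence length. A minimal NumPy sketch of the technique (illustrative only; the real attention.py works on GPU tensors and the function names here are invented):

```python
# Sketch of sliced attention: same result as full attention, but only
# a (slice_size, N_k) score block is alive at any moment.
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def attention_full(q, k, v):
    # Peak temporary: the full (N_q, N_k) score matrix.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


def attention_sliced(q, k, v, slice_size=64):
    # Process query rows in chunks; peak temporary is
    # (slice_size, N_k) instead of (N_q, N_k).
    out = np.empty((q.shape[0], v.shape[1]), dtype=q.dtype)
    for i in range(0, q.shape[0], slice_size):
        s = q[i:i + slice_size] @ k.T / np.sqrt(q.shape[-1])
        out[i:i + slice_size] = softmax(s) @ v
    return out
```

This trades a little extra kernel-launch overhead for a much smaller peak allocation, which matches the observation that it lowers peak VRAM without changing the output.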
Wow. Max VRAM usage dropped from 7.xx GB to 5.3x GB at the moment. Any trade-offs? However, I've noticed that the lowered VRAM usage does not mean I can run larger resolutions. If I try to generate a larger image from the prompt, I get OOM'd anyway. Edit: I implemented just the attention code myself. I can now run 576x768, up from 512x768, but anything above that still OOMs.
@tildebyte #375 should fix the original regression, independently of other optimizations; sorry about that. |
Describe your environment
Describe the bug
dream.py immediately OOMs when trying to generate anything larger than 512x512
To Reproduce
Steps to reproduce the behavior:
winpty python scripts/dream.py -Ak_euler_a
dream> prompt
Expected behavior
No OOM
Additional context
Did an extensive manual bisect and found: 92d1ed7 - No OOMs, c78b496 - Always OOMs