MPS support for doggettx-optimizations #431
Comments
If someone knows how to query the amount of free VRAM on MPS devices, we just need to replace the torch.cuda calls.
I Googled around, and there doesn't seem to be an equivalent set of memory-interrogation calls for CPU. I'm not sure how the M1 works, but if it shares main memory (i.e. RAM) you might be able to get the needed metrics using psutil.
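A minimal sketch of that idea, assuming the M1's GPU allocates out of the same pool psutil reports (the variable names here are just illustrative):

```python
import psutil

mem = psutil.virtual_memory()
# On Apple Silicon there is no dedicated VRAM; the GPU allocates out of this pool,
# so "available" system RAM is the closest substitute for a free-VRAM figure.
print(f"total: {mem.total / 2**30:.1f} GiB, available: {mem.available / 2**30:.1f} GiB")
```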
I just hacked it all out in my fork and set slice_size to 1 :-) that gets me doing 1024x1024 (very slowly) on an 8GB M1 mini.
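For context, the slicing being referred to looks roughly like this (a simplified sketch in the spirit of the Doggettx loop, not the actual fork's code; the tensor shapes and the slice_size name follow the discussion, the rest is illustrative):

```python
import torch
from torch import einsum

def sliced_attention(q, k, v, slice_size=1, scale=1.0):
    # q, k, v: (batch*heads, tokens, dim). Computing the full tokens x tokens
    # attention matrix at once is what exhausts memory, so process slice_size
    # query rows at a time; slice_size=1 is the smallest (and slowest) setting.
    out = torch.zeros(q.shape[0], q.shape[1], v.shape[2], device=q.device, dtype=q.dtype)
    for i in range(0, q.shape[1], slice_size):
        end = i + slice_size
        attn = einsum('b i d, b j d -> b i j', q[:, i:end], k) * scale
        attn = attn.softmax(dim=-1)
        out[:, i:end] = einsum('b i j, b j d -> b i d', attn, v)
    return out
```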
It definitely works. I'll add the results below. In … I left steps at 1 and commented out …, so there's probably an improvement to be made using …. Where I did use …, it meant importing psutil.
This is looking pretty encouraging. When you are satisfied with the performance on MPS, could you make your changes conditional on the device type so that CUDA systems will work as well? Then make a PR against the doggettx-optimizations branch. Think this might be done by tonight? I'm planning a development freeze, some testing, and then pulling into main over the weekend.
That's the plan, yes, to make the changes conditional based on device type.
Okay, changes are done. I'm doing the testing.
@lstein Above are the files. Performance seems comparable to …. Regarding memory, I have to do more digging, because while this afternoon I could generate 896x896 and 1024x768 (results I couldn't generate before), now at night I'm back to memory errors. In any case, this change should benefit CUDA users while allowing MPS devices to (apparently/presumably/hopefully) function at least as well as we currently do on the development branch.
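A sketch of the kind of device-conditional memory check being described; the CUDA branch uses the statistics named later in this issue, while the function name and the MPS fallback are assumptions:

```python
import psutil
import torch

def estimate_free_memory(device: torch.device) -> int:
    """Bytes plausibly available for the attention matrix on this device."""
    if device.type == 'cuda':
        stats = torch.cuda.memory_stats(device)
        mem_active = stats['active_bytes.all.current']
        mem_reserved = stats['reserved_bytes.all.current']
        mem_free, _ = torch.cuda.mem_get_info()
        return mem_free + mem_reserved - mem_active
    # MPS (and CPU): unified memory, so available system RAM is the best proxy.
    return psutil.virtual_memory().available
```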
In the end, still awake :) The common error I think all M1 users get (…) can be worked around. It takes a long time with steps=64, but testing around, it also works with steps=32, and even steps=4 (taking much less time). Pretty nice, and it calls for some testing tomorrow. PS: I'd just merge the 2 files above and leave this "finding" for a future PR.
I can confirm that it's working and an improvement. Speedup was 2x over the plain development branch, and now I'm testing larger image sizes... Environment: development branch with the two files above swapped in. Machine: MBP 14", M1 Pro, 16GB, latest OS, running miniforge with a base of Python 3.10.6. Browser: Firefox 104.0.2.
That's odd, I've managed a very slow 1024x1024 from doggettx's optimizations on my 8GB M1: https://github.com/Vargol/stable-diffusion_m1_8gb Have you got a lot of other stuff running at the same time eating up memory?
This is my memory usage right after booting the computer, with only the model loaded + 1 VS Code tab with the code: …
and I get …
The … error appears. In this discussion https://pullanswer.com/questions/mps-mpsndarray-error-product-of-dimension-sizes-2-31 they were saying that the problem was with Metal, and that depending on the size/number of dimensions of the operation (e.g. …) it chooses different algorithms. So maybe you give it a smaller array and it fails, but feed it a bigger array and it chooses a different algorithm that doesn't have the 2**31 limit.
For example, setting the steps as you do (with a fixed value instead of calculating it), and with …
Result: with … it works. Why, it's really not apparent to me. The problem with hard-setting the steps, though, is that, as the code progresses, …
So maybe there could be a point where it fails mid-execution, because we hard-set …. The solution I'm thinking of is a mix of both techniques: setting the steps dynamically (so it doesn't run out of memory), but also setting …
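As an illustration of that mix, a sketch only: the 2**31 cap reflects the Metal limit discussed in this thread, while the multiplier and element size are assumptions:

```python
import math
import psutil

def choose_slice_size(q_shape, k_shape, bytes_per_element=4, multiplier=2.5):
    # q_shape, k_shape: (batch*heads, tokens, dim)
    b, q_tokens, _ = q_shape
    _, k_tokens, _ = k_shape
    # 1) Dynamic part: how many slices are needed to fit the attention matrix in RAM?
    mem_required = b * q_tokens * k_tokens * bytes_per_element * multiplier
    mem_free = psutil.virtual_memory().available
    steps = max(1, math.ceil(mem_required / mem_free))
    slice_size = math.ceil(q_tokens / steps)
    # 2) Hard cap: keep each (b, slice, k_tokens) einsum intermediate under 2**31
    #    elements, which otherwise appears to crash Metal regardless of free RAM.
    max_slice = (2**31 - 1) // (b * k_tokens)
    return max(1, min(slice_size, max_slice))
```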
The 2**31 seems to be einsum trying to use a Tensor with more than 2,147,483,648 values as part of its calculation. I remember having a similar issue when I simply set steps to 1 but allowed the slice_size calculation to go ahead:

```python
slice_size = q.shape[1] // steps if (q.shape[1] % steps) == 0 else q.shape[1]
```

which wasn't in an older cut of the code. Is my code doing this what you tried?

And yes, I appreciate that if people try even bigger images they may run out of memory, but for me more steps just means slower renders, and 1024x1024 is already 50 s/it; as in, n_sample=50, n_iter=1 takes 40-odd minutes to generate an image.
Looks like I was monitoring the wrong thread! I'll fold in these changes this morning and freeze development for testing. Thanks so much for this.
@Vargol I can take …. Have you tried bigger …? I tried something very similar to your code, simply with larger …. For example, for … We should be able to find a sweet spot, shouldn't we?
For example, … Hopefully there's some formula we can come up with for all M1 machines (8GB to 128GB). Update: …
So with my fixed value steps = slice_size, running 1-6 steps works; 6 steps is over 5x slower than 1 step.
7-10 steps blow memory while sampling.
steps >= 11 fail with an oversized(?) buffer before sampling shows up in dream.py.
@Vargol Hmm, I'll study your case too.
Oh, this is interesting. So my computer can take …. Okay, so what …. But I tried 10700 (one more) and it fails! I'm sure there is a formula to be found (including RAM), but at least we seem to be able to hack the max slice_size for our own devices, which is awesome! Update: So, I picked a random size. I wanted a 3200x1600 image. I used the formula and …
I'll study it a bit more, but the problem with Doggettx (besides 8192 vs 8191) is that sometimes it suggests an even larger …
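For what it's worth, a formula consistent with the numbers quoted in this thread, assuming the einsum intermediate holds q.shape[0] * slice_size * tokens elements and that q.shape[0] is 16 (two conditioning batches x 8 attention heads); treat it as a guess, not the Doggettx code:

```python
def max_slice_size(width, height, batch_heads=16):
    # Latent tokens for a WxH image: (W/8) * (H/8)
    tokens = (width // 8) * (height // 8)
    # The einsum intermediate has batch_heads * slice_size * tokens elements,
    # and Metal appears to fail once that product reaches 2**31.
    return (2**31 - 1) // (batch_heads * tokens)

print(max_slice_size(1024, 1024))  # 8191, i.e. the "8192 vs 8191" boundary mentioned above
```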
@i3oc9i can you try your max slice_size for, say, 1024x1024 on your Mac with 128GB? We might be able to work out a formula including the RAM. Or someone else with a Mac other than 64GB (which is what I have).
@Any-Winter-4079 … and fail, maybe I'm missing something? Where is the new code to test?
`dream> "test" -s50 -W832 -H832 -C7.5 -Ak_lms -S12345678` (runs, but I get noise)
Could someone with a Mac please run these lines? …
That's the best way I know to detect if it's a Mac GPU, but I couldn't find what to check it against. Thanks!
Basically the command …
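One plausible set of checks, offered only as a guess at the kind of probe being discussed:

```python
import platform
import torch

print(platform.machine())                 # 'arm64' on Apple Silicon
print(torch.backends.mps.is_built())      # this PyTorch build includes MPS support
print(torch.backends.mps.is_available())  # an MPS device can actually be used
```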
I wouldn't adjust the slice_size, because then it starts running incomplete parts of the whole array. It's best to increase the multiplier, which is probably too low then. So this part:

```python
mem_required = tensor_size * 2.5
```

probably needs more than .5 extra; could try 2.6, or if you want to be safe just put it at 3. It'll just scale up the steps a bit earlier than needed, which scales down the slice_size. On a side note, it doesn't really have to step up in powers of 2, I just found that that was faster on average. You could change this part:

…

to something like

…

then it can run at any step or slice_size (even higher than 64, but you'll crash later then anyhow due to other parts running out of memory).
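As a rough illustration of the two variants being compared (variable names are assumptions; mem_required and mem_free_total stand for the quantities computed earlier in the function):

```python
import math

mem_required = 10_000_000_000   # placeholder numbers, just to make the example runnable
mem_free_total = 3_000_000_000
q_tokens = 16384

# Power-of-two stepping (roughly the current behaviour):
steps = 1
if mem_required > mem_free_total:
    steps = 2 ** math.ceil(math.log2(mem_required / mem_free_total))

# Suggested relaxation: any integer step count that covers the work is fine.
steps = max(1, math.ceil(mem_required / mem_free_total))
slice_size = math.ceil(q_tokens / steps)
print(steps, slice_size)  # 4, 4096 with the placeholder numbers
```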
Sorry, I had a Safari window open. Now 6.99GB and 1.30s/it. But 768 is very slow/doesn't work.
As a new threshold, I propose maybe 3GB?
@netsvetaev can you run …?
It takes 30 at 35-37s/it. Hm. 768 gives an error, then goes for 150s/it.
Some advice: every time you run a dream command and it completes, exit with …
In any case, have you been able to run 1024x1024 with better results with some other code?
No, my best was around 35-40, no differences with your code. Maybe 1-2s.
You know, it might be because you have a bit less RAM available than the other person with 16GB. I can't find any other reason. Architecture? PyTorch version?
macOS 13 beta, I think.
I'm running 12.5.1. No idea if there is any performance improvement/loss with the beta. I'd assume these results are more RAM-dependent than OS-version-dependent, but who knows :)
This is the version I'm planning on doing a PR with: https://github.com/lstein/stable-diffusion/discussions/457#discussioncomment-3635644 If someone experiences a downgrade in performance vs. before, let me know.
Getting better: |
Oh, so happy to read that!
Sorry, but I don't see a speed difference with this.
Not for 64-128GB. The update is to give more speed for 16-32GB Macs.
Is it OK that I have better results on the main branch? It seems less RAM-hungry and also faster: 512 at 1.48 (yours is 1.35), 1024 at 29.7, with up to 4.2GB swap.
cherry-pick @Any-Winter-4079's invoke-ai/InvokeAI#540. This is a collaboration incorporating a lot of people's contributions -- including for example @Doggettx and the original code from @neonsecret on which the Doggettx optimizations were based (see invoke-ai/InvokeAI#431, https://github.com/sd-webui/stable-diffusion-webui/pull/771#issuecomment-1239716055). Takes exactly the same amount of time to run 8 steps as the original CompVis code does (10.4 secs, ~1.25s/it).
I'm happy to add that on the latest macOS 13 beta, 1.14 main has got faster: 1.15s/it on 512px (58s total, it was always 1:15-1:25), 8.4s/it on 768 (7:18, was 10-12 mins), and still 35s/it on 1024. UPD: after a fresh install I've got 1.38s/it at 512, 6.20s/it at 768, and 20.5s/it at 1024. So it was my own problem.
On my HPC node, I also see a remarkable variation in VRAM usage in the doggettx branch as I make small adjustments to image size. Fortunately, it is pretty stable at lower image sizes where most people will be working.
Performance improvements to generate larger images in M1 invoke-ai#431: Update attention.py, added dtype=r1.dtype to softmax.
Okay, so I've seen @lstein has added

```python
x = x.contiguous() if x.device.type == 'mps' else x
```

to ldm/modules/attention.py in the doggettx-optimizations branch, but there's another error happening now:

```
KeyError: 'active_bytes.all.current'
```

and this has to do with this function in attention.py: …

which is basically the code that detects your free memory and then splits the softmax operation in steps, to allow generating larger images.

Now, because we are on Mac, I'm not sure @lstein can help us much (unless he has one around), but I'm opening this issue for anyone who wants to collaborate in porting this functionality to M1.