Depth2Img model support: resolves #5372, partially addresses #5011 #5542

Merged on Dec 10, 2022 (1 commit)

JaySmithWpg
Contributor

@JaySmithWpg JaySmithWpg commented Dec 9, 2022

What?

Support for Stable Diffusion 2.0's Depth2Image model (LatentDepth2ImageDiffusion).

Why?

It's a great feature that preserves the overall form of an image. It's very useful for style transfers in particular.

Doesn't depthmap2mask already do this?

No. depthmap2mask is a great script for masking out the foreground or background of an image when running img2img, but it is no substitute for a depth-aware model that can repaint the entire scene.

How does it work?

Instructions:

  1. Download the 512-depth-ema.ckpt model and place it in models/Stable-diffusion
  2. Download the config and place it in the same folder as the checkpoint
  3. Rename the config to 512-depth-ema.yaml
  4. Start Stable-Diffusion-Webui, select the 512-depth-ema checkpoint and use img2img as you normally would.
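
The naming rule in steps 1 to 3 (config in the same folder as the checkpoint, same base name, .yaml extension) can be checked with a short script. This is a minimal sketch and not part of the PR: `find_missing_configs` is a hypothetical helper name, and only the models/Stable-diffusion folder comes from the instructions above.

```python
from pathlib import Path


def find_missing_configs(model_dir):
    """Return checkpoint names in model_dir that have no same-named .yaml beside them."""
    missing = []
    for ckpt in sorted(Path(model_dir).glob("*.ckpt")):
        # 512-depth-ema.ckpt is expected to sit next to 512-depth-ema.yaml
        if not ckpt.with_suffix(".yaml").is_file():
            missing.append(ckpt.name)
    return missing


if __name__ == "__main__":
    for name in find_missing_configs("models/Stable-diffusion"):
        print(f"No config found next to {name}")
```

Checkpoints that do not need their own config will be listed too, so the output is only a hint for the depth model.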

Tips:

Since the structure of the image is preserved in depth2img, you can get great results with the denoising strength set very high.

Caution:

It probably goes without saying, but you will get errors if you try to use the Depth2Image model for txt2img. It will still work with inpainting.


@LieDeath LieDeath left a comment

Good job

@ruradium

ruradium commented Dec 9, 2022

Does it support 768 model natively?

@LieDeath

LieDeath commented Dec 9, 2022

Does it support 768 model natively?

?
Stability AI never released a "768-depth" model. This uses the original 512-depth model instead.

@JaySmithWpg
Contributor Author

JaySmithWpg commented Dec 9, 2022

I found a bug where it fails if the midas/models directory doesn't already exist. It's fixed up now, pardon the late commit and merge.
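
For readers curious what a guard against a missing directory looks like, here is a minimal sketch. It is an illustration only and not the code from the commit; `ensure_dir` is a hypothetical helper name.

```python
import os


def ensure_dir(path):
    """Create path (and any missing parents) if it does not exist yet."""
    # exist_ok=True makes the call a no-op when the directory is already there
    os.makedirs(path, exist_ok=True)
    return path
```

Calling `ensure_dir(os.path.join("midas", "models"))` before a download would create the folder on the first run and do nothing on later runs.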

Does it support 768 model natively?

As LieDeath mentioned, there isn't a 768 model for depth yet. That said, it works well at larger resolutions. The depth map gives it a good structure to follow, similar to how hires fix works.

I can't promise anything, but in theory this should be able to handle other models as long as they are using LatentDepth2ImageDiffusion and MiDaS.

@clockworkwhale

clockworkwhale commented Dec 9, 2022

Testing now and it's working for me! Thanks for this, I know a lot of people were excited about this being implemented.

Also, at the time I'm posting this, your OP in this thread has a typo: it says to rename the config file to 512-depth-ema.ckpt. It should say to rename it to 512-depth-ema.yaml.

@JaySmithWpg
Contributor Author

Great catch on the typo, thanks!

@AugmentedRealityCat

I confirm this works.
I confirm this works well.
I confirm this works very very, very well.
I confirm this works almost too well. This is going to completely change the way I use Stable Diffusion.

It's so early that I have a hard time grasping how vast the landscape is behind that door you just opened.
THANK YOU @JaySmithWpg !

@patrickmac110

I get this error when trying to load it in the img2img tab:

Loading config from: A:\Desktop\00 AI Images\stable-diffusion-webui\models\Stable-diffusion\SD 2.0\Standard\512-depth-ema.yaml
LatentDepth2ImageDiffusion: Running in eps-prediction mode
DiffusionWrapper has 865.91 M params.
Error verifying pickled file from midas_models/dpt_hybrid-midas-501f0c75.pt:
Traceback (most recent call last):
File "A:\Desktop\00 AI Images\stable-diffusion-webui\modules\safe.py", line 135, in load_with_extra
check_pt(filename, extra_handler)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\modules\safe.py", line 81, in check_pt
with zipfile.ZipFile(filename) as z:
File "C:\Users\Patrick\AppData\Local\Programs\Python\Python310\lib\zipfile.py", line 1249, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'midas_models/dpt_hybrid-midas-501f0c75.pt'

The file may be malicious, so the program is not going to read it.
You can skip this check with --disable-safe-unpickle commandline argument.

Traceback (most recent call last):
File "A:\Desktop\00 AI Images\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 284, in run_predict
output = await app.blocks.process_api(
File "A:\Desktop\00 AI Images\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 982, in process_api
result = await self.call_function(fn_index, inputs, iterator)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 824, in call_function
prediction = await anyio.to_thread.run_sync(
File "A:\Desktop\00 AI Images\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "A:\Desktop\00 AI Images\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "A:\Desktop\00 AI Images\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\modules\ui.py", line 1618, in <lambda>
fn=lambda value, k=k: run_settings_single(value, key=k),
File "A:\Desktop\00 AI Images\stable-diffusion-webui\modules\ui.py", line 1459, in run_settings_single
if not opts.set(key, value):
File "A:\Desktop\00 AI Images\stable-diffusion-webui\modules\shared.py", line 473, in set
self.data_labels[key].onchange()
File "A:\Desktop\00 AI Images\stable-diffusion-webui\modules\call_queue.py", line 15, in f
res = func(*args, **kwargs)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\webui.py", line 63, in <lambda>
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: modules.sd_models.reload_model_weights()))
File "A:\Desktop\00 AI Images\stable-diffusion-webui\modules\sd_models.py", line 292, in reload_model_weights
load_model(checkpoint_info)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\modules\sd_models.py", line 260, in load_model
sd_model = instantiate_from_config(sd_config.model)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\util.py", line 79, in instantiate_from_config
return get_obj_from_str(config["target"])(**config.get("params", dict()))
File "A:\Desktop\00 AI Images\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1689, in __init__
self.depth_model = instantiate_from_config(depth_stage_config)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\util.py", line 79, in instantiate_from_config
return get_obj_from_str(config["target"])(**config.get("params", dict()))
File "A:\Desktop\00 AI Images\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\midas\api.py", line 153, in __init__
model, _ = load_model(model_type)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\midas\api.py", line 88, in load_model
model = DPTDepthModel(
File "A:\Desktop\00 AI Images\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\midas\midas\dpt_depth.py", line 105, in __init__
self.load(path)
File "A:\Desktop\00 AI Images\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\midas\midas\base_model.py", line 13, in load
if "optimizer" in parameters:
TypeError: argument of type 'NoneType' is not iterable

@JaySmithWpg
Contributor Author

JaySmithWpg commented Dec 9, 2022

Hi Patrick! I'm not sure if you're the same person I was talking to on Reddit about this, but just in case you're not, that error is what happens if you try to follow the instructions to use the model without first applying the code changes contained in this pull request. If/when this PR is merged into the main code by automatic1111, it should work for you.

@TingTingin

TingTingin commented Dec 9, 2022

Is it forced to use dpt_hybrid, or can it be used with other models? Also, what about multi-resolution merging? https://github.com/compphoto/BoostingMonocularDepth

@MrCheeze
Contributor

As currently implemented, this only allows autogenerated depth maps, right? Definitely going to want to be able to manually provide a depth map too (this guy had a quick hack for it). Not saying that necessarily has to be part of this PR though.

@JaySmithWpg
Contributor Author

@TingTingin Currently it's forced to use dpt_hybrid. I couldn't find a nice way to interrogate the model for what it uses at that point in the code, but I might revisit that if I have time tomorrow. Multi-resolution merging looks awesome, but one step at a time.

@MrCheeze, that guy is me! Currently it only allows for auto-generated depth maps, as anything more complicated will require some user interface changes. I really do want to see support added for manual depth maps; they're incredibly powerful. But I won't have much time to implement it myself until after the holidays. I hope somebody beats me to it.

@patrickmac110

Hi Patrick! I'm not sure if you're the same person I was talking to on Reddit about this, but just in case you're not, that error is what happens if you try to follow the instructions to use the model without first applying the code changes contained in this pull request. If/when this PR is merged into the main code by automatic1111, it should work for you.

It wasn't me who was talking to you on Reddit, but thanks for the reply. I'm super new to using git on Windows. Any chance you could point me in the right direction on how to temporarily use your fix until it gets merged?

@AUTOMATIC1111 AUTOMATIC1111 merged commit feeca19 into AUTOMATIC1111:master Dec 10, 2022
@AUTOMATIC1111
Owner

This is nice. Seems to work without any issues.

@LieDeath

I get this error when trying to load it in the img2img tab:

FileNotFoundError: [Errno 2] No such file or directory: 'midas_models/dpt_hybrid-midas-501f0c75.pt'
[...]
TypeError: argument of type 'NoneType' is not iterable

Isn't this simply an error caused by a bad file download?

@JaySmithWpg JaySmithWpg deleted the depth2img branch December 10, 2022 07:10
@JaySmithWpg
Contributor Author

Thanks for the merge!
@patrickmac110 now that the change has been approved and merged, things should work for you the next time you update. Just make sure you follow the config and model instructions in the PR description above.

@ilcane87

ilcane87 commented Dec 10, 2022

I don't suppose there's any way to merge the depth-enabled model with any pre-SD2.0 models, or some other trick to use this feature on them?

@DJOCKER-FACE

Same error as @LieDeath. I followed the instructions, so I'm not really sure about this one.

@LieDeath

Same error as @LieDeath. I followed the instructions, so I'm not really sure about this one.

Are you sure??? I have just run it successfully -_-|| Have you already updated the webui?
successfully_run

@DJOCKER-FACE

@LieDeath Fixed. Yes, I did update. The error was caused by dpt_hybrid-midas-501f0c75.pt: I believe the file got corrupted during the download, and there was no way to find that out. I had to redownload it, and now it seems fine.
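
One way a corrupted download like this might be detected is by hashing the file. The sketch below assumes that the hex characters at the end of the filename (501f0c75 here) are the first characters of the file's SHA-256, which is the naming convention torch.hub uses for downloaded weights. Whether this particular file follows that convention is an assumption, not something confirmed in this thread, and `sha256_of` and `matches_name_hash` are hypothetical helper names.

```python
import hashlib
import re
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so a large checkpoint need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_name_hash(path):
    """Compare the hex suffix in the filename with the start of the file's SHA-256.

    Returns None when the filename carries no hex suffix to compare against.
    ASSUMPTION: the suffix is a SHA-256 prefix (torch.hub-style naming).
    """
    match = re.search(r"-([0-9a-f]{8,})$", Path(path).stem)
    if match is None:
        return None
    return sha256_of(path).startswith(match.group(1))
```

A result of False would suggest deleting the file and downloading it again; a result of None means the filename gives nothing to check against.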

@LieDeath

@LieDeath Fixed. Yes, I did update. The error was caused by dpt_hybrid-midas-501f0c75.pt: I believe the file got corrupted during the download, and there was no way to find that out. I had to redownload it, and now it seems fine.

Well, OK. 😂

@Hernan-Barrientos

I can't use it in the free Colab :(
It runs out of RAM.

Loading weights [d0522d12] from /content/gdrive/MyDrive/sd/stable-diffusion-webui/models/Stable-diffusion/512-depth-ema.ckpt
^C

@wandrzej

Hey, thanks a ton for this useful update.
Any idea why there seems to be some "RGB leakage" (for lack of a better term), even with denoising strength set to 1?
The black-and-white depth map is still strongly present in the final image, and if the depth map isn't auto-contrasted, the whole image also gets pretty low contrast.
It is also practically impossible to get a light or colourful background if the depth map goes from white (closest) to black (far).
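
For reference, auto-contrasting a depth map is, in its simplest form, a min-max stretch of the values to the full output range. The sketch below works on plain nested lists; it is not the webui's or MiDaS's code, `auto_contrast` is a hypothetical name, and it does nothing about the leakage itself.

```python
def auto_contrast(depth, out_max=255):
    """Linearly rescale a 2D list of depth values to the range 0..out_max."""
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A completely flat map has no contrast to stretch; return all zeros
        return [[0 for _ in row] for row in depth]
    # Smallest value maps to 0, largest to out_max, everything else in between
    return [[round((v - lo) * out_max / (hi - lo)) for v in row] for row in depth]
```

A map whose values only span 100 to 200 would use less than half of an 8-bit range before the stretch and the full range after it.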

@wandrzej

image
just an example to show what's the problem.

@JaySmithWpg
Contributor Author

JaySmithWpg commented Dec 11, 2022 via email

@AugmentedRealityCat

I even hacked together a rough proof of concept (somebody else linked to it above)

I found the hack on reddit yesterday and I tried it (successfully modified processing.py) but I could not understand where I should put my depthmap for it to be read.

@JaySmithWpg
Contributor Author

JaySmithWpg commented Dec 12, 2022

I even hacked together a rough proof of concept (somebody else linked to it above)

I found the hack on reddit yesterday and I tried it (successfully modified processing.py) but I could not understand where I should put my depthmap for it to be read.

In the hack, there's a line that reads

depth_img = Image.open("/home/jay/Pictures/AI/Turrent Room/depth.png")

Replace that path with where the depthmap is.

That said, this is done entirely at your own risk. Modifying the code has the potential to cause issues with updates to the webui. I only recommend doing this if you're already familiar with git (i.e., you can confidently do a git rebase or merge).

@AugmentedRealityCat

AugmentedRealityCat commented Dec 13, 2022

Replace that path with where the depthmap is.

This is the key I was missing, but not for the reason you might imagine.
In fact, I did not have that line with the path in the hack I had installed.
The good one - with your line - is over here https://gist.github.com/JaySmithWpg/0dfb716fef567b5fbe8fbebf38dd1101
and the one I used was this one over here: https://gist.github.com/JaySmithWpg/e170dd7078a4414b56f30320d27cdc27
And now that I read the title of that page I see it's for SAVING the Midas depthmap, not for loading a custom one !

So thanks again, I would not have discovered MY own error without your help.

@AugmentedRealityCat

AugmentedRealityCat commented Dec 14, 2022

depth_img = Image.open("/home/jay/Pictures/AI/Turrent Room/depth.png")

I managed to get it to work on Windows, and here are a few hints for others who are tempted to try it.

First, if you are not familiar with diff files, just copy hack.diff in the root folder of your WebUI install, and use the following command: git apply hack.diff
For more info about the use of diff files: https://stackoverflow.com/questions/12320863/how-do-you-take-a-git-diff-file-and-apply-it-to-a-local-branch-that-is-a-copy-o

The second thing is that this was initially made for Linux, so you have to use a Windows-compatible path to refer to the depth.png file you want to use as a depthmap. There are many ways to achieve that, but the easiest is to put the full path of your file between the two double quotes (") and add the letter r in front of the opening quote.

Here is an example of this line as modified for Windows:

depth_img = Image.open(r"C:\Users\YOURUSERNAME\SDdepthsrc\depth.png")
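
The r prefix matters because, in an ordinary Python string, the \U in C:\Users starts a Unicode escape and the line fails to compile. The sketch below uses the placeholder path from the example above and only the standard library to show three equivalent spellings; it is an illustration, not part of the hack.

```python
from pathlib import PureWindowsPath

# Three ways to write the same Windows path in Python source.
raw = r"C:\Users\YOURUSERNAME\SDdepthsrc\depth.png"        # raw string: backslashes kept as-is
escaped = "C:\\Users\\YOURUSERNAME\\SDdepthsrc\\depth.png"  # every backslash doubled
forward = "C:/Users/YOURUSERNAME/SDdepthsrc/depth.png"      # forward slashes

# The raw and the escaped spelling produce the identical string.
same_string = raw == escaped

# PureWindowsPath treats / and \ as the same separator, so all three name one file.
same_path = PureWindowsPath(raw) == PureWindowsPath(forward)

print(same_string, same_path)
```

Any of the three spellings can be passed to Image.open on Windows.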

@JaySmithWpg
Contributor Author

JaySmithWpg commented Jan 7, 2023 via email

@mockinbirdy

It depends on what browser you are using, but there's usually a "Save Page As..." option hidden under the file menu somewhere.

I figured it out, thanks.

@fofr

fofr commented Jan 17, 2023

I'm seeing some weird issues with depth2img on a Mac, where the depths seem to be random rather than what they should be for the image:
#6865

@JaySmithWpg any ideas?

Screenshot 2023-01-17 at 22 03 43

@JaySmithWpg
Contributor Author

JaySmithWpg commented Jan 18, 2023 via email

@fofr

fofr commented Jan 18, 2023

@JaySmithWpg Should the depth maps be being saved somewhere? I'm not seeing any show up in the default img2img-images folder. I was hoping to use that to see just the depth information it thinks is there.

@JaySmithWpg
Contributor Author

JaySmithWpg commented Jan 18, 2023

@JaySmithWpg Should the depth maps be being saved somewhere? I'm not seeing any show up in the default img2img-images folder. I was hoping to use that to see just the depth information it thinks is there.

It's not being saved anywhere at the moment; my goal here was to support the depth model without any side effects. That said, it's a pretty easy change to save the depth output if somebody is interested in putting the change together. I have a rough version of that change here, but I haven't tested it against the latest version of the UI. Anybody is welcome to take this and make a PR out of it:
https://gist.github.com/JaySmithWpg/e170dd7078a4414b56f30320d27cdc27

(Disclaimer: I don't recommend or support applying that change manually unless you have a good working knowledge of git. If you don't know how to merge and rebase in git, you might leave your UI in a state where updates fail. My schedule will soon make it impossible for me to support people who make this change and regret it.)

@AugmentedRealityCat

Have a look at this Depth Map IO script - it's the solution to not only save the depth map used secretly by the 2.0 depth model but also to plug in your own custom depth map as an input for it - in addition to the standard RGB image for IMG2IMG. It's very useful, and very simple to use.

https://github.com/AnonymousCervine/depth-image-io-for-SDWebui

image

Big thanks to @AnonymousCervine for making this and sharing it with all of us !

@AugmentedRealityCat

AugmentedRealityCat commented Jan 23, 2023

I'm seeing some weird issues with depth2img on a Mac

Can you install this Depth-Image-IO script and check the depth map you get out of it ?

https://github.com/AnonymousCervine/depth-image-io-for-SDWebui

This might show us where the problem is coming from. For a similar problem related to 16 bit PNG check this Issue page over there.

@bypaulomeyer

What's up guys,

I'm from Brazil and I'm struggling to get help in Portuguese on the subject. I think this is the right place.

I installed depthmap2mask correctly, and I only have a problem when using the dpt_large model. The other models work.

The following error appears on the CMD screen:

File "C:\Users\bypau\Desktop\SD\stable-diffusion-webui\repositories\midas\midas\base_model.py", line 13, in load
if "optimizer" in parameters:
TypeError: argument of type 'NoneType' is not iterable

I tried to do the procedure that appears at the beginning of this page but it did not solve my problem. Thank you all for the attention.
