[ip-adapter] fix problem using embeds with the plus version of ip adapters #7189

asomoza · 2024-03-02T21:47:07Z

What does this PR do?

Allows `ip_adapter_image_embeds` to contain 4D tensors, so that embeds made with the IP Adapter PLUS versions can be passed to the pipeline.
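As a rough illustration of what accepting 4D tensors means for input validation, here is a minimal sketch of a dimensionality check. It is not the code from this PR: the helper name `check_embed_shapes` is hypothetical, it works on plain shape tuples so it runs without torch, and the example shapes are illustrative assumptions (a regular adapter producing 3D embeds, a PLUS adapter producing 4D embeds with an extra sequence dimension).

```python
from typing import Sequence, Tuple

# Assumption for this sketch: regular adapters yield 3D embeds,
# PLUS adapters yield 4D embeds.
ALLOWED_NDIMS = (3, 4)


def check_embed_shapes(shapes: Sequence[Tuple[int, ...]]) -> None:
    """Raise ValueError unless `shapes` is a list of 3D or 4D shapes."""
    if not isinstance(shapes, (list, tuple)):
        raise ValueError("embeds must be passed as a list, one entry per adapter")
    for index, shape in enumerate(shapes):
        if len(shape) not in ALLOWED_NDIMS:
            raise ValueError(
                f"embed {index} has to be 3D or 4D but is {len(shape)}D"
            )


# Illustrative shapes only, not values taken from the PR.
regular_embed = (2, 1, 1024)
plus_embed = (2, 1, 257, 1280)

# One entry per loaded adapter, mirroring the three adapters in the test script.
check_embed_shapes([regular_embed, plus_embed, plus_embed])
```

A check that only accepted 3D shapes would reject the two PLUS entries, which is the kind of failure this PR is meant to remove.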

How to test:

import torch

from diffusers import AutoPipelineForText2Image
from diffusers.utils import load_image


pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
)

pipeline.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name=[
        "ip-adapter_sdxl_vit-h.safetensors",
        "ip-adapter-plus_sdxl_vit-h.safetensors",
        "ip-adapter-plus-face_sdxl_vit-h.safetensors",
    ],
    image_encoder_folder="models/image_encoder",
)
pipeline.set_ip_adapter_scale([0.1, 0.7, 0.3])
pipeline.to("cuda")

face_image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/women_input.png")
style_folder = "https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/style_ziggy"
style_images = [load_image(f"{style_folder}/img{i}.png") for i in range(10)]

prompt = "wonderwoman"
num_images_per_prompt = 1
guidance_scale = 7.5
do_classifier_free_guidance = guidance_scale > 1


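# Precompute the image embeds once, one list entry per loaded adapter.
with torch.no_grad():
    image_embeds = pipeline.prepare_ip_adapter_image_embeds(
        [face_image, style_images, face_image],  # images, in the same order as weight_name
        None,  # no precomputed embeds yet
        "cuda",  # device
        num_images_per_prompt,
        do_classifier_free_guidance,
    )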

image = pipeline(
    prompt=prompt,
    ip_adapter_image_embeds=image_embeds,
    negative_prompt="",
    guidance_scale=guidance_scale,
    num_images_per_prompt=num_images_per_prompt,
).images[0]
image.save("result.png")

Who can review?

@yiyixuxu @sayakpaul

Also cc: @fabiorigano because of #7186

src/diffusers/pipelines/animatediff/pipeline_animatediff.py

sayakpaul

Nice! The changes look very clean and simple to me. Thank you!

Should we maybe also add a small note about this support in the IP-Adapter guide? @yiyixuxu WDYT?

HuggingFaceDocBuilderDev · 2024-03-03T04:22:46Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

asomoza · 2024-03-03T05:42:35Z

I tested all the combinations I could think of with SDXL. The image isn't that nice, but it is faster to just use all three IP adapters at the same time ^^

Do the SD 1.5 versions have different dimensions? There are six of them.

sayakpaul · 2024-03-03T05:50:59Z

> Do the SD 1.5 versions have different dimensions? There are six of them.

Could do a quick check on the checkpoints maybe?

asomoza · 2024-03-03T06:26:20Z

> Could do a quick check on the checkpoints maybe?

I did, and it worked with all of them except this one:
ip-adapter_sd15_vit-G.safetensors

RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x1280 and 1024x3072)

This also reminded me that there's one with the big image encoder for the SDXL ones, so I tested it and it didn't work either:
ip-adapter_sdxl.safetensors

RuntimeError: mat1 and mat2 shapes cannot be multiplied (82240x1664 and 1280x1280)

But those errors are not related to this PR.

sayakpaul · 2024-03-03T06:27:59Z

> This also reminded me that there's one with the big image encoder for the SDXL ones, so I tested it and it didn't work either:
> ip-adapter_sdxl.safetensors

I see. I think that needs fixing then. Would you mind opening an issue for this and we can work on that in a separate PR?

asomoza · 2024-03-03T06:42:05Z

> I see. I think that needs fixing then. Would you mind opening an issue for this and we can work on that in a separate PR?

It was an obvious mistake on my part. Both of those adapters use a different image encoder, and I was combining them with the normal ones. That was the cause of the error: adapters that use different image encoders can't be mixed, since we load a single image encoder for all of them.

They work as expected if I use them alone.
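For readers following along, both RuntimeErrors quoted above are ordinary matrix-shape mismatches: the last dimension of the left operand does not equal the first dimension of the right operand. The sketch below only reproduces that arithmetic with the shapes copied from the two error messages. The helper `matmul_compatible` is hypothetical and is not part of diffusers.

```python
from typing import Tuple


def matmul_compatible(mat1: Tuple[int, int], mat2: Tuple[int, int]) -> bool:
    """Return True if a (rows, inner) x (inner, cols) product is well defined."""
    return mat1[1] == mat2[0]


# Shapes copied from the error messages in this thread.
sd15_vit_g = ((32, 1280), (1024, 3072))
sdxl_big_encoder = ((82240, 1664), (1280, 1280))

for name, (mat1, mat2) in [
    ("ip-adapter_sd15_vit-G", sd15_vit_g),
    ("ip-adapter_sdxl", sdxl_big_encoder),
]:
    # The inner dimensions differ (1280 vs 1024, 1664 vs 1280), so both fail.
    print(name, "compatible:", matmul_compatible(mat1, mat2))
```

Both cases print `compatible: False`, which matches the explanation above: the embeds come from one image encoder while the adapter weights expect the output size of another.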

yiyixuxu

thank you!

asomoza added 3 commits March 2, 2024 16:54

initial

721159d

check_inputs fix to the rest of pipelines

7df53ca

add fix for no cfg too

7b08c7a

sayakpaul reviewed Mar 3, 2024

sayakpaul approved these changes Mar 3, 2024

Merge branch 'main' into fix-ip-adapter-plus-embeds

539f32a

use of variable

582a279

Merge branch 'main' into fix-ip-adapter-plus-embeds

1edd179

yiyixuxu approved these changes Mar 3, 2024

yiyixuxu merged commit 001b140 into huggingface:main Mar 3, 2024
15 checks passed

asomoza deleted the fix-ip-adapter-plus-embeds branch March 5, 2024 05:02
