How can I get correct IP-Adapter image embeds? I got 4D tensors and I cannot use them. #7168

dai-ichiro · 2024-03-01T08:56:39Z

Describe the bug

IP-Adapter image embeds should be 3D tensors, but I got 4D tensors.

Reproduction

import torch
from diffusers import AutoPipelineForText2Image, DDIMScheduler
from diffusers.utils import load_image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)
pipeline.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name=[
        "ip-adapter-plus_sdxl_vit-h.safetensors",
        "ip-adapter-plus-face_sdxl_vit-h.safetensors"
    ],
    image_encoder_folder="models/image_encoder"
)
pipeline.set_ip_adapter_scale([0.7, 0.3])
pipeline.enable_model_cpu_offload()

face_image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/women_input.png")
style_folder = "https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/style_ziggy"
style_images = [load_image(f"{style_folder}/img{i}.png") for i in range(10)]

image_embeds = pipeline.prepare_ip_adapter_image_embeds(
    ip_adapter_image=[style_images, face_image],
    ip_adapter_image_embeds=None,
    device="cuda",
    num_images_per_prompt=1,
    do_classifier_free_guidance=True
)
torch.save(image_embeds, "image_embeds.ipadpt")

print(f"type: {type(image_embeds)}")
print(f"len: {len(image_embeds)}")
for embeds in image_embeds:
    print(f"shape: {embeds.shape}")

The output is:

type: <class 'list'>
len: 2
shape: torch.Size([2, 10, 257, 1280])
shape: torch.Size([2, 1, 257, 1280])
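For readers decoding these shapes, here is a minimal sketch of one way to label the axes. It assumes the axis order is (classifier-free-guidance batch, number of input images, tokens, hidden size) for 4D embeds and (classifier-free-guidance batch, number of input images, embedding size) for 3D embeds, which matches the 10 style images and 1 face image in the script above but is not confirmed anywhere in this thread. `describe_embed_shape` and the axis names are hypothetical and not part of diffusers.

```python
from typing import Dict, Sequence


def describe_embed_shape(shape: Sequence[int]) -> Dict[str, int]:
    """Label the axes of one embed shape (axis order is an assumption)."""
    dims = tuple(shape)  # torch.Size is a tuple subclass, so this accepts tensor.shape too
    if len(dims) == 4:
        # Assumed layout for the Plus adapters: one token sequence per image.
        names = ("cfg_batch", "num_images", "tokens", "hidden_size")
    elif len(dims) == 3:
        # Assumed layout for the standard adapters: one vector per image.
        names = ("cfg_batch", "num_images", "embed_dim")
    else:
        raise ValueError(f"expected a 3D or 4D shape, got {len(dims)}D")
    return dict(zip(names, dims))


# The two shapes printed by the reproduction script.
for shape in [(2, 10, 257, 1280), (2, 1, 257, 1280)]:
    print(describe_embed_shape(shape))
```

With real tensors, the same helper can be called as `describe_embed_shape(embeds.shape)` inside the loop over `image_embeds`.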

3D tensors are expected, but 4D tensors are returned, and I cannot use them in the following script.

import torch
from diffusers import AutoPipelineForText2Image, DDIMScheduler

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16"
)

pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

pipeline.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name=[
        "ip-adapter-plus_sdxl_vit-h.safetensors",
        "ip-adapter-plus-face_sdxl_vit-h.safetensors"
    ],
    image_encoder_folder=None
)
pipeline.set_ip_adapter_scale([0.7, 0.8])

pipeline.to("cuda")

image_embeds_fromfile = torch.load("image_embeds.ipadpt")

generator = torch.Generator(device="cpu").manual_seed(2024)
image = pipeline(
    prompt="a woman",
    ip_adapter_image_embeds=image_embeds_fromfile,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
    num_inference_steps=50,
    guidance_scale=0,
    num_images_per_prompt=1,
    generator=generator,
).images[0]
image.save("result_from_image_embeds.png")

Logs

ValueError: `ip_adapter_image_embeds` has to be a list of 3D tensors but is 4D
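
The message suggests that the pipeline checks the rank of each entry in the list. As a rough sketch, a saved embeds file could be inspected for the same condition before calling the pipeline. `find_bad_ranks` is a hypothetical helper, not diffusers code, and which ranks are accepted (the `allowed_ndims` default here) is an assumption that may depend on the installed diffusers version.

```python
from types import SimpleNamespace
from typing import Iterable, List, Sequence, Tuple


def find_bad_ranks(embeds: Iterable, allowed_ndims: Sequence[int] = (3,)) -> List[Tuple[int, int]]:
    """Return (index, ndim) for every entry whose rank is not in allowed_ndims."""
    bad = []
    for index, embed in enumerate(embeds):
        ndim = len(embed.shape)  # works for any object exposing .shape, e.g. a torch.Tensor
        if ndim not in allowed_ndims:
            bad.append((index, ndim))
    return bad


# Stand-ins with the shapes reported above, so the sketch runs without torch.
fake_embeds = [
    SimpleNamespace(shape=(2, 10, 257, 1280)),
    SimpleNamespace(shape=(2, 1, 257, 1280)),
]
print(find_bad_ranks(fake_embeds))          # both entries are flagged as 4D
print(find_bad_ranks(fake_embeds, (3, 4)))  # nothing is flagged if 4D is allowed
```

With the real file, the list returned by `torch.load("image_embeds.ipadpt")` can be passed in place of `fake_embeds`.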

System Info

diffusers version: 0.27.0.dev0
Platform: Windows-10-10.0.22631-SP0
Python version: 3.11.6
PyTorch version (GPU?): 2.2.0+cu118 (True)
Huggingface_hub version: 0.21.3
Transformers version: 4.38.1
Accelerate version: 0.27.2
xFormers version: not installed
Using GPU in script?: Yes
Using distributed or parallel set-up in script?: No

Who can help?

@sayakpaul
@yiyixuxu

asomoza · 2024-03-01T09:32:31Z

I just noticed that I didn't test the embeds with the Plus versions. This issue occurs because the shapes are different for those. In the meantime, the embeds will work only with the normal IP Adapter.

sayakpaul · 2024-03-01T10:41:25Z

Hmm the examples here (#7016) are all 3D tensors. Did we expect to support Plus @yiyixuxu?

yiyixuxu · 2024-03-01T19:08:15Z

@sayakpaul
Yes, we do, and it's a bug I made.

asomoza · 2024-03-02T18:51:55Z

@yiyixuxu

I have a fix for this, since I was using them for my post and wanted to try the latest changes. Should I create a PR?

yiyixuxu · 2024-03-02T19:32:44Z

@asomoza
Yes, sure!

elismasilva · 2024-05-06T23:26:11Z

How can I convert a 4D embed into a 3D tensor embed?

dai-ichiro added the "bug" label Mar 1, 2024

asomoza mentioned this issue Mar 2, 2024

[ip-adapter] fix problem using embeds with the plus version of ip adapters #7189

Merged

yiyixuxu closed this as completed in #7189 Mar 3, 2024
