[bug] Fix float/int guidance scale not working in StableVideoDiffusionPipeline #7143

Merged 4 commits on Mar 6, 2024

Contributor

@JinayJain JinayJain commented Feb 29, 2024

What does this PR do?

Commit fa3c86b introduced a regression in support for disabling classifier-free guidance (CFG). Specifically, do_classifier_free_guidance no longer returned a boolean value, which caused the following error when running with max_guidance_scale = 1.

    507 latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)
    509 # Concatenate image_latents over channels dimention
--> 510 latent_model_input = torch.cat([latent_model_input, image_latents], dim=2)
    512 # predict the noise residual
    513 noise_pred = self.unet(
    514     latent_model_input,
    515     t,
   (...)
    518     return_dict=False,
    519 )[0]

RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 1 but got size 2 for tensor number 1 in the list.

This change makes do_classifier_free_guidance return a boolean again, so the pipeline runs without error when CFG is disabled.
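As a rough illustration of the idea (not the pipeline's actual code), the check can be sketched in plain Python. The function names `do_classifier_free_guidance` and `batch_multiplier` below are hypothetical free functions, and a list of per-frame scales stands in for a tensor. The assumption is that the guidance scale may arrive either as a scalar or as a per-frame sequence, and that the result must be a plain bool in both cases.

```python
from numbers import Real


def do_classifier_free_guidance(guidance_scale) -> bool:
    """Hypothetical helper: report whether CFG should be enabled.

    Accepts either a scalar scale or a sequence of per-frame scales
    (a stand-in for a tensor) and always returns a plain bool.
    """
    if isinstance(guidance_scale, Real):
        # Scalar case: a scale of 1 or less means guidance is disabled.
        return bool(guidance_scale > 1)
    # Per-frame case: guidance is needed if any frame's scale exceeds 1.
    return bool(max(guidance_scale) > 1)


def batch_multiplier(guidance_scale) -> int:
    # With CFG the inputs are duplicated (unconditional + conditional),
    # so every tensor that is later concatenated must use the same factor.
    return 2 if do_classifier_free_guidance(guidance_scale) else 1


if __name__ == "__main__":
    print(do_classifier_free_guidance(1.0))         # scalar scale of 1
    print(do_classifier_free_guidance([1.0, 3.0]))  # per-frame scales
    print(batch_multiplier(1))
```

Returning a real bool matters because a truthy non-boolean (for example the float 1.0) would enable the duplicated-batch path in one place while a later comparison disables it in another, which is one way to end up with mismatched batch sizes like those in the traceback.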

Who can review?

Tagging @patrickvonplaten who wrote the original commit.

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul
Member

Nice catch. We should definitely add a test case for this as well here: https://github.com/huggingface/diffusers/blob/main/tests/pipelines/stable_video_diffusion/test_stable_video_diffusion.py.

@yiyixuxu WDYT?

@sayakpaul sayakpaul requested a review from DN6 March 4, 2024 03:51

Collaborator

@DN6 DN6 left a comment

Looking good 👍🏽 I think a test to check disabling cfg would be a nice addition.

@JinayJain
Contributor Author

@DN6 Added a test, let me know how it looks.

@JinayJain JinayJain requested a review from DN6 March 5, 2024 17:29
Member

@sayakpaul sayakpaul left a comment

This is looking solid. Thanks!

@sayakpaul
Member

@DN6 feel free to merge once you're good!

@JinayJain
Contributor Author

Side note: I was looking at the tests and noticed that the test config sets both height and width to 32. It's probably a good idea to change those to different values, so the tests catch shape errors where height and width are mistakenly swapped.
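To make the concern concrete, here is a small sketch in plain Python. The helper `make_frame_shape` is hypothetical and deliberately buggy; it is not taken from the diffusers test suite. It only shows why a square test size can hide a swapped height/width bug.

```python
def make_frame_shape(height: int, width: int) -> tuple:
    # Deliberate bug for illustration: width and height are swapped.
    return (width, height)


def shape_is_correct(height: int, width: int) -> bool:
    # The expected layout in this sketch is (height, width).
    return make_frame_shape(height, width) == (height, width)


if __name__ == "__main__":
    # Square input: the swap is invisible, so the check passes.
    print(shape_is_correct(32, 32))
    # Non-square input: the swap changes the shape, so the check fails.
    print(shape_is_correct(32, 64))
```

With equal height and width the buggy helper still produces the expected shape, so only a non-square configuration exposes the mistake.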

@sayakpaul
Member

You're welcome to add them in a separate PR.

@DN6 DN6 merged commit 1bc0d37 into huggingface:main Mar 6, 2024
15 checks passed