refactor: move model helper function in pipeline to a mixin class #6571
Conversation
I am not comfortable changing a legacy structure here. So, I will let @patrickvonplaten comment.
@@ -2104,3 +2106,123 @@ def set_attention_slice(self, slice_size: Optional[int]):

        for module in modules:
            module.set_attention_slice(slice_size)


class EfficiencyMixin:
Can we maybe rename this:
-class EfficiencyMixin:
+class StableDiffusionMixin:
But it seems like it doesn't just affect the Stable Diffusion family. If a similar mixin class were to be designed, what would be the process? It might be better to have a base EfficiencyMixin class and then subclass it to write more pipeline-specific classes such as StableDiffusionEfficiencyMixin.

I don't like the StableDiffusionMixin name -- it's uninformative and confusing in light of DiffusionPipeline.
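A minimal sketch of that layered idea (all class names here are hypothetical, not part of the PR):

```python
class EfficiencyMixin:
    """Generic helpers for any pipeline that exposes `self.vae` and `self.unet`."""
    ...


class StableDiffusionEfficiencyMixin(EfficiencyMixin):
    """Home for helpers that really are specific to the Stable Diffusion family."""
    ...
```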
I think this is actually a very nice change! Could we maybe rename the mixin to StableDiffusionMixin, as all functions that are moved into the mixin are only applicable to stable-diffusion-like models?

I don't know if ...
        cpu_offload(self.safety_checker.vision_model, device)

    @property
    def _execution_device(self):
Would leave this as is in the Pipeline for now and not add to the Mixin.
I apologize, some of those changes should be split out into their own commit (like: remove unnecessary overload ...), but as they have been inherited from DiffusionPipeline, those functions enable_sequential_cpu_offload, _execution_device, and enable_attention_slicing could be removed without changing any behaviour?
The name is acceptable, and let's also decide which pipelines to add this mixin to. BTW, as those functions are only related to the VAE and UNet models, this mixin class is not limited to Stable Diffusion pipelines, as long as the pipeline contains a VAE and a UNet. Splitting it further into a VAEMixin and a UNetMixin seems too tedious, though it would make them available to more pipelines like PixArt-alpha.

That's why I'm also thinking about some alternative designs, but for now none of them seems perfect. For example, a more direct (but not that friendly) way is to deprecate those functions and encourage users to call them directly through the models lol. BTW, we could just add them to DiffusionPipeline if all pipelines had UNets or VAEs, but unfortunately they don't. That's why I propose this mixin class. Again, looking forward to a better design!
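For illustration, the "call them directly through the models" alternative would look roughly like this; the model id is just an example, and the component-level methods shown already exist on AutoencoderKL and UNet2DConditionModel:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Instead of pipeline-level helpers, users would call the components directly:
pipe.vae.enable_slicing()                               # ~ pipe.enable_vae_slicing()
pipe.vae.enable_tiling()                                # ~ pipe.enable_vae_tiling()
pipe.unet.enable_freeu(s1=0.9, s2=0.2, b1=1.2, b2=1.4)  # ~ pipe.enable_freeu(...)
pipe.unet.fuse_qkv_projections()                        # ~ pipe.fuse_qkv_projections()
```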
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Looks nice!
I have a major design question but apart from that, things look good!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I think we should only add this mixin to pipelines that support ALL four methods (freeu, vae_slicing, vae_tiling, fuse_qkv_projections), no? Not sure if that is the case here. I think maybe some of the pipelines only support some of the methods, e.g. AudioLDM.

We should add a TesterMixin that runs fast tests for all pipelines inheriting from this mixin for all four methods; it is also the easiest way to find out if they all support these methods, similar to what's done here: #6862
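A rough sketch of what such a tester mixin could look like (the class name and the get_pipeline() helper are hypothetical; a real version would follow the conventions of the existing pipeline test mixins):

```python
class EfficiencyMethodsTesterMixin:
    # Assumes the concrete test class provides self.get_pipeline() and, as the
    # existing pipeline tests do, inherits from unittest.TestCase.

    def test_freeu_toggle(self):
        pipe = self.get_pipeline()
        pipe.enable_freeu(s1=0.9, s2=0.2, b1=1.2, b2=1.4)
        pipe.disable_freeu()

    def test_vae_slicing_toggle(self):
        pipe = self.get_pipeline()
        pipe.enable_vae_slicing()
        pipe.disable_vae_slicing()

    def test_vae_tiling_toggle(self):
        pipe = self.get_pipeline()
        pipe.enable_vae_tiling()
        pipe.disable_vae_tiling()

    def test_fuse_qkv_projections_toggle(self):
        pipe = self.get_pipeline()
        pipe.fuse_qkv_projections()
        pipe.unfuse_qkv_projections()
```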
Gentle ping @ultranity
Thanks!! First, I'm still open to a better mixin name :)

Technically, any pipeline with a UNet2DConditionModel UNet component will support freeu and fuse_qkv_projections, and any pipeline with an AutoencoderKL VAE component will support vae_slicing, vae_tiling, and fuse_qkv_projections, which is basically the reason why this PR exists.

I will check if this TesterMixin would help.
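To make that concrete: the mixin methods are thin wrappers that just forward to those components, so any pipeline holding a compatible UNet and VAE gets them for free. A simplified sketch, not the exact PR code:

```python
class EfficiencyMixin:  # name still under discussion in this thread
    def enable_vae_slicing(self):
        self.vae.enable_slicing()

    def disable_vae_slicing(self):
        self.vae.disable_slicing()

    def enable_vae_tiling(self):
        self.vae.enable_tiling()

    def disable_vae_tiling(self):
        self.vae.disable_tiling()

    def enable_freeu(self, s1: float, s2: float, b1: float, b2: float):
        self.unet.enable_freeu(s1=s1, s2=s2, b1=b1, b2=b2)

    def disable_freeu(self):
        self.unet.disable_freeu()

    def fuse_qkv_projections(self, unet: bool = True, vae: bool = True):
        if unet:
            self.unet.fuse_qkv_projections()
        if vae:
            self.vae.fuse_qkv_projections()
```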
Force-pushed from fb94f42 to fc71e97 (compare)
Rebased this PR on the current main branch (sorry for #6871) and added an initial ...
@ultranity
b3c3de0 fixed a bug in fuse_projections to: ...

Though we probably should make this fix in a separate PR, it is causing the tests for text_to_video and i2vgen_xl to fail, so I fixed it here. @sayakpaul
@yiyixuxu @sayakpaul any updates?
Left a few nits, thanks a ton for working on this!

Can merge once these final comments are addressed and @sayakpaul confirms the change we made to the Attention class is OK.
@@ -116,6 +116,8 @@ def __init__(
        super().__init__()
        self.inner_dim = out_dim if out_dim is not None else dim_head * heads
        self.query_dim = query_dim
        self.use_bias = bias
        self.is_cross_attention = cross_attention_dim is not None
cc @sayakpaul
the change here looks good to me - can you take a look and confirm if it's ok?
-        self.to_qkv = self.linear_cls(in_features, out_features, bias=False, device=device, dtype=dtype)
+        self.to_qkv = self.linear_cls(in_features, out_features, bias=self.use_bias, device=device, dtype=dtype)
I think it's safe to always set it to False because attention layers don't use bias.
Then why does the Attention block need a bias param at init? And it actually makes tests fail in some cases where attention_bias==True.
You mean fused projections make some of the tests fail when attention bias is true?
Yes, the results will no longer be the same when bias is enabled in the original Attention but the fused projections do not have a bias.
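For context, a simplified sketch of the self-attention fusion path and why the bias has to be carried over (not the exact diffusers code):

```python
import torch
from torch import nn


def fuse_qkv(to_q: nn.Linear, to_k: nn.Linear, to_v: nn.Linear, use_bias: bool) -> nn.Linear:
    # Stack the three projection weights into one (3 * inner_dim, query_dim) matrix.
    weights = torch.cat([to_q.weight.data, to_k.weight.data, to_v.weight.data])
    fused = nn.Linear(
        weights.shape[1], weights.shape[0], bias=use_bias,
        device=weights.device, dtype=weights.dtype,
    )
    fused.weight.data.copy_(weights)
    if use_bias:
        # Skipping this step is what makes the fused outputs diverge whenever the
        # original projections were created with attention_bias=True.
        fused.bias.data.copy_(torch.cat([to_q.bias.data, to_k.bias.data, to_v.bias.data]))
    return fused
```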
As mentioned by @yiyixuxu as well, if we decide not to add projection fusion to the other models, then we don't have this case anymore. Am I right?
@@ -503,6 +504,44 @@ def disable_freeu(self):
            if hasattr(upsample_block, k) or getattr(upsample_block, k, None) is not None:
                setattr(upsample_block, k, None)

    # Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.fuse_qkv_projections
    def fuse_qkv_projections(self):
I think we don't have a use case for projection fusion for the 3D UNet yet. So, it's better not to bloat the codebase here.
@@ -474,6 +474,44 @@ def disable_freeu(self):
            if hasattr(upsample_block, k) or getattr(upsample_block, k, None) is not None:
                setattr(upsample_block, k, None)

    # Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.fuse_qkv_projections
Same, let's not add it here since we don't have the use case yet.
inputs["return_dict"] = False | ||
output_2 = pipe(**inputs) | ||
|
||
assert np.abs(output_2[0].flatten() - output_1[0].flatten()).max() < 3e-3 |
Here, when I was testing the I2VGen pipeline, I found that this pipeline's results are not that reproducible: even with the same inputs (both from inputs = self.get_dummy_inputs(device)) and with any changes like pipe.enable_vae_slicing() disabled by commenting them out, the result still failed to pass the check. But I have not figured out why. @yiyixuxu any idea?
@@ -701,6 +702,44 @@ def disable_freeu(self) -> None:
            if hasattr(upsample_block, k) or getattr(upsample_block, k, None) is not None:
                setattr(upsample_block, k, None)

    # Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.fuse_qkv_projections
Same.
@@ -697,27 +699,32 @@ def norm_encoder_hidden_states(self, encoder_hidden_states: torch.Tensor) -> torch.Tensor:

    @torch.no_grad()
    def fuse_projections(self, fuse=True):
        is_cross_attention = self.cross_attention_dim != self.query_dim
Why can't we keep it as is?
Why do we need to leave it ambiguous when we could actually check whether an Attention block is_cross_attention during init?

It's not true that all cross-attention blocks have to meet the self.cross_attention_dim != self.query_dim constraint. For example, in the I2VGen case, note that some cross-attention blocks have self.cross_attention_dim == self.query_dim.
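A simplified sketch of the change being argued for here; the real Attention class takes many more arguments:

```python
from typing import Optional

import torch


class Attention(torch.nn.Module):
    def __init__(self, query_dim: int, cross_attention_dim: Optional[int] = None, bias: bool = False):
        super().__init__()
        self.query_dim = query_dim
        self.use_bias = bias
        # Record cross-attention explicitly at construction time instead of inferring it
        # later from `cross_attention_dim != query_dim`, which misclassifies cross-attention
        # blocks whose cross_attention_dim happens to equal query_dim.
        self.is_cross_attention = cross_attention_dim is not None
        self.cross_attention_dim = cross_attention_dim if cross_attention_dim is not None else query_dim

    @torch.no_grad()
    def fuse_projections(self, fuse: bool = True):
        if self.is_cross_attention:
            ...  # fuse to_k / to_v into a single to_kv projection
        else:
            ...  # fuse to_q / to_k / to_v into a single to_qkv projection
```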
This is actually a good change regardless of whether we want to support I2VGen or not.
Left some comments.

My only concern is that we don't have to support projection fusion for all the UNets right now, as we don't know if that's going to improve the end performance with ...
Can we confirm the classes we should add this mixin to? vae: AutoencoderKL, AutoencoderTiny
I also think we don't need to support these; we are not currently supporting ...
I think we can, but we need to make sure that we are not allowing any unsupported method.
self.assertTrue(hasattr(pipe, "vae") and isinstance(pipe.vae, (AutoencoderKL, AutoencoderTiny))) | ||
self.assertTrue( | ||
hasattr(pipe, "unet") | ||
and isinstance(pipe.unet, (UNet2DConditionModel, UNet3DConditionModel, I2VGenXLUNet, UNetMotionModel)) |
@sayakpaul we decide which Unet we support here
-            and isinstance(pipe.unet, (UNet2DConditionModel, UNet3DConditionModel, I2VGenXLUNet, UNetMotionModel))
+            and isinstance(pipe.unet, UNet2DConditionModel)
That sounds neat!
@ultranity vae: AutoencoderKL, AutoencoderTiny. Sorry about the back-and-forth 🥺
As we have passed all tests for UNet3DConditionModel, I2VGenXLUNet, and UNetMotionModel, I hope we can keep them for now, for simplicity; otherwise we would need to disable fuse_qkv_projections on those UNet pipelines by checking the UNet class. We may reduce the bloated code by introducing some interface or plugin mechanics in a future PR.
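One possible shape for such a guard, if we did want to gate the method per UNet class (purely illustrative, not part of this PR):

```python
from diffusers import UNet2DConditionModel


def maybe_fuse_qkv(pipe):
    # Only fuse when the UNet class is one we have verified; otherwise skip quietly.
    if isinstance(pipe.unet, UNet2DConditionModel):
        pipe.fuse_qkv_projections()
    else:
        print(f"Skipping QKV fusion: {type(pipe.unet).__name__} is not covered yet.")
```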
@ultranity
What does this PR do?

Separate the following functions into a mixin class.

Those functions are only related to the VAE and UNet models, not the pipeline implementation, so let us move them to a mixin class instead of copying them everywhere (just like what has been done for xformers_memory_efficient_attention or attention_slice).

Alternative design: shared helper functions (_pipeline_helper_functions). A structural sketch of the mixin approach follows below.
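A structural sketch of the mixin approach (the pipeline class name is illustrative, and the mixin's import path is assumed to be where the current diff places it):

```python
# Import path for the mixin assumes it lands next to DiffusionPipeline in
# pipeline_utils.py, as in the current diff.
from diffusers.pipelines.pipeline_utils import DiffusionPipeline, EfficiencyMixin


class MyUNetVAEPipeline(EfficiencyMixin, DiffusionPipeline):
    # enable_vae_slicing/tiling, enable_freeu and fuse_qkv_projections are
    # inherited from the mixin instead of being copy-pasted into every pipeline.
    ...
```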
Before submitting

Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@patrickvonplaten and @sayakpaul