Add kolors compile #1007

lixiang007666 · 2024-07-11T08:29:02Z

This PR does the following:

Add Kolors compile support and a README.

lixiang007666 · 2024-07-11T08:33:52Z

The kolors diffusers support that this currently relies on is in:
huggingface/diffusers#8812

It has not been merged yet.

onediff_diffusers_extensions/examples/kolors/README.md

lixiang007666 · 2024-07-16T06:50:46Z

Dynamic shape logs:

oneflow:

oneflow backend compile...
Starting warmup...
Warmup complete.
Warmup time: 39.30 seconds
Generated image saved to kolors_oneflow_compile.png in 3.50 seconds.
Max used CUDA memory : 20.627GiB
Test run with multiple resolutions...
Running at resolution: 1024x1024
Inference time: 4.17 seconds
Running at resolution: 1024x768
Inference time: 2.86 seconds
Running at resolution: 1024x576
Inference time: 2.40 seconds
Running at resolution: 1024x512
Inference time: 2.20 seconds
Running at resolution: 1024x256
Inference time: 1.30 seconds
Running at resolution: 768x1024
Inference time: 2.89 seconds
Running at resolution: 768x768
Inference time: 2.42 seconds
Running at resolution: 768x576
Inference time: 2.10 seconds
Running at resolution: 768x512
Inference time: 1.68 seconds
Running at resolution: 768x256
Inference time: 1.03 seconds
Running at resolution: 576x1024
Inference time: 2.43 seconds
Running at resolution: 576x768
Inference time: 2.10 seconds
Running at resolution: 576x576
Inference time: 1.53 seconds
Running at resolution: 576x512
Inference time: 1.48 seconds
Running at resolution: 576x256
Inference time: 0.93 seconds
Running at resolution: 512x1024
Inference time: 2.22 seconds
Running at resolution: 512x768
Inference time: 1.72 seconds
Running at resolution: 512x576
Inference time: 1.47 seconds
Running at resolution: 512x512
Inference time: 1.34 seconds
Running at resolution: 512x256
Inference time: 0.86 seconds
Running at resolution: 256x1024
Inference time: 1.33 seconds
Running at resolution: 256x768
Inference time: 1.04 seconds
Running at resolution: 256x576
Inference time: 0.93 seconds
Running at resolution: 256x512
Inference time: 0.86 seconds
Running at resolution: 256x256
Inference time: 0.78 seconds
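
A log like the one above can be produced by running the pipeline once per resolution pair and timing each call. The following is a minimal sketch of such a loop, not the script from this PR: `generate` is a hypothetical stand-in for the compiled pipeline call, and treating the first number of each pair as the height is an assumption.

```python
import itertools
import time

# Sizes seen in the log above; the first dimension varies slowest, as in the log.
SIZES = [1024, 768, 576, 512, 256]


def benchmark_resolutions(generate, sizes=SIZES):
    """Time one call of `generate(height, width)` per resolution pair.

    `generate` is a hypothetical stand-in for the compiled pipeline call.
    Returns a list of ((height, width), seconds) in run order.
    """
    results = []
    for height, width in itertools.product(sizes, repeat=2):
        start = time.perf_counter()
        generate(height, width)
        elapsed = time.perf_counter() - start
        print(f"Running at resolution: {height}x{width}")
        print(f"Inference time: {elapsed:.2f} seconds")
        results.append(((height, width), elapsed))
    return results


if __name__ == "__main__":
    # Dummy workload so the sketch runs without a GPU or a model.
    benchmark_resolutions(lambda h, w: sum(range(h * w // 64)))
```

When timing real GPU work, the call has to have finished (for example, by synchronizing the device) before the clock is read, otherwise the measured time can be too short.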

nexfort:

nexfort backend compile...
Starting warmup...
Warmup complete.
Warmup time: 314.58 seconds
Generated image saved to kolors_nexfort_compile.png in 2.31 seconds.
Max used CUDA memory : 19.435GiB
Test run with multiple resolutions...
Running at resolution: 1024x1024
Inference time: 5.95 seconds
Running at resolution: 1024x768
Inference time: 4.04 seconds
Running at resolution: 1024x576
Inference time: 3.48 seconds
Running at resolution: 1024x512
Inference time: 3.13 seconds
Running at resolution: 1024x256
Inference time: 2.39 seconds
Running at resolution: 768x1024
Inference time: 4.09 seconds
Running at resolution: 768x768
Inference time: 3.50 seconds
Running at resolution: 768x576
Inference time: 3.07 seconds
Running at resolution: 768x512
Inference time: 2.45 seconds
Running at resolution: 768x256
Inference time: 2.40 seconds
Running at resolution: 576x1024
Inference time: 3.57 seconds
Running at resolution: 576x768
Inference time: 3.07 seconds
Running at resolution: 576x576
Inference time: 2.49 seconds
Running at resolution: 576x512
Inference time: 2.48 seconds
Running at resolution: 576x256
Inference time: 2.46 seconds
Running at resolution: 512x1024
Inference time: 3.18 seconds
Running at resolution: 512x768
Inference time: 2.51 seconds
Running at resolution: 512x576
Inference time: 2.49 seconds
Running at resolution: 512x512
Inference time: 2.49 seconds
Running at resolution: 512x256
Inference time: 2.46 seconds
Running at resolution: 256x1024
Inference time: 2.49 seconds
Running at resolution: 256x768
Inference time: 2.50 seconds
Running at resolution: 256x576
Inference time: 2.48 seconds
Running at resolution: 256x512
Inference time: 2.47 seconds
Running at resolution: 256x256
Inference time: 2.49 seconds
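
To compare the two backends per resolution, the logs can be parsed into dictionaries and divided entry by entry. A minimal sketch, assuming the log format shown above; only three entries from each log are copied here to keep it short, and the helper name `parse_times` is made up for illustration.

```python
import re

# Matches one "Running at resolution" line followed by its "Inference time" line.
LINE_RE = re.compile(
    r"Running at resolution: (\d+)x(\d+)\s+Inference time: ([\d.]+) seconds"
)


def parse_times(log_text):
    """Map (first_dim, second_dim) -> inference seconds from a log like the ones above."""
    return {(int(a), int(b)): float(t) for a, b, t in LINE_RE.findall(log_text)}


# Three entries copied from each of the two logs above.
oneflow_log = """
Running at resolution: 1024x1024
Inference time: 4.17 seconds
Running at resolution: 512x512
Inference time: 1.34 seconds
Running at resolution: 256x256
Inference time: 0.78 seconds
"""
nexfort_log = """
Running at resolution: 1024x1024
Inference time: 5.95 seconds
Running at resolution: 512x512
Inference time: 2.49 seconds
Running at resolution: 256x256
Inference time: 2.49 seconds
"""

oneflow = parse_times(oneflow_log)
nexfort = parse_times(nexfort_log)
for res in oneflow:
    ratio = nexfort[res] / oneflow[res]
    print(f"{res[0]}x{res[1]}: nexfort/oneflow = {ratio:.2f}")
```

In the full logs, the nexfort times stay around 2.4 to 2.5 seconds for every resolution at or below 768x512, while the oneflow times keep falling to 0.78 seconds at 256x256.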

lixiang007666 · 2024-07-18T06:54:54Z

I found that some optimizations in recent nexfort updates have made it much slower. I will track down which commit is responsible.

lixiang007666 · 2024-07-19T05:21:52Z

I found that some optimizations in recent nexfort updates have made it much slower. I will track down which commit is responsible.

pipe = compile_pipe(
    pipe,
    backend="nexfort",
    options=options,
    ignores=['text_encoder'],
    fuse_qkv_projections=True,
)

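After adding ignores=['text_encoder'], performance gets much worse, which does not seem reasonable.

I traced this to https://github.com/siliconflow/nexfort/pull/91 and found that the problem does not occur before that commit.

However, this issue does not affect merging this PR.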

lixiang007666 · 2024-07-19T05:23:19Z

TODO:

Benchmark on A100.
Quality report.

strint · 2024-07-19T08:54:02Z

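After adding ignores=['text_encoder'], performance gets much worse, which does not seem reasonable.

Kolors uses ChatGLM as its text encoder, and as I recall that model is larger than the SDXL text encoder, so the overhead there should be noticeable.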
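
One way to check how much the text encoder contributes is to time the stages separately. The following is a minimal sketch using a small context-manager timer; `encode_prompt` and `denoise` are hypothetical placeholders for the real stages, not functions from onediff or diffusers.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
stage_seconds = defaultdict(float)


@contextmanager
def timed(stage):
    """Accumulate wall-clock time spent inside the block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[stage] += time.perf_counter() - start


# Hypothetical stand-ins for the real stages; replace with the actual calls.
def encode_prompt(prompt):
    return [ord(c) for c in prompt]


def denoise(embedding, steps):
    return sum(embedding) * steps


with timed("text_encoder"):
    emb = encode_prompt("a photo of a cat")
with timed("denoise"):
    denoise(emb, steps=20)

total = sum(stage_seconds.values())
for stage, secs in stage_seconds.items():
    share = secs / total if total else 0.0
    print(f"{stage}: {secs:.6f}s ({share:.0%} of total)")
```

Running this once with the text encoder compiled and once with it ignored would show whether the gap sits in the encoding stage or elsewhere.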

onediff_diffusers_extensions/examples/kolors/README.md

onediff_diffusers_extensions/examples/kolors/text_to_image_kolors.py

onediff_diffusers_extensions/examples/kolors/README.md

Add kolors compile

ce95217

lixiang007666 requested a review from strint July 11, 2024 08:35

lixiang007666 added 3 commits July 11, 2024 16:38

Update README.md

8411096

Add files via upload

7a9b26d

Merge branch 'main' into Add_kolors_compile

31be433

strint reviewed Jul 12, 2024

onediff_diffusers_extensions/examples/kolors/README.md

strint reviewed Jul 12, 2024

onediff_diffusers_extensions/examples/kolors/README.md

Merge branch 'main' into Add_kolors_compile

d3ed12f

clackhan closed this Jul 16, 2024

lixiang007666 reopened this Jul 16, 2024

Support dynamic shape

76bd91f

strint and others added 2 commits July 18, 2024 15:03

Merge branch 'main' into Add_kolors_compile

4c072e7

Fix scripr

27f5c15

lixiang007666 and others added 5 commits July 19, 2024 16:55

Add a100 40g data

2e41a8b

Add A100-SXM4-80GB test

06dfcb6

Refine readme

c92cb2d

Delete imgs/kolors_demo.png

81ce823

Merge branch 'main' into Add_kolors_compile

290db0e

strint reviewed Jul 23, 2024

onediff_diffusers_extensions/examples/kolors/README.md (outdated)

lixiang007666 added 2 commits July 23, 2024 15:36

Refine readme

891ddb8

Refine readme

14f4905

strint reviewed Jul 23, 2024

onediff_diffusers_extensions/examples/kolors/text_to_image_kolors.py

strint reviewed Jul 23, 2024

onediff_diffusers_extensions/examples/kolors/README.md (outdated)

Refine cache time

ad350bb

lixiang007666 and others added 2 commits July 23, 2024 16:51

Refine

8ea988b

Merge branch 'main' into Add_kolors_compile

2c25780

strint reviewed Jul 23, 2024

onediff_diffusers_extensions/examples/kolors/README.md (outdated)

Update README.md

bbe44b3

strint approved these changes Jul 23, 2024

lixiang007666 merged commit a489df5 into main Jul 23, 2024
7 checks passed

lixiang007666 deleted the Add_kolors_compile branch July 23, 2024 12:04
