Commit

update cogvideo reports
feifeibear committed Oct 14, 2024
1 parent 1fe2439 commit 33dd59d
Showing 2 changed files with 4 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docs/performance/cogvideo.md
@@ -1,6 +1,8 @@
## CogVideo Performance
[Chinese Version](./cogvideo_zh.md)

Details on how to apply xDiT to CogVideoX: [Leveraging xDiT to Parallelize the Open-Sourced Video Generation Model CogVideoX](https://medium.com/@xditproject/boosting-aigc-inference-leveraging-xdit-to-parallelize-the-cogvideox-text-to-video-workflow-8128e45b36e9)

CogVideo is a text-to-video model. xDiT currently supports USP (Ulysses attention and Ring attention) and CFG parallelism to speed up inference; PipeFusion support is in progress. Because of constraints on CogVideo's video generation dimensions, the USP parallel degree is limited to 2. Combined with CFG parallelism, which splits the conditional and unconditional passes across 2 devices, xDiT can therefore use at most 4 GPUs to run CogVideo, even if the machine has more.
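
The GPU limit follows from multiplying the parallel degrees. The sketch below is only an illustration of that arithmetic, not xDiT's API: `MAX_USP_DEGREE`, `CFG_BRANCHES`, `valid_configs`, and `max_usable_gpus` are hypothetical names, and it assumes the USP degree is the product of the Ulysses and Ring degrees and that CFG parallelism contributes a factor of at most 2.

```python
from itertools import product

# Assumed limits taken from the paragraph above; names are illustrative, not xDiT's API.
MAX_USP_DEGREE = 2  # Ulysses degree x Ring degree may not exceed this for CogVideo
CFG_BRANCHES = 2    # conditional and unconditional passes of classifier-free guidance


def valid_configs(num_gpus):
    """Return (cfg, ulysses, ring) degree triples whose product equals num_gpus."""
    degrees = range(1, MAX_USP_DEGREE + 1)
    configs = []
    for cfg, ulysses, ring in product((1, CFG_BRANCHES), degrees, degrees):
        if ulysses * ring > MAX_USP_DEGREE:
            continue  # sequence parallelism beyond the USP limit is not allowed
        if cfg * ulysses * ring == num_gpus:
            configs.append((cfg, ulysses, ring))
    return configs


def max_usable_gpus():
    """Largest GPU count reachable under the assumed limits."""
    return CFG_BRANCHES * MAX_USP_DEGREE


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, valid_configs(n))
```

Under these assumptions, 4 GPUs can be used only by combining CFG parallelism with either Ulysses or Ring attention at degree 2, and no configuration exists for 8 GPUs.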

On a system with L40 (PCIe) GPUs, we compared single-GPU CogVideoX inference using the `diffusers` library against our parallelized versions when generating 49-frame (6-second) 720x480 videos.
2 changes: 2 additions & 0 deletions docs/performance/cogvideo_zh.md
@@ -1,5 +1,7 @@
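## CogVideo Performance

Details on applying xDiT to CogVideo: [Using xDiT to Run the CogVideoX Text-to-Video Workflow in Parallel on Multiple GPUs](https://medium.com/@xditproject/aigc%E6%8E%A8%E7%90%86%E5%8A%A0%E9%80%9F-%E5%88%A9%E7%94%A8xdit%E5%B9%B6%E8%A1%8Ccogvideox%E6%96%87%E7%94%9F%E8%A7%86%E9%A2%91%E6%B5%81%E7%A8%8B-86255f9979a9)

CogVideo is a text-to-video model. xDiT currently integrates USP (including Ulysses attention and Ring attention) and CFG parallelism to speed up inference, while work on PipeFusion is in progress. Because of CogVideo's constraints on video generation dimensions, the maximum USP parallel degree is 2. xDiT can therefore use at most 4 GPUs to run CogVideo, even if the machine has more.

On a platform equipped with L40 (PCIe) GPUs, we analyzed the performance difference between single-GPU CogVideoX inference based on the `diffusers` library and our parallelized versions when generating 49-frame (6-second) 720x480 videos.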