
DiT results on CIFAR10 #84

Open
yuanzhi-zhu opened this issue May 4, 2024 · 5 comments

Comments

@yuanzhi-zhu

Have you tried running DiT on the CIFAR-10 dataset?
I ran some simple experiments and found that DiT does not work well on CIFAR-10.

@tanghengjian

I also found that the sample.py script always gives the same result image for the same label.
In the DiTBlock workflow, I noticed there is no cross-attention, so I wonder whether sample variation might be a challenge for DiT?
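For context, DiT blocks condition on the class label not through cross-attention but through adaLN modulation: the label and timestep embeddings regress per-block scale, shift, and gate parameters. A minimal sketch of this mechanism (not the repo's exact code; dimensions and names are illustrative):

```python
import torch
import torch.nn as nn

class AdaLNBlock(nn.Module):
    """Simplified DiT-style block: self-attention with adaLN conditioning."""
    def __init__(self, dim):
        super().__init__()
        # no learned affine: scale/shift come from the conditioning vector
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        # regress shift, scale, gate from the label+timestep embedding c
        self.ada = nn.Linear(dim, 3 * dim)

    def forward(self, x, c):
        shift, scale, gate = self.ada(c).unsqueeze(1).chunk(3, dim=-1)
        h = self.norm(x) * (1 + scale) + shift   # condition via modulation
        h, _ = self.attn(h, h, h)                # self-attention only
        return x + gate * h                      # gated residual

x = torch.randn(2, 16, 64)   # (batch, tokens, dim)
c = torch.randn(2, 64)       # conditioning embedding
out = AdaLNBlock(64)(x, c)
print(out.shape)             # torch.Size([2, 16, 64])
```

So the label does enter every block; the determinism you see comes from the fixed sampling seed rather than from missing conditioning.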

@yuanzhi-zhu
Author

> I also found that the sample.py script always gives the same result image for the same label.
> In the DiTBlock workflow, I noticed there is no cross-attention, so I wonder whether sample variation might be a challenge for DiT?

Hi @tanghengjian, I am not sure whether your question is related to the CIFAR experiments, but did you change the seed in the sample.py script?

DiT/sample.py, line 23 at ed81ce2:

```python
torch.manual_seed(args.seed)
```
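To illustrate why a fixed seed reproduces the same image per label: with the same seed, the initial latent noise is bit-identical on every run, so the (deterministic) sampler produces the same output. A small sketch, assuming the seeding behavior above (the function name and latent shape are illustrative):

```python
import torch

def initial_noise(seed, shape=(1, 4, 32, 32)):
    """Reproduce the effect of sample.py's global seeding on the latent."""
    torch.manual_seed(seed)   # same call as in sample.py
    return torch.randn(shape)

same_a = initial_noise(0)
same_b = initial_noise(0)
diff = initial_noise(1)

print(torch.equal(same_a, same_b))  # True: same seed -> identical noise
print(torch.equal(same_a, diff))    # False: new seed -> new noise
```

Passing a different `--seed` on each run (or removing the fixed seed) should give you varied samples for the same class label.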

@tanghengjian

tanghengjian commented May 7, 2024

I ran it with the default value.
By the way, I found that the CIFAR-10 dataset is only 32x32 pixels with 10 classes, which means the y condition only ranges from 0 to 9.
Have you tested the MS-COCO dataset with the DiT model using label conditioning?


@forever208

forever208 commented Feb 10, 2025

I trained the model on CIFAR-10, and the FID is very high: about FID-50k = 10.

Regarding the results on CelebA 64x64 and FFHQ 128x128, refer to the paper:
https://arxiv.org/abs/2412.15032
