Would it be possible to choose Upsample and Downsample strategies? #13

Open
shimopino opened this issue May 27, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@shimopino

Looking at the official BigGAN implementation in TensorFlow, I found that it uses ConvTranspose2d for upsampling and Conv2d for downsampling in the ResNet block (e.g. https://github.com/taki0112/BigGAN-Tensorflow/blob/master/ops.py#L159).

I know that BigGAN implementations in PyTorch use a combination of pooling and Conv (e.g. https://github.com/ajbrock/BigGAN-PyTorch/blob/master/BigGAN.py#L341), but in my experience, I can't say for sure which performs better.
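To make the comparison concrete, here is a minimal sketch of the output-size arithmetic for the two families of operations, using the formulas from the PyTorch layer documentation. The kernel sizes, strides, and paddings are illustrative assumptions and are not taken from either linked repository.

```python
def conv_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output length of a convolution or pooling window along one spatial axis."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv_transpose_out(size, kernel, stride=1, padding=0, output_padding=0, dilation=1):
    """Output length of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1


size = 32  # assumed input resolution

# Downsampling with a strided convolution (assumed 4x4 kernel, stride 2, padding 1).
down_strided = conv_out(size, kernel=4, stride=2, padding=1)

# Downsampling with a 3x3 convolution followed by 2x2 average pooling.
down_pooled = conv_out(conv_out(size, kernel=3, padding=1), kernel=2, stride=2)

# Upsampling with a transposed convolution (assumed 4x4 kernel, stride 2, padding 1).
up_transposed = conv_transpose_out(size, kernel=4, stride=2, padding=1)

# Upsampling by 2x interpolation followed by a 3x3 convolution.
up_interp = conv_out(size * 2, kernel=3, padding=1)

print(down_strided, down_pooled, up_transposed, up_interp)  # 16 16 64 64
```

With these settings both strategies halve or double the resolution identically, so any difference would come from the learned filters rather than from the shapes.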

In the future, would it be possible to flexibly select the operation used to change the resolution of the input feature map?
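One way such a selectable operation could look is sketched below in PyTorch. The `Resample` class, its `mode` names, and the kernel settings are hypothetical and only illustrate the idea. They are not part of this repository's API or of any of the linked implementations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Resample(nn.Module):
    """Hypothetical layer that changes resolution by 2x with a chosen strategy."""

    def __init__(self, in_ch, out_ch, mode):
        super().__init__()
        self.mode = mode
        if mode in ("down_pool", "up_interp"):
            # Resolution-preserving conv; pooling or interpolation does the resizing.
            self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        elif mode == "down_conv":
            # Strided convolution halves the resolution by itself.
            self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)
        elif mode == "up_deconv":
            # Transposed convolution doubles the resolution by itself.
            self.conv = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)
        else:
            raise ValueError(f"unknown mode: {mode}")

    def forward(self, x):
        if self.mode == "up_interp":
            # Interpolate first, then convolve at the higher resolution.
            x = F.interpolate(x, scale_factor=2, mode="nearest")
            return self.conv(x)
        x = self.conv(x)
        if self.mode == "down_pool":
            # Convolve first, then average-pool down by 2.
            x = F.avg_pool2d(x, 2)
        return x


x = torch.randn(2, 8, 16, 16)
for mode in ("down_pool", "down_conv", "up_interp", "up_deconv"):
    print(mode, tuple(Resample(8, 4, mode)(x).shape))
```

A residual block could then take the mode as a constructor argument, which would make it easy to compare the strategies under otherwise identical settings.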

@kwotsin
Owner

kwotsin commented May 27, 2020

@KeisukeShimokawa This is an interesting suggestion: indeed, I'm also not sure what the performance difference between the two is. I did a deeper check and it seems like the TensorFlow version for BigGAN also uses a combination of Pool + Conv: https://github.com/google/compare_gan/blob/master/compare_gan/architectures/resnet_ops.py#L131
Perhaps this resblock structure is unique to BigGAN rather than the version from Miyato et al., but I might be wrong. Nonetheless, I think this is a very good point (and detail) to note and will certainly keep your suggestion in mind!

@kwotsin kwotsin added the enhancement New feature or request label May 27, 2020
@shimopino
Author

@kwotsin Thank you for your reply. I hadn't checked that repository. Thank you for sharing.

Reading the original BigGAN paper (arXiv) again, I found that it provides the following diagram and that a combination of pooling and Conv is used.

[image: diagram from the BigGAN paper]

I also looked at NVIDIA's SPADE repository and found that it uses a combination of pooling and Conv for its ResBlock as well (e.g. https://github.com/NVlabs/SPADE/blob/master/models/networks/discriminator.py#L46).

The official TensorFlow implementation that I cited as a reference may have been a somewhat unusual one.
