@KeisukeShimokawa This is an interesting suggestion: indeed, I'm also not sure what the performance difference between the two is. I did a deeper check, and it seems the TensorFlow version of BigGAN also uses a combination of Pool + Conv: https://github.com/google/compare_gan/blob/master/compare_gan/architectures/resnet_ops.py#L131
Perhaps this resblock structure is specific to BigGAN rather than the version from Miyato et al., but I might be wrong. Nonetheless, I think this is a very good point (and detail) to note, and I will certainly keep your suggestion in mind!
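For reference, here is a minimal PyTorch sketch of the Pool + Conv downsampling pattern used in the compare_gan ResBlock linked above. The module name is illustrative, and the exact pool/conv ordering varies between the main path and the shortcut in real implementations:

```python
import torch.nn as nn
import torch.nn.functional as F

class PoolConvDown(nn.Module):
    """Downsample by average pooling, then mix channels with a 3x3 conv."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        x = F.avg_pool2d(x, kernel_size=2)  # halve the spatial resolution
        return self.conv(x)                 # conv at the lower resolution
```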
@kwotsin Thank you for your reply. I hadn't checked that repository, so thank you for sharing it.
Reading the original BigGAN paper again (arXiv), I found that it provides a diagram showing that a combination of Pooling and Conv is employed in the ResBlocks.
Looking at a TensorFlow implementation of BigGAN, I found that it uses ConvTranspose2d for upsampling and a strided Conv2d for downsampling in the ResNet block (e.g. https://github.com/taki0112/BigGAN-Tensorflow/blob/master/ops.py#L159).
I know that the PyTorch implementation of BigGAN uses a combination of Pooling and Conv (e.g. https://github.com/ajbrock/BigGAN-PyTorch/blob/master/BigGAN.py#L341), but in my experience I can't say for sure which gives better performance.
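To make the comparison concrete, here is a minimal sketch of the two upsampling styles mentioned above; the variable names are illustrative, not from either codebase:

```python
import torch.nn as nn

in_ch, out_ch = 256, 128

# Style 1: learned upsampling with a strided transposed convolution
# (doubles the spatial resolution).
up_transposed = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4,
                                   stride=2, padding=1)

# Style 2: fixed nearest-neighbor upsampling followed by a 3x3 conv,
# as in compare_gan and BigGAN-PyTorch.
up_pool_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
)
```

A commonly cited trade-off is that transposed convolutions can introduce checkerboard artifacts, whereas upsample-then-conv avoids them at the cost of a fixed (non-learned) interpolation step.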
Would it be possible, in the future, to flexibly select the operation used to change the resolution of the input feature map?
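As a rough sketch of what such an option could look like, assuming a hypothetical `resize_op` flag (this argument does not exist in the library; it is only illustrative):

```python
import torch.nn as nn

def make_downsample(in_ch, out_ch, resize_op="pool_conv"):
    """Return a 2x downsampling module; `resize_op` is a hypothetical flag."""
    if resize_op == "pool_conv":
        # Average pool, then a 3x3 conv, as in BigGAN-PyTorch.
        return nn.Sequential(
            nn.AvgPool2d(2),
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        )
    elif resize_op == "strided_conv":
        # A single strided conv that downsamples and mixes channels at once.
        return nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1)
    else:
        raise ValueError(f"unknown resize_op: {resize_op}")

# Example usage (hypothetical):
# down = make_downsample(256, 512, resize_op="strided_conv")
```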