Commit
Init porting VQ-diffusion to diffusers
This initial commit ports the VQ-diffusion VQVAE for the ITHQ dataset to diffusers.

Add `convert_vq_diffusion_to_diffusers.py` script: The script currently converts only the VQVAE to diffusers. It will be updated to convert the whole model.

Add placeholder `VQDiffusionPipeline`: The `VQDiffusionPipeline` is added as a placeholder that wraps the VQVAE, so that the `convert_vq_diffusion_to_diffusers.py` script can use it to save the ported model.

Add `ConvAttentionBlock`: The VQVAE used for ITHQ in VQ-diffusion uses a slightly different attention block from the one already in diffusers. The `ConvAttentionBlock` uses `torch.nn.Conv2d` layers for its linear projections, whereas the existing block uses `torch.nn.Linear` layers. There are a few other minor discrepancies between the two attention blocks.

Add option to specify the embedding dimension in `VQModel`: By default, `VQModel` sets the embedding dimension to the number of latent channels. The VQ-diffusion VQVAE for ITHQ has an embedding dimension of 128, which is smaller than its 256 latent channels.
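To illustrate the `Conv2d`-versus-`Linear` difference described for `ConvAttentionBlock`, here is a minimal, dependency-free sketch. It assumes the convolutions use 1x1 kernels, which the commit message does not state. Under that assumption, a convolution applies the same linear map to every pixel's channel vector, so the two layer types compute the same values and differ only in tensor layout. The helper names `conv1x1` and `linear_over_pixels` are hypothetical and are not part of diffusers or VQ-diffusion.

```python
# Dependency-free sketch; helper names are hypothetical, not diffusers APIs.
# Assumption: the convolution has a 1x1 kernel (not stated in the commit).

def conv1x1(image, weight, bias):
    """1x1 convolution over a channels-first image of shape [C][H][W].

    weight has shape [C_out][C_in]; each output pixel mixes only the
    channels of the input pixel at the same position.
    """
    in_channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    return [
        [
            [
                bias[o] + sum(weight[o][c] * image[c][i][j] for c in range(in_channels))
                for j in range(width)
            ]
            for i in range(height)
        ]
        for o in range(len(weight))
    ]


def linear_over_pixels(image, weight, bias):
    """Flatten the image to H*W tokens of length C, then apply y = W x + b."""
    in_channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    tokens = [
        [image[c][i][j] for c in range(in_channels)]
        for i in range(height)
        for j in range(width)
    ]
    return [
        [sum(w * x for w, x in zip(row, token)) + b for row, b in zip(weight, bias)]
        for token in tokens
    ]


# Toy input: 2 channels, 2x2 pixels, projected to 3 output channels.
image = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
weight = [[1, 0], [0, 1], [1, 1]]
bias = [0, 10, -1]

conv_out = conv1x1(image, weight, bias)
linear_out = linear_over_pixels(image, weight, bias)

# Both layouts hold the same numbers: [C_out][H][W] versus [H*W][C_out].
width = len(image[0][0])
matches = all(
    conv_out[o][i][j] == linear_out[i * width + j][o]
    for o in range(len(weight))
    for i in range(len(image[0]))
    for j in range(width)
)
print(matches)
```

If the assumption holds, porting weights between the two layer types would mostly be a matter of reshaping a `[C_out][C_in][1][1]` kernel to a `[C_out][C_in]` matrix. The other discrepancies the commit mentions are not covered by this sketch.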