The course is devoted to modern generative models, mostly in their application to computer vision.
We will study the following types of generative models:
- autoregressive models,
- latent variable models,
- normalizing flow models,
- adversarial models,
- diffusion and score models.
Special attention is paid to the properties of the various classes of generative models, their interrelationships, their theoretical foundations, and methods of quality assessment.
The aim of the course is to introduce students to widely used advanced deep learning methods.
The course is accompanied by practical assignments that help you understand the principles behind the models under consideration.
- telegram: @roman_isachenko
- e-mail: roman.isachenko@phystech.edu
| # | Date | Description | Slides |
|---|---|---|---|
| 1 | September, 10 | Lecture 1: Logistics. Generative models overview and motivation. Problem statement. Divergence minimization framework. Autoregressive models (PixelCNN). | slides |
| | | Seminar 1: Introduction. Maximum likelihood estimation. Histograms. Bayes theorem. PixelCNN. | slides |
| 2 | September, 17 | Lecture 2: Normalizing Flow (NF) intuition and definition. Linear NF. Gaussian autoregressive NF. Coupling layer (RealNVP). | slides |
| | | Seminar 2: Planar and Radial Flows. Forward vs Reverse KL. | slides |
| 3 | September, 24 | Lecture 3: Forward and reverse KL divergence for NF. Latent variable models (LVM). Variational lower bound (ELBO). EM-algorithm. | slides |
| | | Seminar 3: Forward vs Reverse KL. RealNVP. | slides |
| 4 | October, 1 | Lecture 4: Amortized inference, ELBO gradients, reparametrization trick. Variational Autoencoder (VAE). NF as VAE model. Discrete VAE latent representations. | slides |
| | | Seminar 4: Gaussian Mixture Model (GMM). GMM and MLE. ELBO and EM-algorithm. GMM via EM-algorithm. Variational EM algorithm for GMM. | slides |
| 5 | October, 8 | Lecture 5: Vector quantization, straight-through gradient estimation (VQ-VAE). Gumbel-softmax trick (DALL-E). ELBO surgery and optimal VAE prior. Learnable VAE prior. | slides |
| | | Seminar 5: VAE: implementation hints. Vanilla 2D VAE coding. VAE on binarized MNIST visualization. Posterior collapse. Beta-VAE on MNIST. | slides |
| 6 | October, 15 | Lecture 6: Likelihood-free learning. GAN optimality theorem. Wasserstein distance. | slides |
| | | Seminar 6: KL vs JS divergences. Vanilla GAN in 1D coding. Mode collapse and vanishing gradients. Non-saturating GAN. | slides |
| 7 | October, 22 | Lecture 7: Wasserstein GAN (WGAN). f-divergence minimization. GAN evaluation (FID, Precision-Recall, truncation trick). | slides |
| | | Seminar 7: WGAN and WGAN-GP. | slides |
| 8 | October, 29 | Lecture 8: Langevin dynamics. Score matching (denoising score matching, Noise Conditioned Score Network (NCSN)). Forward Gaussian diffusion process. | slides |
| | | Seminar 8: StyleGAN. | slides |
| 9 | November, 5 | Lecture 9: Denoising score matching for diffusion. Reverse Gaussian diffusion process. Gaussian diffusion model as VAE. ELBO for DDPM. | slides |
| | | Seminar 9: Noise Conditioned Score Network (NCSN). | slides |
| 10 | November, 12 | Lecture 10: Denoising diffusion probabilistic model (DDPM): reparametrization and overview. Denoising diffusion as a score-based generative model. Model guidance: classifier guidance, classifier-free guidance. | slides |
| | | Seminar 10: Denoising diffusion probabilistic model (DDPM). Denoising Diffusion Implicit Models (DDIM). | slides |
| 11 | November, 19 | Lecture 11: Continuous-in-time NF and neural ODE. Continuity equation for NF log-likelihood. FFJORD and Hutchinson's trace estimator. Adjoint method for continuous-in-time NF. | slides |
| | | Seminar 11: Guidance. CLIP, GLIDE, DALL-E 2, Imagen, Latent Diffusion Model. | slides |
| 12 | November, 26 | Lecture 12: SDE basics. Kolmogorov-Fokker-Planck equation. Probability flow ODE. Reverse SDE. Variance Preserving and Variance Exploding SDEs. | slides |
| | | Seminar 12: Latent Diffusion Models. Recap and colab playground. | slides |
| 13 | December, 3 | Lecture 13: Score-based generative models through SDE. Flow matching. Conditional flow matching. Conical Gaussian paths. | slides |
| | | Seminar 13: Latent Diffusion Models. Code. | slides |
| 14 | December, 10 | Lecture 14: Conical Gaussian paths (continued). Linear interpolation. Link with diffusion and score matching. Latent space models. Course overview. | slides |
| | | Seminar 14: The Final Recap. | slides |
- 6 homeworks, 13 points each = 78 points
- cozy oral exam = 26 points
- maximum points: 78 + 26 = 104 points
- probability theory + statistics
- machine learning + basics of deep learning
- Python + PyTorch