
Implicit Generative Models Evaluation


Qualitative Evaluation

Nearest Neighbors

Real samples from the training set are displayed next to their nearest neighbors among the model's achievable generations; see the retrieval sketch after the list of cons below.

Cons :

  • Typically computed with Euclidean distance, which is very sensitive to minor perceptual perturbations
  • Overfitting to the training set makes this test trivial to pass
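
A minimal retrieval sketch, assuming `train` and `generated` are arrays of flattened images (names hypothetical):

```python
import numpy as np

def nearest_generated(train, generated):
    """For each training image, return the index of its nearest
    generated image under Euclidean distance.

    train:     (n_train, d) array of flattened real samples
    generated: (n_gen, d) array of flattened generated samples
    """
    # Pairwise squared Euclidean distances, shape (n_train, n_gen)
    dists = (
        np.sum(train**2, axis=1, keepdims=True)
        - 2.0 * train @ generated.T
        + np.sum(generated**2, axis=1)
    )
    return np.argmin(dists, axis=1)
```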

"Turing-like" tests

Measures the ability to fool human subjects with generated samples.

Cons :

  • Cumbersome and expensive; experimental hazards cause inconsistent evaluation settings between subjects
  • Fails to evaluate diversity, so overfitting models pass this test too

Visualizing Internals of the Model

Visualize representation disentanglement, latent space continuity, discriminator features and, more generally, any facet of the model's regularity.
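
As a sketch of the space-continuity check: decode evenly spaced points between two latent codes and inspect the resulting row of images (`generator` is a hypothetical decoding callable):

```python
import numpy as np

def latent_interpolation(generator, z_start, z_end, steps=8):
    """Decode evenly spaced points on the segment between two latent
    codes; abrupt jumps along the row suggest a discontinuous space."""
    alphas = np.linspace(0.0, 1.0, steps)
    zs = np.stack([(1.0 - a) * z_start + a * z_end for a in alphas])
    return generator(zs)  # one decoded image per interpolation step
```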

Image Quality Assessment Metrics

Image quality assessment (IQA) measures the quality of an image, either with reference to an original image or without one. We review here some metrics that have been used in works on generative methods for remote sensing (Wang et al. 2019, Grohnfeldt et al. 2018).

PSNR (Peak Signal to Noise Ratio)

Compares the peak power of a clean image y to the power of the corrupting noise in its corrupted version x:

$$\mathrm{PSNR}(x, y) = 10 \log_{10} \frac{\mathrm{MAX}_y^2}{\mathrm{MSE}(x, y)}, \qquad \mathrm{MSE}(x, y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2$$

where $\mathrm{MAX}_y$ is the maximum attainable pixel value (e.g. 255 for 8-bit images).

Pros : Simple and cheap to compute; widely reported, which eases comparison across works

Cons : High sensitivity to biases in brightness
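
A direct numpy implementation of the expression above (a sketch; `max_value` must match the data's dynamic range):

```python
import numpy as np

def psnr(x, y, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between a corrupted image x
    and its clean reference y."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0.0:
        return np.inf  # identical images
    return 10.0 * np.log10(max_value**2 / mse)
```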

SAM (Spectral Angle Mapper, Boardman et al. 1993)

Estimates spectral similarity by comparing the per-pixel spectra of the two images across bands.

Given a pair of N×N×d images x and y, the angle between corresponding d-dimensional spectra is averaged over pixels:

$$\mathrm{SAM}(x, y) = \frac{1}{N^2} \sum_{i,j} \arccos \frac{\langle x_{ij}, y_{ij} \rangle}{\lVert x_{ij} \rVert \, \lVert y_{ij} \rVert}$$

where $x_{ij}, y_{ij} \in \mathbb{R}^d$ are the spectra at pixel (i, j).

Variations :

  • Kernel-SAM : use kernel trick on base SAM expression

Pros : Invariant to the magnitude of the spectra, hence robust to global illumination or gain differences

Cons : Purely pixelwise; ignores spatial structure
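
A numpy sketch of the pixel-averaged expression above, for (N, N, d) arrays:

```python
import numpy as np

def sam(x, y, eps=1e-8):
    """Mean spectral angle (radians) between two (N, N, d) images."""
    dot = np.sum(x * y, axis=-1)                   # <x_ij, y_ij>
    norms = np.linalg.norm(x, axis=-1) * np.linalg.norm(y, axis=-1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)  # guard rounding errors
    return float(np.mean(np.arccos(cos)))
```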

SSIM (Structural Similarity Index, Wang et al. 2004)

Estimates structural disparities based on luminosity, contrast and structure for a pair of image windows x and y:

$$\mathrm{SSIM}(x, y) = l(x, y)^{\alpha} \, c(x, y)^{\beta} \, s(x, y)^{\gamma}$$

with luminosity, contrast and structure terms

$$l(x, y) = \frac{2 \mu_x \mu_y + c_1}{\mu_x^2 + \mu_y^2 + c_1}, \quad c(x, y) = \frac{2 \sigma_x \sigma_y + c_2}{\sigma_x^2 + \sigma_y^2 + c_2}, \quad s(x, y) = \frac{\sigma_{xy} + c_3}{\sigma_x \sigma_y + c_3}$$

where $\mu$, $\sigma$ and $\sigma_{xy}$ denote window means, standard deviations and covariance, and the $c_i$ are small stabilizing constants.

Pros : Finds large-scale mode collapse reliably

Cons : Fails to diagnose smaller effects such as loss of variation in colors and textures, and does not assess quality in terms of similarity to the dataset

Variations :

  • ESSIM: adds edge information
  • MS-SSIM: multi-scale comparison
  • FSIM: compares phase congruency and gradient magnitude
  • CW-SSIM: compares complex wavelet transform (deals with issues of image scaling, translation and rotation)
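
For reference, scikit-image ships an implementation with local windowing; a usage sketch (the `channel_axis` argument assumes a recent scikit-image version):

```python
import numpy as np
from skimage.metrics import structural_similarity

x = np.random.rand(128, 128, 3)  # stand-ins for a real/generated pair
y = np.random.rand(128, 128, 3)

score = structural_similarity(x, y, data_range=1.0, channel_axis=-1)
```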

Sharpness Difference (SD)

Pretty self-explanatory ? 😄 It measures how well image gradients, hence sharpness, are preserved between x and its reference y; one common formulation (Mathieu et al. 2016) is

$$\mathrm{SD}(x, y) = 10 \log_{10} \frac{\mathrm{MAX}_y^2}{\frac{1}{N} \sum_{i,j} \left| \nabla x_{ij} - \nabla y_{ij} \right|}$$

where $\nabla z_{ij} = |z_{i,j} - z_{i-1,j}| + |z_{i,j} - z_{i,j-1}|$ sums the absolute horizontal and vertical gradients at pixel (i, j).

Pros : Directly targets blurriness, which pixelwise losses such as MSE tend to encourage

Cons : Captures a single facet of quality; says nothing about content fidelity or sample diversity
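
A numpy sketch of the gradient-difference formulation above, assuming single-channel images:

```python
import numpy as np

def grad_sum(z):
    """Sum of absolute horizontal and vertical gradients per pixel."""
    gi = np.abs(np.diff(z, axis=0))[:, 1:]  # |z[i,j] - z[i-1,j]|
    gj = np.abs(np.diff(z, axis=1))[1:, :]  # |z[i,j] - z[i,j-1]|
    return gi + gj

def sharpness_difference(x, y, max_value=255.0):
    """Sharpness difference (dB); higher means gradients closer to y's."""
    diff = np.mean(np.abs(grad_sum(x) - grad_sum(y)))
    if diff == 0.0:
        return np.inf  # identical gradient fields
    return 10.0 * np.log10(max_value**2 / diff)
```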


Table of IQA metrics

| Category | Metric | Comment | Ref | Implementation |
| --- | --- | --- | --- | --- |
| Full-reference, error/distortion-based | Mean Absolute Error | - | - | `np.abs(x - y).mean()` |
| | Mean Squared Error | - | - | `np.square(x - y).mean()` |
| | PSNR | - | - | |
| | SVD-distortion | averages stretcher deviation by block | | None found |
| | Distortion Measure | didn't understand this one quite well | | None found |
| Full-reference, similarity-based | Structural Content | ratio of sums of squares | - | `np.mean(y**2 / x**2)` |
| | Mutual Information | - | - | |
| | Cross-Correlation | - | - | |
| | Spectral Angle Mapper | easy to implement | | None found |
| | Universal Index | lesser version of SSIM | | |
| | Structural Similarity Index (SSIM) | structure × luminosity × contrast | | |
| | Multiscale-SSIM | same, but over multiple image scales | | |
| | Features-SSIM | phase congruency and gradient magnitude | | None found |
| | Complex-Wavelet-SSIM | handles scaling, translation and rotation | | |
| No-reference | BRISQUE | estimates asymmetric generalized Gaussian params on the MSCN distribution; requires training | | |
| | GMLOGQA | gradient magnitude and Laplacian of Gaussian response; requires training | | |
| | ILNIQE | estimates Weibull params fitting gradient magnitude; requires training | | |
| | SSEQ | spatial and spectral entropy features; requires training | | |
| | ENIQA | improved SSEQ with multiple scales, Log-Gabor filters and a bandwise approach; requires training | | |
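
The full-reference one-liners from the table can be grouped into named callables for batch evaluation (a small usage sketch; x is the generated image, y the reference):

```python
import numpy as np

# Full-reference metrics listed in the table, as named callables
full_reference_metrics = {
    "mae": lambda x, y: np.abs(x - y).mean(),
    "mse": lambda x, y: np.square(x - y).mean(),
    "structural_content": lambda x, y: np.mean(y**2 / x**2),
}

def evaluate(x, y):
    """Compute every tabulated one-liner for a single image pair."""
    return {name: float(fn(x, y)) for name, fn in full_reference_metrics.items()}
```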

Probabilistic Measures

To be completed. As of now this is not a priority in the context of virtual remote sensing product generation: we have access to the generation ground truth and would rather focus on evaluation procedures based on comparison to it.

References