SISR-Survey

An investigation project for SISR. [Paper]

This repository is an official project of the paper "From Beginner to Master: A Survey for Deep Learning-based Single-Image Super-Resolution".

Purpose

Because of page and time limits, the paper cannot introduce every SISR method or keep up with the latest ones. We therefore use this project to complement the survey and cover more methods. It will be updated continuously. We hope it helps more researchers and promotes the development of image super-resolution, and we welcome researchers to help maintain it!

Abstract

Single-image super-resolution (SISR) is an important task in image processing, which aims to enhance the resolution of imaging systems. Recently, SISR has made a huge leap and has achieved promising results with the help of deep learning (DL). In this survey, we give an overview of DL-based SISR methods and group them according to their targets, such as reconstruction efficiency, reconstruction accuracy, and perceptual accuracy. Specifically, we first introduce the problem definition, research background, and the significance of SISR. Secondly, we introduce some related works, including benchmark datasets, upsampling methods, optimization objectives, and image quality assessment methods. Thirdly, we provide a detailed investigation of SISR and give some domain-specific applications of it. Fourthly, we present the reconstruction results of some classic SISR methods to intuitively know their performance. Finally, we discuss some issues that still exist in SISR and summarize some new trends and future directions. This is an exhaustive survey of SISR, which can help researchers better understand SISR and inspire more exciting research in this field.

Taxonomy

Datasets

Benchmark datasets for single-image super-resolution (SISR).

SINGLE-IMAGE SUPER-RESOLUTION

Reconstruction Efficiency Methods

Reconstruction Accuracy Methods

Perceptual Quality Methods

Further Improvement Methods

DOMAIN-SPECIFIC APPLICATIONS

Real-World SISR

In real-world scenarios the degradation is complex and unknown: downsampling is usually performed after anisotropic blurring, and signal-dependent noise is sometimes added. Recently, several new techniques have been proposed to handle this, such as unsupervised learning, self-supervised learning, zero-shot learning, meta-learning, blind SISR, and scale-arbitrary SISR. In this part, we introduce the latter three because of their foresight and versatility.
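The degradation described above (anisotropic blur, then downsampling, then signal-dependent noise) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the degradation pipeline of any listed paper: the function names, the kernel size and widths, the direct subsampling, and the Poisson-like noise whose variance grows with intensity are all hypothetical choices. Images are plain lists of rows with values in [0, 1].

```python
import math
import random


def anisotropic_gaussian_kernel(size=7, sigma_u=2.0, sigma_v=0.8, theta=0.5):
    """Normalized Gaussian kernel with different widths along two rotated axes.

    All default values are illustrative assumptions.
    """
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for j in range(-half, half + 1):
        row = []
        for i in range(-half, half + 1):
            # Rotate pixel offsets into the kernel's principal axes.
            u = cos_t * i + sin_t * j
            v = -sin_t * i + cos_t * j
            row.append(math.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2)))
        kernel.append(row)
    total = sum(map(sum, kernel))
    return [[w / total for w in row] for row in kernel]


def blur(image, kernel):
    """Filter a 2D image with the kernel, replicating edge pixels.

    The kernel is not flipped; for this point-symmetric Gaussian the result
    equals a true convolution.
    """
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy, kernel_row in enumerate(kernel):
                yy = min(max(y + dy - half, 0), h - 1)
                for dx, weight in enumerate(kernel_row):
                    xx = min(max(x + dx - half, 0), w - 1)
                    acc += weight * image[yy][xx]
            out[y][x] = acc
    return out


def degrade(hr, scale=4, noise_gain=0.001, seed=0):
    """Blur, subsample by `scale`, then add intensity-dependent Gaussian noise."""
    rng = random.Random(seed)
    blurred = blur(hr, anisotropic_gaussian_kernel())
    # Direct subsampling: keep every `scale`-th pixel in both directions.
    lr = [row[::scale] for row in blurred[::scale]]
    noisy = []
    for row in lr:
        noisy_row = []
        for p in row:
            # Assumed noise model: variance proportional to pixel intensity.
            sigma = math.sqrt(noise_gain * max(p, 0.0))
            noisy_row.append(min(max(p + rng.gauss(0.0, sigma), 0.0), 1.0))
        noisy.append(noisy_row)
    return noisy


# Usage: a 32x32 checkerboard degraded to an 8x8 low-resolution image.
hr = [[float((x // 8 + y // 8) % 2) for x in range(32)] for y in range(32)]
lr = degrade(hr, scale=4)
print(len(lr), len(lr[0]))
```

A blind SISR method has to recover the high-resolution image without being told the kernel or noise level used here, which is what makes the real-world setting harder than the fixed bicubic setting.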

Blind SISR

[1] Learning A Single Convolutional Super-Resolution Network for Multiple Degradations

[2] Deep Plug-and-Play Super-Resolution for Arbitrary Blur Kernels

[3] Unified Dynamic Convolutional Network for Super-Resolution with Variational Degradations

[4] Learning the Non-Differentiable Optimization for Blind Super-Resolution

[5] Deep Unfolding Network for Image Super-Resolution

[6] Blind Super-Resolution with Iterative Kernel Correction

[7] Unfolding the Alternating Optimization for Blind Super Resolution

[8] KOALAnet: Blind Super-Resolution using Kernel-Oriented Adaptive Local Adjustment

[9] KernelNet: A Blind Super-Resolution Kernel Estimation Network

[10] Unsupervised Image Super-Resolution Using Cycle-in-Cycle Generative Adversarial Networks

Meta-Learning

[1] Meta-Transfer Learning for Zero-Shot Super-Resolution

[2] Fast Adaptation to Super-Resolution Networks via Meta-Learning

[3] Meta-USR: A Unified Super-Resolution Network for Multiple Degradation Parameters

Scale Arbitrary SISR

[1] Meta-SR: A Magnification-Arbitrary Network for Super-Resolution

[2] Meta-USR: A Unified Super-Resolution Network for Multiple Degradation Parameters

[3] Learning A Single Network for Scale-Arbitrary Super-Resolution

Remote Sensing Image Super-Resolution

With the development of satellite image processing, remote sensing has become increasingly important. However, the limitations of current imaging sensors (limited spatial, spectral, and radiometric resolution) and complex atmospheric conditions pose major challenges for remote sensing applications.

[1] A New Deep Generative Network for Unsupervised Remote Sensing Single-Image Super-Resolution

[2] Deep Residual Squeeze and Excitation Network for Remote Sensing Image Super-Resolution

[3] Remote Sensing Image Super-Resolution via Mixed High-order Attention Network

[4] Remote Sensing Image Super-Resolution Using Second-Order Multi-Scale Networks

Hyperspectral Image Super-Resolution

In contrast to human eyes, which perceive only visible light, hyperspectral imaging collects and processes information across the entire electromagnetic spectrum. Hyperspectral systems are often constrained by the limited amount of incident energy, so there is a trade-off between spatial and spectral resolution. Hyperspectral image super-resolution is studied to address this problem.

[1] Hyperspectral Image Spatial Super-Resolution via 3D Full Convolutional Neural Network

[2] Single Hyperspectral Image Super-Resolution with Grouped Deep Recursive Residual Network

[3] Hyperspectral Image Super-Resolution with Optimized RGB Guidance

[4] Learning Spatial-Spectral Prior for Super-Resolution of Hyperspectral Imagery

[5] A Spectral Grouping and Attention-Driven Residual Dense Network for Hyperspectral Image Super-Resolution

Light Field Image Super-Resolution

A light field (LF) camera captures information about the light field emanating from a scene and can provide multiple views of that scene. LF images are becoming increasingly important because they can be used for post-capture refocusing, depth sensing, and de-occlusion. However, LF cameras face a trade-off between spatial and angular resolution. SR technology is introduced to strike a good balance between the two.

[1] Light-field Image Super-Resolution Using Convolutional Neural Network

[2] LFNet: A novel Bidirectional Recurrent Convolutional Neural Network for Light-field Image Super-Resolution

[3] Spatial-Angular Interaction for Light Field Image Super-Resolution

[4] Light Field Image Super-Resolution Using Deformable Convolution

Face Image Super-Resolution

Face image super-resolution is the best-known application of SR technology to domain-specific images. Because of its potential applications in facial recognition systems, such as security and surveillance, it has become an active area of research.

[1] Learning Face Hallucination in the Wild

[2] Deep Cascaded Bi-Network for Face Hallucination

[3] Hallucinating Very Low-Resolution Unaligned and Noisy Face Images by Transformative Discriminative Autoencoders

[4] Super-Identity Convolutional Neural Network for Face Hallucination

[5] Exemplar Guided Face Image Super-Resolution without Facial Landmarks

[6] Robust Facial Image Super-Resolution by Kernel Locality-Constrained Coupled-Layer Regression

Medical Image Super-Resolution

Medical imaging methods such as computed tomography (CT) and magnetic resonance imaging (MRI) are essential to clinical diagnosis and surgical planning. Hence, high-resolution medical images are desirable because they provide the necessary visual information about the human body. Recently, many methods have been proposed for medical image super-resolution.

[1] Efficient and Accurate MRI Super-Resolution Using A Generative Adversarial Network and 3D Multi-Level Densely Connected Network

[2] CT-Image of Rock Samples Super Resolution Using 3D Convolutional Neural Network

[3] Channel Splitting Network for Single MR Image Super-Resolution

[4] SAINT: Spatially Aware Interpolation Network for Medical Slice Synthesis

Depth Map Super-Resolution

The depth map is an image or image channel that contains information relating to the distance of the surfaces of scene objects from a viewpoint. The use of depth information of a scene is essential in many applications such as autonomous navigation, 3D reconstruction, human-computer interaction, and virtual reality. However, depth sensors, such as Microsoft Kinect and Lidar, can only provide depth maps of limited resolutions. Hence, depth map super-resolution has drawn more and more attention recently.

[1] Deep Depth Super-Resolution: Learning Depth Super-Resolution Using Deep Convolutional Neural Network

[2] ATGV-Net: Accurate Depth Super-Resolution

[3] Depth Map Super-Resolution by Deep Multi-Scale Guidance

[4] Deeply Supervised Depth Map Super-Resolution as Novel View Synthesis

[5] Perceptual Deep Depth Super-Resolution

[6] Channel Attention based Iterative Residual Learning for Depth Map Super-Resolution

Stereo Image Super-Resolution

Dual cameras are widely used to estimate depth information. Stereo imaging can also be applied to image restoration: a stereo image pair contains two images whose disparity is much larger than one pixel, so making full use of both images can enhance the spatial resolution.

[1] Enhancing the Spatial Resolution of Stereo Images Using A Parallax Prior

[2] Learning Parallax Attention for Stereo Image Super-Resolution

[3] Parallax Attention for Unsupervised Stereo Correspondence Learning

[4] Flickr1024: A Large-Scale Dataset for Stereo Image Super-Resolution

[5] A Stereo Attention Module for Stereo Image Super-Resolution

[6] Symmetric Parallax Attention for Stereo Image Super-Resolution

[7] Deep Bilateral Learning for Stereo Image Super-Resolution

[8] Stereoscopic Image Super-Resolution with Stereo Consistent Feature

[9] Feedback Network for Mutually Boosted Stereo Image Super-Resolution and Disparity Estimation

Video Super-Resolution

As an emerging medium, video has attracted increasing attention because it carries more information than a still image: a video consists of multiple frames, each of which is an image, so it provides more scene information. However, high-resolution video is difficult to obtain because of limits on network transmission and device storage, which makes video super-resolution (VSR) essential. In VSR, making full use of the inter-frame temporal dependency (e.g., motion, brightness, and color changes) is beneficial for high-quality video reconstruction.

[1] Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation

[2] Robust Video Super-Resolution with Learned Temporal Dynamics

[3] TDAN: Temporally-Deformable Alignment Network for Video Super-Resolution

[4] Video Super-Resolution with Recurrent Structure-Detail Network

[5] Video Super-Resolution Transformer

RECONSTRUCTION RESULTS

PSNR/SSIM comparison of lightweight SISR models (fewer than 1000K parameters) on Set5 (x4), Set14 (x4), and Urban100 (x4). The training datasets and the number of model parameters are also provided. Models are sorted by PSNR on Set5 in ascending order. Best results are highlighted.

PSNR/SSIM comparison of large SISR models (more than 1M parameters, M = million) on Set5 (x4), Set14 (x4), and Urban100 (x4). The training datasets and the number of model parameters are also provided. Models are sorted by PSNR on Set5 in ascending order. Best results are highlighted.
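For readers new to the metric, the PSNR values in these comparisons are based on the mean squared error between the reconstructed image and the ground truth. The sketch below shows the standard definition only. The function name `psnr` and the 8-bit peak value of 255 are assumptions, and the exact evaluation protocol behind the tables (for example, which channel is used or whether borders are cropped) is not stated in this README, so none of that is modeled here.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    Images are lists of rows of numbers; `max_value` is the assumed peak
    intensity (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Usage: two pixels off by 10 gray levels in a 2x2 image (MSE = 50).
reference = [[100, 100], [100, 100]]
estimate = [[110, 90], [100, 100]]
value = psnr(reference, estimate)
print(f"{value:.2f} dB")
```

Higher PSNR means lower pixel-wise error. SSIM, the second number in each table entry, instead compares local structure and is not covered by this sketch.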

@article{li2021beginner,
  title={From Beginner to Master: A Survey for Deep Learning-based Single-Image Super-Resolution},
  author={Li, Juncheng and Pei, Zehua and Zeng, Tieyong},
  journal={arXiv preprint arXiv:2109.14335},
  year={2021}
}