Self-Supervised Diffusion Model for Single-Image Super-Resolution Reconstruction

  • Abstract: Deep-learning-based methods have achieved significant breakthroughs in image super-resolution (SR). Their success depends on large numbers of paired low-resolution (LR) and high-resolution (HR) images for training the SR model. However, obtaining that many one-to-one corresponding real HR/LR image pairs is challenging, and models trained on synthetic image pairs tend to perform poorly on images whose degradation differs from that of the training set. This paper proposes a self-supervised diffusion model for single-image super-resolution (SSDM-SR) that removes this dependence on the dataset and thereby avoids these problems. The method uses a diffusion model to learn the information distribution within a single image and trains a small, image-specific diffusion model for the image to be super-resolved. The training data are extracted solely from that image, so SSDM-SR can adapt to different input images. In addition, coordinate information is incorporated to help construct the overall structure of the image, which accelerates convergence. Experiments on several public benchmark datasets and on datasets with unknown degradation kernels show that SSDM-SR outperforms recent supervised and unsupervised SR methods in terms of image distortion and also generates images with higher perceptual quality. On real-world LR images, it produces visually pleasing results without obvious artifacts.
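The abstract states that the training data come only from the image to be super-resolved and that coordinate information is added, but it gives no implementation details. The following is a minimal sketch of one way such self-supervised pairs could be built, assuming a box-filter downsampling step and normalized (y, x) coordinate channels; the function names, the downsampling choice, and the patch sampling scheme are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def make_coord_channels(height, width):
    """Return a (2, H, W) array of normalized (y, x) coordinates in [-1, 1]."""
    ys = np.linspace(-1.0, 1.0, height, dtype=np.float32)
    xs = np.linspace(-1.0, 1.0, width, dtype=np.float32)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([grid_y, grid_x], axis=0)


def downsample_box(image, scale):
    """Box-filter downsample a (C, H, W) image by an integer factor.

    Assumption: a simple box filter stands in for the unknown degradation.
    """
    c, h, w = image.shape
    h, w = h - h % scale, w - w % scale  # crop so both sides divide evenly
    cropped = image[:, :h, :w]
    blocks = cropped.reshape(c, h // scale, scale, w // scale, scale)
    return blocks.mean(axis=(2, 4))


def sample_training_pairs(lr_image, scale, patch, count, rng):
    """Sample (condition, target) patch pairs from a single image.

    target:    a crop of the given image, (C, patch*scale, patch*scale)
    condition: the matching crop of its downsampled copy, concatenated
               with two coordinate channels, (C + 2, patch, patch)
    """
    child = downsample_box(lr_image, scale)
    coords = make_coord_channels(child.shape[1], child.shape[2])
    pairs = []
    for _ in range(count):
        top = int(rng.integers(0, child.shape[1] - patch + 1))
        left = int(rng.integers(0, child.shape[2] - patch + 1))
        rows = slice(top, top + patch)
        cols = slice(left, left + patch)
        condition = np.concatenate([child[:, rows, cols], coords[:, rows, cols]], axis=0)
        target = lr_image[:, top * scale:(top + patch) * scale,
                          left * scale:(left + patch) * scale]
        pairs.append((condition, target))
    return pairs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((3, 64, 64)).astype(np.float32)  # stand-in for a real LR image
    pairs = sample_training_pairs(image, scale=2, patch=8, count=4, rng=rng)
    print(pairs[0][0].shape, pairs[0][1].shape)
```

In this sketch the coordinate channels are computed over the whole downsampled image before cropping, so each patch keeps its absolute position. That is one plausible reading of how coordinate information could help a model recover the overall image structure.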

     
