ZERO-SHOT IMAGE RESTORATION USING DENOISING DIFFUSION NULL-SPACE MODEL

Abstract

Most existing Image Restoration (IR) models are task-specific and thus cannot generalize to different degradation operators. In this work, we propose the Denoising Diffusion Null-Space Model (DDNM), a novel zero-shot framework for arbitrary linear IR problems, including but not limited to image super-resolution, colorization, inpainting, compressed sensing, and deblurring. DDNM only needs a pre-trained off-the-shelf diffusion model as the generative prior, without any extra training or network modifications. By refining only the null-space contents during the reverse diffusion process, we can yield diverse results satisfying both data consistency and realness. We further propose an enhanced and robust version, dubbed DDNM+, to support noisy restoration and improve restoration quality for hard tasks. Our experiments on several IR tasks reveal that DDNM outperforms other state-of-the-art zero-shot IR methods. We also demonstrate that DDNM+ can solve complex real-world applications, e.g., old photo restoration.

1. INTRODUCTION

Image Restoration (IR) is a long-standing problem due to its extensive application value and its ill-posed nature (Richardson, 1972; Andrews & Hunt, 1977). IR aims at yielding a high-quality image $\hat{x}$ from a degraded observation $y = Ax + n$, where $x$ stands for the original image and $n$ represents additive noise. $A$ is a known linear operator, which may be a bicubic downsampler in image super-resolution, a sampling matrix in compressed sensing, or even a composite type.

Traditional IR methods are typically model-based, whose solution can usually be formulated as

$\hat{x} = \arg\min_{x} \frac{1}{2\sigma^2}\|Ax - y\|_2^2 + \lambda R(x)$.   (1)

The first term $\frac{1}{2\sigma^2}\|Ax - y\|_2^2$ optimizes the result toward data consistency, while the image-prior term $\lambda R(x)$ regularizes the result with formulaic prior knowledge of the natural image distribution, e.g., sparsity and Tikhonov regularization. Though such hand-designed priors may suppress some artifacts, they often fail to bring realistic details.

The prevalence of deep neural networks (DNNs) brings new patterns of solving IR tasks (Dong et al., 2015), which typically train an end-to-end DNN $D_\theta$ by optimizing the network parameters $\theta$ following

$\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{N}\|D_\theta(y_i) - x_i\|_2^2$,   (2)

where $N$ pairs of degraded image $y_i$ and ground-truth image $x_i$ are needed to learn the mapping from $y$ to $x$ directly. Although end-to-end learning-based IR methods avoid explicitly modeling the degradation $A$ and the prior term in Eq. 1 and are fast during inference, they usually lack interpretability. Some efforts have been made in exploring interpretable DNN structures (Zhang & Ghanem, 2018; Zhang et al., 2020); however, they still yield poor performance when facing domain shift, since Eq. 2 essentially encourages learning the direct mapping from $y_i$ to $x_i$. For the same reason, end-to-end learning-based IR methods usually need to train a dedicated DNN for each specific task, lacking generalizability and flexibility in solving diverse IR tasks.
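As a toy illustration (not from the paper), the model-based objective in Eq. 1 can be solved in closed form when the prior is a simple Tikhonov term $R(x) = \|x\|_2^2$: setting the gradient to zero gives $(A^\top A/\sigma^2 + 2\lambda I)\,\hat{x} = A^\top y/\sigma^2$. The dimensions and hyperparameters below are arbitrary choices for demonstration only.

```python
import numpy as np

# Toy model-based IR: argmin_x (1/(2*sigma^2))||A x - y||^2 + lam * ||x||^2
# with a Tikhonov prior R(x) = ||x||^2. Zeroing the gradient yields the
# closed-form normal equations: (A^T A / sigma^2 + 2*lam*I) x = A^T y / sigma^2.
rng = np.random.default_rng(0)
n, m = 8, 4                          # signal dim, measurement dim (illustrative)
A = rng.standard_normal((m, n))      # known linear degradation operator
x_true = rng.standard_normal(n)
sigma, lam = 0.1, 1e-2
y = A @ x_true + sigma * rng.standard_normal(m)

lhs = A.T @ A / sigma**2 + 2 * lam * np.eye(n)
x_hat = np.linalg.solve(lhs, A.T @ y / sigma**2)
print(np.linalg.norm(A @ x_hat - y))  # small data-consistency residual
```

Note how the result trades data consistency against the prior: a larger `lam` pulls `x_hat` toward the prior (here, small norm) at the cost of a larger residual, which mirrors the realness-versus-consistency tension discussed above.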
The evolution of generative models (Goodfellow et al., 2014; Bahat & Michaeli, 2014; Van Den Oord et al., 2017; Karras et al., 2019; 2020; 2021) further pushes the end-to-end learning-based IR methods toward unprecedented performance in yielding realistic results (Yang et al., 2021; Wang et al., 2021; Chan et al., 2021; Wang et al., 2022). At the same time, some methods (Menon et al., 2020; Pan et al., 2021) start to leverage the latent space of pretrained generative models to solve IR problems in a zero-shot way. Typically, they optimize the following objective:

$\hat{w} = \arg\min_{w} \frac{1}{2\sigma^2}\|AG(w) - y\|_2^2 + \lambda R(w)$,   (3)

where $G$ is the pretrained generative model, $w$ is the latent code, $G(w)$ is the corresponding generative result, and $R(w)$ constrains $w$ to its original distribution space, e.g., a Gaussian distribution. However, this type of method often struggles to balance realness and data consistency.

The Range-Null space decomposition (Schwab et al., 2019; Wang et al., 2023) offers a new perspective on the relationship between realness and data consistency: data consistency is only related to the range-space contents, which can be analytically calculated. Hence the data term can be strictly guaranteed, and the key problem is to find proper null-space contents that make the result satisfy realness. We notice that the emerging diffusion models (Ho et al., 2020; Dhariwal & Nichol, 2021) are ideal tools to yield such null-space contents because they support explicit control over the generation process. In this paper, we propose a novel zero-shot solution for various IR tasks, which we call the Denoising Diffusion Null-Space Model (DDNM). By refining only the null-space contents during the reverse diffusion sampling, our solution only requires an off-the-shelf diffusion model to yield realistic and data-consistent results, without any extra training or optimization, nor any modifications to network structures.
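The range-null space decomposition mentioned above can be sketched numerically: any candidate $x$ splits as $x = A^{\dagger}Ax + (I - A^{\dagger}A)x$, and replacing the range-space part with $A^{\dagger}y$ makes $A\hat{x} = y$ hold exactly, regardless of the null-space content. The following minimal numpy demonstration (toy dimensions, not the paper's code) verifies this.

```python
import numpy as np

# Range-null space decomposition: for any candidate x,
#   x = A_pinv A x  (range-space part)  +  (I - A_pinv A) x  (null-space part).
# Overwriting the range-space part with A_pinv y guarantees A x_hat = y exactly
# (for noise-free y), while the null-space part may be filled freely,
# e.g., by a generative model.
rng = np.random.default_rng(0)
m, n = 3, 6
A = rng.standard_normal((m, n))        # "fat" matrix: nontrivial null space
A_pinv = np.linalg.pinv(A)             # Moore-Penrose pseudo-inverse
x_gt = rng.standard_normal(n)
y = A @ x_gt                           # noise-free observation

x_free = rng.standard_normal(n)        # stand-in for any generated sample
x_hat = A_pinv @ y + (np.eye(n) - A_pinv @ A) @ x_free

print(np.allclose(A @ x_hat, y))       # True: data consistency holds exactly
```

The key identity is $AA^{\dagger}A = A$, so the null-space term vanishes under $A$ and only the range-space term $AA^{\dagger}y = y$ survives; the choice of `x_free` affects realness but never data consistency.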
Extensive experiments show that DDNM outperforms state-of-the-art zero-shot IR methods in diverse IR tasks, including super-resolution, colorization, compressed sensing, inpainting, and deblurring. We further propose an enhanced version, DDNM+, which significantly elevates the generative quality and supports solving noisy IR tasks. Our methods are free from domain shifts in degradation modes and thus can flexibly solve complex IR tasks with real-world degradation, such as old photo restoration. Our approaches reveal a promising new path toward solving IR tasks in a zero-shot way, as the data consistency is analytically guaranteed and the realness is ensured by the pre-trained diffusion prior.
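To make the "refining only the null-space contents during reverse sampling" idea concrete, here is a heavily simplified numerical sketch of one reverse step, based on the description above: predict a clean estimate, project its range-space part onto the observation, then re-noise. The `denoise_fn` is a hypothetical placeholder for a pre-trained diffusion model's $x_0$ prediction, and the re-noising rule is a crude DDPM-style simplification, not the paper's exact sampler.

```python
import numpy as np

# Simplified sketch of a DDNM-style reverse step (toy, 1-D signals).
# denoise_fn is a stand-in for a pre-trained diffusion model's x0 estimate.
def ddnm_step(x_t, t, y, A, A_pinv, alpha_bar, denoise_fn, rng):
    x0_t = denoise_fn(x_t, t)                 # 1) predict clean estimate x0|t
    n = x_t.shape[0]
    # 2) null-space refinement: keep generated null-space content,
    #    overwrite range-space content with the observation.
    x0_hat = A_pinv @ y + (np.eye(n) - A_pinv @ A) @ x0_t
    # 3) re-noise to level t-1 (crude simplification of the DDPM posterior).
    ab_prev = alpha_bar[t - 1] if t > 0 else 1.0
    noise = rng.standard_normal(n) if t > 0 else 0.0
    return np.sqrt(ab_prev) * x0_hat + np.sqrt(1 - ab_prev) * noise

rng = np.random.default_rng(0)
m, n = 3, 6
A = rng.standard_normal((m, n))
A_pinv = np.linalg.pinv(A)
y = A @ rng.standard_normal(n)                # noise-free observation
alpha_bar = np.linspace(1.0, 0.01, 10)        # toy noise schedule
denoise = lambda x_t, t: x_t                  # placeholder "denoiser"
x = rng.standard_normal(n)
for t in range(9, -1, -1):
    x = ddnm_step(x, t, y, A, A_pinv, alpha_bar, denoise, rng)
print(np.allclose(A @ x, y))                  # True: final result is data-consistent
```

Even with a trivial placeholder denoiser, the final output satisfies $Ax_0 = y$ exactly, because the last step returns a range-space-corrected estimate; a real diffusion model would additionally make the null-space content realistic.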



Figure 1: We use DDNM+ to solve various image restoration tasks in a zero-shot way. Here we show some of the results that best characterize our method, where $y$ is the input degraded image and $x_0$ represents the restoration result. Part (a) shows the results of DDNM+ on image super-resolution (SR) from scale 2x to extreme scale 256x. Note that DDNM+ assures strict data consistency. Part (b) shows multiple results of DDNM+ on inpainting and colorization. Part (c) shows the results of DDNM+ on SR with synthetic noise and colorization with real-world noise. Part (d) shows the results of DDNM+ on old photo restoration. All the results here are yielded in a zero-shot way.

Funding

* Equal contribution. This work was supported in part by Shenzhen Research Project under Grant JCYJ20220531093215035 and Grant JSGGZD20220822095800001.

