AN UNSUPERVISED DEEP LEARNING APPROACH FOR REAL-WORLD IMAGE DENOISING

Abstract

Designing an unsupervised image denoising approach in practical applications is a challenging task due to the complicated data acquisition process. In the realworld case, the noise distribution is so complex that the simplified additive white Gaussian (AWGN) assumption rarely holds, which significantly deteriorates the Gaussian denoisers' performance. To address this problem, we apply a deep neural network that maps the noisy image into a latent space in which the AWGN assumption holds, and thus any existing Gaussian denoiser is applicable. More specifically, the proposed neural network consists of the encoder-decoder structure and approximates the likelihood term in the Bayesian framework. Together with a Gaussian denoiser, the neural network can be trained with the input image itself and does not require any pre-training in other datasets. Extensive experiments on real-world noisy image datasets have shown that the combination of neural networks and Gaussian denoisers improves the performance of the original Gaussian denoisers by a large margin. In particular, the neural network+BM3D method significantly outperforms other unsupervised denoising approaches and is competitive with supervised networks such as DnCNN, FFDNet, and CBDNet.

1. INTRODUCTION

Noise always exists during the process of image acquisition and its removing is important for image recovery and vision tasks, e.g., segmentation and recognition. Specifically, the noisy image y is modeled as y = x + n, where x denotes the clean image, n denotes the corrupted noise and image denoising aims at recovering x from y. Over the past two decades, this problem has been extensively explored and many works have been proposed. Among these works, one typical kind of model assumes that the image is corrupted by additive white Gaussian noise (AWGN), i.e., n ∼ N (0, σ 2 I) where N (0, 1) is the standard Gaussian distribution. Representative Gaussian denoising approaches include block matching and 3D filtering (BM3D) (Dabov et al., 2007b) , non-local mean method (NLM) (Buades et al., 2005) , K-SVD (Aharon et al., 2006) and weighted nuclear norm minimization (WNNM) (Gu et al., 2014) , which perform well on AWGN noise removal. However, the AWGN assumption seldom holds in practical applications as the noise is accumulated during the whole imaging process. For example, in typical CCD or CMOS cameras, the noise depends on the underlying context (daytime or nighttime, static or dynamic, indoor or outdoor, etc.) and the camera settings (shutter speed, ISO, white balance, etc.). In Figure 1 , two real noisy images captured by Samsung Galaxy S6 Edge and Google Pixel smartphones are chosen from Smartphone Image Denoising Dataset (SIDD) (Abdelhamed et al., 2018) and three 40 × 40 patches are chosen for illustration of noisy distribution. It is clear that real noise distribution is content dependent and noise in each patch has different statistical properties which can be non-Gaussian. Due to the violation of the AWGN assumption, the performance of

