RETINEXUTV: ROBUST RETINEX MODEL WITH UNFOLDING TOTAL VARIATION

Abstract

Digital images are underexposed due to poor scene lighting or hardware limitations, reducing visibility and level of detail in the image, which will affect subsequent high-level tasks and image aesthetics. Therefore, it is of great practical significance to enhance low-light images. Among existing low-light image enhancement techniques, Retinex-based methods are the focus today. However, most Retinex methods either ignore or poorly handle noise during enhancement, which can produce unpleasant visual effects in low-light image enhancement and affect high-level tasks. In this paper, we propose a robust low-light image enhancement method RetinexUTV, which aims to enhance low-light images well while suppressing noise. In RetinexUTV, we propose an adaptive illumination estimation unfolded total variational network, which approximates the noise level of the real low-light image by learning the balance parameter of the total variation regularization term of the model, obtains the noise level map and the smooth noise-free sub-map of the image. The initial illumination map is then estimated by obtaining the illumination information of the smooth sub-map. The initial reflection map is obtained through the initial illumination map and original image. Under the guidance of the noise level map, the noise of the reflection map is suppressed, and finally it is multiplied by the adjusted illumination map to obtain the final enhancement result. We test our method on real low-light datasets LOL, VELOL, and experiments demonstrate that our method outperforms state-of-theart methods.

1. INTRODUCTION

Recording people's lives by taking pictures or videos is becoming more and more popular. However, since most users lack professional shooting skills, many photos are captured in less than ideal lighting conditions such as night time and backlight. Such images have low contrast, strong noise, and unclear details, which not only affect the human visual experience, but also limit the application of many computer vision algorithms, such as object recognition and object detection. There are several approaches to enhance low-light images, including histogram equalization (Abdullah-Al-Wadud et al., 2007; Pizer, 1990) , inverse domain operations (Li et al., 2015; Zhang et al., 2016) , Retinex decomposition (Xiao & Shi, 2013; Jobson et al., 1997a; b; Herscovitz & Yadid-Pecht, 2004 ) and deep learning (Yang et al., 2016; Lore et al., 2017) . Histogram equalization-based methods flatten the histogram and stretch the dynamic range of intensity, thereby amplifying the illumination of low-light images. If noise is not specifically considered, noise and artifacts will be amplified in its results. Some researchers noticed the similarity between haze images and inverted low-light images. Therefore, these inverse domain-based methods apply de-overlapping methods to enhance low-light images. To jointly adjust illumination and suppress noise, a method based on Retinex theory is proposed. Methods based on Retinex decomposition treat the scene in the human eye as the product of the reflection layer and the illumination layer. Enhanced results are produced by adjusting the corresponding layers. The earliest methods directly regard the decomposed reflection layer as the enhancement result (Jobson et al., 1997b; a; Xiao & Shi, 2013; Herscovitz & Yadid-Pecht, 2004) . Single-scale Retinex (SSR) (Jobson et al., 1997b) and Multi-scale Retinex (MSR) (Jobson et al., 1997a) utilize Gaussian filters to build Retinex representations. In (Xiao & Shi, 2013) , a bilateral filter is used to remove halo artifacts. Later methods adjust the illumination and reflection layers and reconstruct the enhanced result by combining them. In (Kimmel et al., Due to the interpretability of the Retinex theory and the ease of modeling, more and more methods have been proposed to combine deep learning with Retinex. The first one to combine deep learning with Retinex is RetinexNet (Wei et al., 2018) , which uses a decomposition network and a brightness adjustment network. The two sub-networks complete the enhancement of the dark light image, and the subsequent methods are improved on this basis, such as adding a denoising network to the decomposed reflection image, and adding a brightness correction network to the decomposed lighting network, such as KinD (Zhang et al., 2019) . These methods are all trained under the expected noise model, but still lack robustness in real dark-light environments because the noise exhibits different noise levels. In our proposed method, we use the unfolding total variational model to estimate the noise level map and the illumination map, then the reflection map is obtained through the illumination map, and the reflection map is denoised through the guidance of the noise level map. In the Retinex theory, the anticipate illumination map needs to be smooth enough in space, while the reflection map needs edge details to represent the essence of the object. In this paper, we consider noise as a non-negligible factor in Retinex-based decomposition. The proposed model and method are noiseaware throughout the process, rather than in the form of individual ad-hoc operations. Compared with previous methods that only consider light noise modeling, this paper also aims to model and remove strong noise in low-light images (Wang et al., 2020) . Firstly, we build a unfolding total variational model to estimate the noise level map and noise-free smooth map, and then obtain the illumination information of the noise-free smooth map to obtain the original illumination map. Then according to the retienex theory, the original illumination map and the original image are calculated to obtain the original reflection map, and the reflection map is denoised by the guidance of the noise level map. The illumination map is adjusted by the light adjustment network to obtain the adjusted illumination map. Finally, multiply them to get the enhanced image. Specific process is shown in Figure 1 . The contributions of this paper are mainly reflected in the following aspects:



Figure1: The proposed framework. Our sequential processing is divided into two stages. In the first stage, the illumination map and noise level map are obtained by using the unfolding total variation. In the second stage, the reflection map is obtained by calculating the illumination map and the original image. Input the reflection map and the noise level map into the non-blind denoising subnetwork to suppress the noise to obtain reflection map without noise, then adjusted with the illumination map to obtain an enhanced denoised image.

