SELECTIVE FREQUENCY NETWORK FOR IMAGE RESTORATION

Abstract

Image restoration aims to reconstruct the latent sharp image from its corrupted counterpart. Besides dealing with this long-standing task in the spatial domain, a few approaches seek solutions in the frequency domain, given the large discrepancy between the spectra of sharp/degraded image pairs. However, these works commonly utilize transformation tools, e.g., wavelet transform, to split features into several frequency parts, which is not flexible enough to select the most informative frequency component to recover. In this paper, we exploit a multi-branch and content-aware module to decompose features into separate frequency subbands dynamically and locally, and then accentuate the useful ones via channel-wise attention weights. In addition, to handle large-scale degradation blurs, we propose an extremely simple decoupling and modulation module to enlarge the receptive field via global and window-based average pooling. Integrating the two developed modules into a U-Net backbone, the proposed Selective Frequency Network (SFNet) performs favorably against state-of-the-art algorithms on five image restoration tasks: single-image defocus deblurring, image dehazing, image motion deblurring, image desnowing, and image deraining.

1. INTRODUCTION

Image restoration aims to recover a high-quality image by removing degradations, e.g., noise, blur, and snowflakes. In view of its important role in surveillance, self-driving techniques, and remote sensing, image restoration has gathered considerable attention from industrial and academic communities. However, due to its ill-posed property, many conventional approaches address this problem based on various assumptions (Zhang et al., 2022; Yang et al., 2020b) or hand-crafted features (Karaali & Jung, 2017), which are incapable of generating faithful results in real-world scenarios.

Recently, deep neural networks have driven the rapid development of image restoration and obtained favorable performance compared to conventional methods. A flurry of convolutional neural network (CNN) based methods have been developed for diverse image restoration tasks by inventing or borrowing advanced modules, including dilated convolution (Luo et al., 2022; Zou et al., 2021), U-Net (Ronneberger et al., 2015), residual learning (Zhang et al., 2017), multi-stage pipeline (Zhang et al., 2019b), and attention mechanisms (Liu et al., 2019). However, with convolution units, these methods have limited receptive fields, and thus they are not capable of capturing long-range dependencies. This capability is essential for restoration tasks, since a single pixel needs information from its surrounding region to be recovered. More recently, many researchers have tailored the Transformer (Vaswani et al., 2017) to image restoration tasks, such as motion deblurring (Tsai et al., 2022), dehazing (Guo et al., 2022; Song et al., 2022), and desnowing (Chen et al., 2022b; c). Nonetheless, the above-mentioned methods mainly conduct restoration in the spatial domain and do not sufficiently leverage frequency discrepancies between sharp/degraded image pairs. To this end, a few works utilize transformation tools, e.g., wavelet transform or Fourier transform, to

