FREQUENCY REGULARIZED DEEP CONVOLUTIONAL DICTIONARY LEARNING AND APPLICATION TO BLIND DENOISING

Abstract

Sparse representation via a learned dictionary is a powerful prior for natural images. In recent years, unrolled sparse coding algorithms (e.g. LISTA) have proven to be useful for constructing interpretable deep-learning networks that perform on par with state-of-the-art models on image-restoration tasks. In this study we are concerned with extending the work on such convolutional dictionary learning (CDL) models. We propose to construct strided convolutional dictionaries with a single analytic low-pass filter and a set of learned filters regularized to occupy the complementary frequency space. By doing so, we address the necessary modeling assumptions of natural images with respect to convolutional sparse coding and reduce the mutual coherence and redundancy of the learned filters. We show improved denoising performance at reduced computational complexity when compared to other CDL methods, and competitive results when compared to popular deep-learning models. We further propose to parameterize the thresholds in the soft-thresholding operator of LISTA to be proportional to the noise variance estimated from the input image. We demonstrate that this parameterization enhances robustness to noise-level mismatch between training and inference.

1. INTRODUCTION

Sparsity in a transform domain is an important and widely applicable property of natural images. This property can be exploited in a variety of tasks such as signal representation, feature extraction, and image processing. For instance, consider restoring an image from a degraded version (noisy, blurry, or missing pixels). These inverse problems are generally ill-posed and require utilizing adequate prior knowledge, for which sparsity has proven extremely effective (Mairal et al., 2014). In recent years, such problems have been tackled with deep neural network architectures that achieve superior performance but are not well-understood in terms of their building blocks. In this study, we are interested in utilizing the knowledge from classical signal processing and sparse coding literature to introduce a learned framework which is interpretable and can perform on par with state-of-the-art deep-learning methods. We choose to explore this method under the task of natural image denoising, in line with much of the recent literature (Sreter & Giryes, 2018; Simon & Elad, 2019; Lecouat et al., 2020). As a benefit of this interpretability, we are able to extend the framework to a blind-denoising setting using ideas from signal processing.

In sparse representation we seek to approximate a signal as a linear combination of a few vectors from a set of vectors (usually called dictionary atoms). Olshausen & Field (1996), following a neuroscientific perspective, proposed to adapt the dictionary to a set of training data. Later, dictionary learning combined with sparse coding was investigated in numerous applications (Mairal et al., 2009a; Protter & Elad, 2008). More specifically, for a set of N image patches (reshaped into column vectors)

$$X = [x_1, \cdots, x_N] \in \mathbb{R}^{m \times N}, \quad (1)$$



we seek to find the dictionary $D^* \in \mathbb{R}^{m \times k}$ and the sparse representation $Z^* = [z^*_1, \cdots, z^*_N] \in \mathbb{R}^{k \times N}$ such that

$$D^*, Z^* = \arg\min_{D, Z} \sum_{i=1}^{N} \|z_i\|_0 \quad \text{subject to: } Dz_i = x_i, \ \forall i = 1, \cdots, N.$$
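Since the $\ell_0$-constrained problem above is NP-hard, it is commonly relaxed to an $\ell_1$-penalized least-squares problem, which can be solved by the iterative soft-thresholding algorithm (ISTA) whose unrolling underlies LISTA. The following is a minimal sketch of ISTA sparse coding against a fixed dictionary; the function names, the toy dictionary, and the choice of penalty weight are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm: shrinks entries toward zero,
    # setting those with magnitude below t exactly to zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam, n_iter=300):
    """Sparse-code x against dictionary D by solving
    min_z 0.5*||D z - x||^2 + lam*||z||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2     # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)      # gradient of the data-fidelity term
        z = soft_threshold(z - grad / L, lam / L)
    return z

# Toy example (hypothetical data): x is a sparse combination of two
# unit-norm atoms plus a small amount of noise.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)        # normalize atoms to unit norm
z_true = np.zeros(128)
z_true[3], z_true[40] = 2.0, -1.5
x = D @ z_true + 0.01 * rng.standard_normal(64)
z_hat = ista(D, x, lam=0.1)
```

LISTA replaces the fixed matrices $D^T$ and thresholds $\lambda / L$ in this loop with learned parameters and truncates the iteration to a small fixed depth, which is what makes the unrolled network both fast and interpretable.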

