DEEP GENERATIVE MODEL-BASED RATE-DISTORTION FOR IMAGE DOWNSCALING ASSESSMENT

Anonymous authors
Paper under double-blind review

Abstract

In this paper, we propose a novel measure, namely Image Downscaling Assessment by Rate-Distortion (IDA-RD), to quantitatively evaluate image downscaling algorithms. In contrast to image-based methods that measure the quality of downscaled images, ours is a process-based measure that draws on rate-distortion theory to quantify the distortion incurred during downscaling. Our main idea is that downscaling and super-resolution (SR) can be viewed as the encoding and decoding processes in the rate-distortion model, respectively, and that a downscaling algorithm that preserves more details in the resulting low-resolution (LR) images should lead to less distorted high-resolution (HR) images in SR. In other words, the distortion should increase as the downscaling algorithm deteriorates. However, it is non-trivial to measure this distortion, as it requires the SR algorithm to be both blind and stochastic. Our key insight is that these requirements can be met by recent SR algorithms based on deep generative models, which can find all matching HR images for a given LR image on their learned image manifolds. Empirically, we first validate our IDA-RD measure with synthetic downscaling algorithms, which simulate distortions by adding various types and levels of degradation to the downscaled images. We then test our measure on traditional downscaling algorithms such as bicubic, bilinear, and nearest-neighbor interpolation, as well as state-of-the-art downscaling algorithms such as DPID (Weber et al., 2016), L0-regularized downscaling (Liu et al., 2017), and Perceptual downscaling (Oeztireli & Gross, 2015). Experimental results show the effectiveness of our IDA-RD measure in evaluating image downscaling algorithms.
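The evaluation loop described above can be sketched as follows. This is a toy illustration only: `toy_stochastic_sr` is a hypothetical stand-in for the deep generative SR models the paper relies on, and mean squared error is used here merely as an example distortion; neither is prescribed by the paper.

```python
import numpy as np

def toy_stochastic_sr(lr, factor=2, n_samples=8, seed=None):
    """Stand-in for a stochastic SR model (an assumption for illustration):
    nearest-neighbour upsampling plus small random perturbations stands in
    for drawing multiple plausible HR reconstructions from a learned
    image manifold."""
    rng = np.random.default_rng(seed)
    hr = np.repeat(np.repeat(lr, factor, axis=0), factor, axis=1)
    return [hr + 0.01 * rng.standard_normal(hr.shape) for _ in range(n_samples)]

def ida_rd_distortion(hr, downscale, sr_sampler):
    """Rate-distortion style score: downscale the HR image (encode),
    draw several SR reconstructions (decode), and report the mean
    squared error against the original HR image. A worse downscaler
    discards more information, so reconstructions drift further from
    the original and this score grows."""
    lr = downscale(hr)
    samples = sr_sampler(lr)
    return float(np.mean([np.mean((s - hr) ** 2) for s in samples]))
```

For instance, passing a naive subsampling operator such as `lambda im: im[::2, ::2]` as `downscale` yields a finite, positive distortion that can be compared across downscaling algorithms.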

1. INTRODUCTION

Image downscaling is a fundamental problem in image processing and computer vision. The co-existence of digital devices with diverse display resolutions, such as smartphones, tablets, and desktop monitors, makes this problem even more important. In contrast to image super-resolution (SR), which aims to "add" information to low-resolution (LR) images, image downscaling algorithms focus on "preserving" information present in the high-resolution (HR) images, which is particularly important for applications and devices with very limited screen space.

Traditional image downscaling algorithms low-pass filter an image before resampling it. While this prevents aliasing in the downscaled LR image, important high-frequency details of the HR image are removed at the same time, resulting in a blurred or overly smooth LR image. To improve the quality of downscaled images, several sophisticated approaches have been proposed recently, including remapping of high-frequency information (Gastal & Oliveira, 2017), optimization of perceptual image quality metrics (Oeztireli & Gross, 2015), L0-regularized priors (Liu et al., 2017), and pixelizing the HR image (Gerstner et al., 2012; Han et al., 2018; Kuang et al., 2021; Shang & Wong, 2021). Nevertheless, research on image downscaling algorithms has slowed down significantly due to the lack of a quantitative measure to evaluate them. Specifically, standard distance measures (e.g., L1 and L2 norms) and full-reference image quality assessment (IQA) methods are not applicable here due to the absence of ground-truth LR images; existing no-reference IQA (NR-IQA) metrics (Mittal et al., 2012a;b; Bosse et al., 2017) cannot be applied either, as they rely on the "naturalness" of HR images, which is not present in LR images (we verify this in our experiments).
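The traditional filter-then-resample pipeline mentioned above can be illustrated with a minimal sketch. The Gaussian prefilter and the heuristic `sigma = factor / 2` are assumptions for illustration, not values taken from any of the cited downscaling methods:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def downscale_naive(img, factor=2):
    # Subsample without prefiltering: fast, but high frequencies
    # alias into the LR image.
    return img[::factor, ::factor]

def downscale_antialiased(img, factor=2, sigma=None):
    # Classical pipeline: low-pass filter first, then subsample.
    # sigma ~ factor / 2 is a common heuristic (an assumption here).
    # The blur removes frequencies above the new Nyquist limit, which
    # prevents aliasing but also discards fine HR detail.
    if sigma is None:
        sigma = factor / 2.0
    return gaussian_filter(img, sigma=sigma)[::factor, ::factor]
```

Comparing the two outputs on a noisy image shows the trade-off the paragraph describes: the prefiltered result has far less high-frequency energy, i.e., it is alias-free but also smoother.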

