FUNDAMENTAL LIMITS ON THE ROBUSTNESS OF IMAGE CLASSIFIERS

Abstract

We prove that image classifiers are fundamentally sensitive to small perturbations in their inputs. Specifically, we show that given some image space of n-by-n images, all but a tiny fraction of images in any image class induced over that space can be moved outside that class by adding some perturbation whose p-norm is O(n^{1/max(p,1)}), as long as that image class takes up at most half of the image space. We then show that O(n^{1/max(p,1)}) is asymptotically optimal. Finally, we show that an increase in the bit depth of the image space leads to a loss in robustness. We supplement our results with a discussion of their implications for vision systems.

1. INTRODUCTION

Image classification, the task of partitioning images into various classes, is a classical problem in computer science with countless practical applications. Progress on this problem has advanced by leaps and bounds since the advent of deep learning, with modern image classifiers attaining some incredible results (Beyer et al., 2020). However, it has been observed that image classes tend to be brittle: classifiers tend to partition images so that most images lie very close to images of different classes (Szegedy et al., 2013). Although usually studied in computer vision systems, such phenomena also appear to manifest in natural vision systems (Elsayed et al., 2018; Zhou & Firestone, 2019). Given these observations, it is natural to ask the following question: is the brittleness of image classes a result of classifier construction, or does it arise from some fundamental property of image spaces? Previous work demonstrates that classifiers can be made more robust as a function of how they are constructed, and attempts to improve the robustness of existing computer vision systems through such means are an active area of research (Moosavi-Dezfooli et al., 2016; Madry et al., 2017; Ma et al., 2018; Tramer et al., 2020; Machado et al., 2021). However, there is also a fundamental limit to the robustness achievable by any classifier that arises as a consequence of the geometry of image spaces.

In this work we show that this fundamental limit of achievable robustness is surprisingly low. Roughly speaking, in most cases it suffices to change the contents of only a few columns of pixels in an image to change its class. Even smaller changes suffice when measured using other metrics, such as the Euclidean metric. Our results are a consequence of the geometry of image spaces, and so they apply regardless of the architecture of the classifier.
This suggests that there is an inherent brittleness in the semantic content of images, and that robustness as an objective is only desirable with respect to distributions that are concentrated over small subsets of the image space.
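To make the scale of these perturbations concrete, the following sketch (our illustration, not code from the paper) compares the 2-norm of a perturbation confined to a single pixel column of an n-by-n image, which grows like n^{1/2}, against the typical 2-norm distance between two uniformly random images, which grows like n. The ratio between the two shrinks as n grows:

```python
# Illustration (not from the paper): a perturbation confined to one pixel
# column has 2-norm ~255 * sqrt(n), while two random n-by-n images are at
# typical 2-norm distance ~c * n, so the ratio vanishes as n grows.
import numpy as np

rng = np.random.default_rng(0)

def pnorm(x, p):
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

ratios = []
p = 2.0
for n in (32, 128, 512):
    a = rng.integers(0, 256, size=(n, n)).astype(float)
    b = rng.integers(0, 256, size=(n, n)).astype(float)
    delta = np.zeros((n, n))
    delta[:, 0] = 255.0  # maximal change to a single pixel column
    ratios.append(pnorm(delta, p) / pnorm(a - b, p))

print([round(r, 3) for r in ratios])  # ratio roughly halves each time n quadruples
```

The same comparison for general p gives n^{1/max(p,1)} against n^{2/max(p,1)}, matching the gap discussed above.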

1.1. OUR CONTRIBUTIONS AND RELATED WORK

The observation that image classes tend to be brittle was popularized by Szegedy et al. (2013), who observed that tiny perturbations suffice to change the image class of many images. This has since opened up a rich field of research on how the brittleness of image classes arises from specific classifier formulations or training distributions (Goodfellow et al., 2014; Gilmer et al., 2018; Tsipras et al., 2018)[0]. While these advances offer insights into the deficiencies of our current methodologies, their analyses ultimately depend on some aspect of the architecture or training distribution, and so do not rule out the possible existence of ideal classifiers that do not induce brittle image classes. By contrast, our work provides a non-trivial upper bound on the robustness of any image class[1]. Specifically:

• We prove that most images in any image class consisting of at most half the images in an image space of n-by-n images can be moved into a different class by adding a perturbation whose p-norm is O(n^{1/max(p,1)}). This is vanishingly small relative to the average distances in the image space, which are O(n^{2/max(p,1)}), and therefore provides a non-trivial upper bound on the robustness attainable by even an ideal classifier.

• We show that there exist image classes where most images cannot be moved into a different class by any perturbation whose p-norm is o(n^{1/max(p,1)}) (note the small-o notation[2]). The bound we derive is therefore asymptotically optimal in n, so proving stronger robustness bounds will require examining classifier-specific properties.

• We show that discretization through lowering the bit depth of the image space permits the existence of more robust image classes. This lends theoretical backing to the idea of using discretization as a method of defending against adversarial attacks (Panda et al., 2019).

• We demonstrate that brittle features in images can deliver semantic content.
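One intuition behind the bit-depth result can be sketched as follows (a simple illustration in the spirit of discretization defenses such as Panda et al. (2019), not the paper's construction): quantizing pixel values to a coarser grid erases any perturbation whose magnitude stays below half the quantization step.

```python
# Illustration (not the paper's construction): at bit depth b, pixel values
# live on a grid with step 256 / 2^b, and re-quantization removes any
# perturbation whose entries are smaller than half that step.
import numpy as np

def quantize(img, bits):
    """Round pixel values to the nearest point on a `bits`-bit grid."""
    step = 256 // (2 ** bits)
    return np.round(img / step) * step

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(8, 8))
base = quantize(img, 3)  # 3-bit image: grid step of 32

# Perturbation with entries strictly below half the grid step (|e| <= 15 < 16).
noise = rng.integers(-15, 16, size=img.shape)
perturbed = base + noise

# Re-quantization snaps every pixel back to its original grid point.
print(np.array_equal(quantize(perturbed, 3), base))  # prints: True
```

Lower bit depth means a larger grid step, so a larger ball of perturbations is wiped out, which is consistent with coarser discretization permitting more robust classes.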
We argue that a deeper understanding of such features can lead to advances in aligning human and computer vision systems.

To our knowledge, two previous works investigate upper bounds on robustness that arise from the geometry of image spaces. The first is Fawzi et al. (2018a), which provides an upper bound on the probability that an image drawn from a given distribution is far from images of a different class, supplemented with numerical experimental analyses of their bounds. However, our analysis differs from and improves on theirs in a few key aspects. Firstly, they only analyze the case where distance is measured using the 2-norm, while we provide bounds for p-norms for any p. Secondly, they do not account for the discrete nature of image spaces with finite bit depth, which allows for classifiers that are more robust than their bounds imply[3]. Finally, their bound is parametrized by a modulus of continuity that differs depending on the image distribution, potentially resulting in trivial bounds for certain distributions; furthermore, this parameter cannot be computed exactly, so in application their bound is inexact. By contrast, we formulate our results independently of specific image distributions[4]. Our bounds can therefore be computed exactly and unconditionally, and we are able to show the asymptotic optimality of our result.

The second work is Diochnos et al. (2018), which investigates partitions of bit vectors. Since bit vectors can be used to encode discrete inputs, their results can be viewed as results about classifiers over discrete inputs. They also view each bit vector as equally weighted, so their results are not parametrized by data distributions and are unambiguous. They show that, given a finite probability of misclassification, an arbitrarily high proportion of vectors can be turned into misclassified vectors through a small number of bit flips proportional to the square root of the vector dimension. This result has been generalized in follow-up work (Mahloujifar et al., 2019), which showed that a number of modifications proportional to the square root of the data dimension suffices to induce misclassification in the more general setting of Lévy families as well. However, these results still depend on the existence of a finite fraction of misclassified datapoints, and therefore do not preclude the existence of asymptotically infinitesimal image classes that are robust, something which
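The square-root scaling can be simulated with a toy majority-vote labeling of bit vectors (a hypothetical classifier chosen purely for illustration, not the construction of Diochnos et al.): the number of bit flips needed to change a typical vector's label grows roughly like the square root of the dimension d.

```python
# Illustration (hypothetical majority-vote labeling, not Diochnos et al.'s
# construction): label each bit vector by its majority bit; the flips needed
# to change the label equal the popcount's distance from d/2, which
# concentrates at the O(sqrt(d)) scale.
import numpy as np

rng = np.random.default_rng(2)

def median_flips_to_change_label(d, trials=2000):
    x = rng.integers(0, 2, size=(trials, d))
    margin = np.abs(x.sum(axis=1) - d / 2)  # flips needed to cross the boundary
    return float(np.median(margin))

results = {d: median_flips_to_change_label(d) for d in (100, 400, 1600)}
for d, m in results.items():
    print(d, m)  # median flip count roughly doubles as d quadruples
```

Quadrupling d roughly doubles the median flip count, the sqrt(d) behavior discussed above; note this toy labeling says nothing about the small, robust classes our work addresses.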



Footnotes

[0] Goodfellow et al. (2014) demonstrate how linearity can allow imperceptible changes across a large number of dimensions to accumulate into something significant, Gilmer et al. (2018) derive a relation between classification error rate and the distance to the closest misclassification on a specific dataset of concentric spheres, and Tsipras et al. (2018) derive a fundamental tradeoff between accuracy and robustness.

[1] Consisting of at most half the images in the image space.

[2] f(n) ∈ O(g(n)) ⇐⇒ lim sup_{n→∞} f(n)/g(n) < ∞, and f(n) ∈ o(g(n)) ⇐⇒ lim_{n→∞} f(n)/g(n) = 0.

[3] Other work, such as that of Diochnos et al. (2018), does analyze discrete input spaces, but does not investigate the relation between discretization and robustness.

[4] It is still possible to adapt our result to account for image distributions using their techniques. See our discussion for additional details.

