LOCALIZED RANDOMIZED SMOOTHING FOR COLLECTIVE ROBUSTNESS CERTIFICATION

Abstract

Models for image segmentation, node classification and many other tasks map a single input to multiple labels. By perturbing this single shared input (e.g. the image) an adversary can manipulate several predictions (e.g. misclassify several pixels). Collective robustness certification is the task of provably bounding the number of robust predictions under this threat model. The only dedicated method that goes beyond certifying each output independently is limited to strictly local models, where each prediction is associated with a small receptive field. We propose a more general collective robustness certificate for all types of models. We further show that this approach is beneficial for the larger class of softly local models, where each output is dependent on the entire input but assigns different levels of importance to different input regions (e.g. based on their proximity in the image). The certificate is based on our novel localized randomized smoothing approach, where the random perturbation strength for different input regions is proportional to their importance for the outputs. Localized smoothing Pareto-dominates existing certificates on both image segmentation and node classification tasks, simultaneously offering higher accuracy and stronger certificates.

1. INTRODUCTION

There is a wide range of tasks that require models that make multiple predictions based on a single input. For example, semantic segmentation requires assigning a label to each pixel in an image. When deploying such multi-output classifiers in practice, their robustness should be a key concern. After all, just like simple classifiers (Szegedy et al., 2014), they can fall victim to adversarial attacks (Xie et al., 2017; Zügner & Günnemann, 2019; Belinkov & Bisk, 2018). Even without an adversary, random noise or measuring errors can cause predictions to unexpectedly change. We propose a novel method providing provable guarantees on how many predictions can be changed by an adversary. As all outputs operate on the same input, they have to be attacked simultaneously by choosing a single perturbed input, which can be more challenging for an adversary than attacking them independently. We must account for this to obtain a proper collective robustness certificate. The only dedicated collective certificate that goes beyond certifying each output independently (Schuchardt et al., 2021) is only beneficial for models we call strictly local, where each output depends on a small, pre-defined subset of the input. Multi-output classifiers, however, are often only softly local. While all their predictions are in principle dependent on the entire input, each output may assign different importance to different subsets. For example, convolutional networks for image segmentation can have small effective receptive fields (Luo et al., 2016; Liu et al., 2018), i.e. they primarily use a small region of the image in labeling each pixel. Many models for node classification are based on the homophily assumption that connected nodes are mostly of the same class. Thus, they primarily use features from neighboring nodes.
Transformers, which can in principle attend to arbitrary parts of the input, may in practice learn "sparse" attention maps, with the prediction for each token being mostly determined by a few (not necessarily nearby) tokens (Shi et al., 2021).

Figure 1: Localized randomized smoothing applied to semantic segmentation. We assume that the most relevant information for labeling a pixel is contained in other nearby pixels. We partition the input image into multiple grid cells. For each grid cell, we sample noisy images from a different anisotropic distribution that applies more noise to far-away, less relevant cells. Segmenting all noisy images, cropping the result and computing the majority vote yields a local segmentation mask. These per-cell segmentation masks can then be combined into a complete segmentation mask.

Softly local models pose a budget allocation problem for an adversary that tries to simultaneously manipulate multiple predictions by crafting a single perturbed input. When each output is primarily focused on a different part of the input, the attacker has to distribute their limited adversarial budget and may be unable to attack all predictions at once. We propose localized randomized smoothing, a novel method for the collective robustness certification of softly local models that exploits this budget allocation problem. It is an extension of randomized smoothing (Lécuyer et al., 2019; Li et al., 2019; Cohen et al., 2019), a versatile black-box certification method which is based on constructing a smoothed classifier that returns the expected prediction of a model under random perturbations of its input (more details in § 2). Randomized smoothing is typically applied to single-output models with isotropic Gaussian noise. In localized smoothing, however, we smooth each output (or set of outputs) of a multi-output classifier using a different, anisotropic distribution. This is illustrated in Fig. 1, where the predicted segmentation masks for each grid cell are smoothed using a different distribution. For instance, the distribution for segmenting the top-right cell applies less noise to the top-right cell, while the smoothing distribution for segmenting the bottom-left cell applies significantly more noise to the top-right cell. Given a specific output of a softly local model, using a low noise level for the most relevant parts of the input lets us preserve a high prediction quality. Less relevant parts can be smoothed with a higher noise level to guarantee more robustness. The resulting certificates (one per output) explicitly quantify how robust each prediction is to perturbations of which part of the input. This information about the smoothed model's locality can then be used to combine the per-prediction certificates into a stronger collective certificate that accounts for the adversary's budget allocation problem.[1]

Our core contributions are:
• Localized randomized smoothing, a novel smoothing scheme for multi-output classifiers.
• An efficient anisotropic randomized smoothing certificate for discrete data.
• A collective certificate based on localized randomized smoothing.
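The grid-based anisotropic noise of Fig. 1 can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the Chebyshev distance and the linear interpolation between a minimum and maximum noise level are all illustrative choices.

```python
import numpy as np


def cell_noise_stds(grid_shape, target_cell, sigma_min, sigma_max):
    """Per-cell noise standard deviations for smoothing one grid cell's outputs.

    Noise grows with the (Chebyshev) distance from the target cell, so that
    far-away, presumably less relevant regions are perturbed more strongly.
    """
    rows, cols = np.indices(grid_shape)
    dist = np.maximum(np.abs(rows - target_cell[0]), np.abs(cols - target_cell[1]))
    max_dist = max(dist.max(), 1)  # avoid division by zero for a 1x1 grid
    return sigma_min + (sigma_max - sigma_min) * dist / max_dist


def sample_localized_noise(image, grid, target_cell, sigma_min=0.1,
                           sigma_max=1.0, rng=None):
    """Draw one anisotropic Gaussian noise sample for `image`.

    `grid` gives the number of (rows, cols) of grid cells; every pixel receives
    Gaussian noise with the standard deviation of its cell. Assumes the image
    dimensions are divisible by the grid dimensions.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    stds = cell_noise_stds(grid, target_cell, sigma_min, sigma_max)
    # Upsample the per-cell stds to a per-pixel noise scale.
    pixel_std = np.kron(stds, np.ones((h // grid[0], w // grid[1])))
    if image.ndim == 3:
        pixel_std = pixel_std[..., None]  # broadcast over color channels
    return image + rng.normal(0.0, 1.0, image.shape) * pixel_std
```

Segmenting many such samples per target cell, cropping the target cell from each result and taking per-pixel majority votes would then yield the local segmentation masks of Fig. 1.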

2. BACKGROUND AND RELATED WORK

Randomized smoothing. Randomized smoothing is a certification technique that can be used for various threat models and tasks. For the sake of exposition, let us discuss a certificate for l2 perturbations (Cohen et al., 2019). Assume we have a D-dimensional input space R^D, label set Y and classifier g : R^D → Y. We can use isotropic Gaussian noise to construct the smoothed classifier f(x) = argmax_{y ∈ Y} Pr_{z ∼ N(x, σ²I)}[g(z) = y] that returns the most likely prediction of base classifier g under the input distribution.[2] Given an input x ∈ R^D and smoothed prediction y = f(x), we can then easily determine whether y is robust to all l2 perturbations of magnitude ϵ, i.e. whether ∀x′ with ||x′ - x||_2 ≤ ϵ : f(x′) = y. Let q = Pr_{z ∼ N(x, σ²I)}[g(z) = y] be the probability of predicting label y. The prediction is certifiably robust if ϵ < σ · Φ⁻¹(q), where Φ⁻¹ is the quantile function of the standard normal distribution (Cohen et al., 2019).
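This smoothing-and-certification procedure can be sketched with the standard library alone. The function name is illustrative, and the sketch uses the empirical majority-class frequency as a point estimate of q; a sound certificate would replace it with a high-confidence lower bound.

```python
import random
import statistics


def smoothed_predict_and_certify(base_classifier, x, sigma, n_samples=1000,
                                 rng=None):
    """Sketch of Gaussian randomized smoothing (Cohen et al., 2019).

    Returns the majority-vote prediction y and the naive certified l2 radius
    sigma * Phi^{-1}(q), where q is the empirical probability of the majority
    class under N(x, sigma^2 * I).
    """
    rng = rng or random.Random(0)
    votes = {}
    for _ in range(n_samples):
        # Sample z ~ N(x, sigma^2 * I) and record the base classifier's vote.
        z = [xi + rng.gauss(0.0, sigma) for xi in x]
        y = base_classifier(z)
        votes[y] = votes.get(y, 0) + 1
    y_hat, count = max(votes.items(), key=lambda kv: kv[1])
    # Clamp q away from 1 so the normal quantile function stays finite.
    q = min(count / n_samples, 1.0 - 1.0 / (2 * n_samples))
    # No robustness is certified unless the majority class is predicted
    # with probability greater than 1/2.
    radius = sigma * statistics.NormalDist().inv_cdf(q) if q > 0.5 else 0.0
    return y_hat, radius
```

For a toy base classifier such as z ↦ 1 if Σ z_i > 0 else 0, evaluated far from its decision boundary, nearly all samples agree and the certified radius approaches σ · Φ⁻¹(1 - 1/(2n)).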



[1] An implementation will be made available at https://www.cs.cit.tum.de/daml/localized-smoothing.
[2] In practice, all probabilities have to be estimated using Monte Carlo sampling (see discussion in § G).
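Making such Monte Carlo estimates rigorous requires a high-confidence lower bound on the estimated probability. A stdlib-only sketch via Hoeffding's inequality follows; the function name is illustrative, and practical randomized-smoothing implementations typically use the tighter Clopper-Pearson interval instead.

```python
import math


def mc_lower_bound(successes, n_samples, alpha=0.001):
    """Lower-bound a Monte Carlo probability estimate via Hoeffding's inequality.

    With probability at least 1 - alpha over the sampling, the true probability
    q satisfies q >= successes / n_samples - sqrt(ln(1/alpha) / (2 * n_samples)).
    """
    return max(0.0, successes / n_samples
               - math.sqrt(math.log(1.0 / alpha) / (2 * n_samples)))
```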

