TIGHT SECOND-ORDER CERTIFICATES FOR RANDOMIZED SMOOTHING

Abstract

Randomized smoothing is a popular way of providing robustness guarantees against adversarial attacks: randomly-smoothed functions have a universal Lipschitz-like bound, allowing for robustness certificates to be easily computed. In this work, we show that there also exists a universal curvature-like bound for Gaussian random smoothing: given the exact value and gradient of a smoothed function, we compute a lower bound on the distance of a point to its closest adversarial example, called the Second-order Smoothing (SoS) robustness certificate. In addition to proving the correctness of this novel certificate, we show that SoS certificates are realizable and therefore tight. Interestingly, we show that the maximum achievable benefits, in terms of certified robustness, from using the additional information of the gradient norm are relatively small: because our bounds are tight, this is a fundamental negative result. The gain of SoS certificates further diminishes if we consider the estimation error of the gradient norms, for which we have developed an estimator. We therefore additionally develop a variant of Gaussian smoothing, called Gaussian dipole smoothing, which provides similar bounds to randomized smoothing with gradient information, but with much-improved sample efficiency. This allows us to achieve (marginally) improved robustness certificates on high-dimensional datasets such as CIFAR-10 and ImageNet. Code is available at https://github.com/alevine0/smoothing_second_order.

1. INTRODUCTION

A topic of much recent interest in machine learning has been the design of deep classifiers with provable robustness guarantees. In particular, for an m-class classifier h : R^d → [m], the L_2 certification problem for an input x is to find a radius ρ such that, for all δ with ||δ||_2 < ρ, h(x) = h(x + δ). This robustness certificate serves as a lower bound on the magnitude of any adversarial perturbation of the input that can change the classification: therefore, the certificate is a security guarantee against adversarial attacks. There are many approaches to the certification problem, including exact methods, which compute the precise norm to the decision boundary (Tjeng et al., 2019; Carlini et al., 2017; Huang et al., 2017), as well as methods for which the certificate ρ is merely a lower bound on the distance to the decision boundary (Wong & Kolter, 2018; Gowal et al., 2018; Raghunathan et al., 2018).
One approach that belongs to the latter category is Lipschitz function approximation. Recall that a function f : R^d → R is L-Lipschitz if, for all x, x', |f(x) - f(x')| ≤ L ||x - x'||_2. If a classifier is known to be a Lipschitz function, this immediately implies a robustness certificate. In particular, consider binary classification for simplicity, where we use an L-Lipschitz function f as a classifier, using the sign of f(x) as the classification. Then for any input x, we are assured that the classification (i.e., the sign) will remain constant for all x' within a radius |f(x)|/L of x. Numerous methods for training Lipschitz neural networks with small, known Lipschitz constants have been proposed (Fazlyab et al., 2019; Zhang et al., 2019; Anil et al., 2019; Li et al., 2019b). It is desirable that the network be as expressive as possible, while still maintaining the desired Lipschitz property. Anil et al. (2019) in particular demonstrate that their proposed method can universally approximate Lipschitz functions, given sufficient network complexity.
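As a concrete illustration (not taken from the paper), the Lipschitz-based certificate described above amounts to a one-line computation. For comparison, the sketch below also includes the well-known Gaussian-smoothing radius of Cohen et al. (2019) in the binary case, σ·Φ⁻¹(p_A), where p_A > 1/2 is the probability that the base classifier returns the top class under Gaussian noise. Function names here are hypothetical:

```python
from statistics import NormalDist

def lipschitz_radius(f_x: float, L: float) -> float:
    """Certified L2 radius for a binary classifier sign(f(x)) with an
    L-Lipschitz f: the sign cannot flip within distance |f(x)| / L."""
    return abs(f_x) / L

def smoothing_radius(p_a: float, sigma: float) -> float:
    """Cohen et al. (2019) binary-case certificate: sigma * Phi^{-1}(p_A),
    where Phi^{-1} is the inverse standard normal CDF and p_A > 1/2 is the
    top-class probability under noise N(0, sigma^2 I)."""
    return sigma * NormalDist().inv_cdf(p_a)

print(lipschitz_radius(0.8, 2.0))   # 0.4
print(smoothing_radius(0.9, 0.5))   # sigma * Phi^{-1}(0.9), roughly 0.64
```

Both certificates have the same shape: a margin (|f(x)| or the confidence p_A) divided or scaled by a smoothness constant, which is exactly the structure that the second-order certificates in this paper refine.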
However, in practice, for the robust certification problem on large-scale input, randomized smoothing (Cohen et al., 2019) is the

