FAST GEOMETRIC PROJECTIONS FOR LOCAL ROBUSTNESS CERTIFICATION

Abstract

Local robustness ensures that a model classifies all inputs within an ℓp-ball consistently, which precludes various forms of adversarial inputs. In this paper, we present a fast procedure for checking local robustness in feed-forward neural networks with piecewise-linear activation functions. Such networks partition the input space into a set of convex polyhedral regions in which the network's behavior is linear; hence, a systematic search for decision boundaries within the regions around a given input is sufficient for assessing robustness. Crucially, we show how the regions around a point can be analyzed using simple geometric projections, thus admitting an efficient, highly-parallel GPU implementation that excels particularly for the ℓ2 norm, where previous work has been less effective. Empirically we find this approach to be far more precise than many approximate verification approaches, while at the same time performing multiple orders of magnitude faster than complete verifiers, and scaling to much deeper networks. An implementation of our proposed algorithm is available on GitHub.

1. INTRODUCTION

We consider the problem of verifying the local robustness of piecewise-linear neural networks for a given ℓp bound. Precisely, given a point, x, network, F, and norm bound, ε, this entails determining whether Equation 1 holds.

∀x′ . ‖x′ − x‖p ≤ ε ⟹ F(x) = F(x′)    (1)

This problem carries practical significance, as such networks have been extensively shown to be vulnerable to adversarial examples (Papernot et al., 2016; Szegedy et al., 2014), wherein small-norm perturbations are chosen to cause arbitrary misclassifications. Numerous solutions have been proposed to address variants of this problem. These can be roughly categorized into three groups: learning rules that aim for robustness on known training data (Croce et al., 2019; Madry et al., 2018; Wong & Kolter, 2018; Zhang et al., 2019; Xiao et al., 2019), post-processing methods that provide stochastic guarantees at inference time (Cohen et al., 2019; Lecuyer et al., 2018), and network verification (Balunovic et al., 2019; Cheng et al., 2017; Dutta et al., 2018; Ehlers, 2017; Fischetti & Jo, 2018; Gowal et al., 2018; Jordan et al., 2019; Katz et al., 2017; 2019; Singh et al., 2019b; Tjeng & Tedrake, 2017; Wang et al., 2018; Weng et al., 2018). We focus on the problem of network verification-for a given model and input, determining if Equation 1 holds-particularly for the ℓ2 norm. Historically, the literature has primarily concentrated on the ℓ∞ norm, with relatively little work on the ℓ2 norm; indeed, many of the best-scaling verification tools do not even support verification with respect to the ℓ2 norm. Nonetheless, the ℓ2 norm remains important to consider for "imperceptible" adversarial examples (Rony et al., 2019). Furthermore, compared to the ℓ∞ norm, efficient verification for the ℓ2 norm presents a particular challenge, as constraint-solving (commonly used in verification tools) in Euclidean space requires a non-linear objective function, and cannot make as effective use of interval-bound propagation.
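To make the property in Equation 1 concrete, the following sketch checks the condition for a single candidate perturbation against a toy linear classifier. The classifier F, its weights, and the test points are all hypothetical illustrations; note that a true verifier must establish the property for every point in the ε-ball, so sampling like this can only falsify robustness, never certify it.

```python
import numpy as np

def is_consistent(F, x, x_prime, p, eps):
    """Check the condition of Equation 1 for one candidate x'.

    Returns True when x' lies outside the eps-ball (the implication
    holds vacuously) or when F assigns both points the same class.
    """
    if np.linalg.norm(x_prime - x, ord=p) > eps:
        return True
    return F(x) == F(x_prime)

# Toy linear classifier (hypothetical): class 1 iff w.x + b > 0.
w, b = np.array([1.0, -1.0]), 0.0
F = lambda z: int(w @ z + b > 0)

x = np.array([2.0, 0.0])  # F(x) = 1
# A nearby point with the same class: condition holds.
assert is_consistent(F, x, np.array([2.1, 0.1]), p=2, eps=0.5)
# A point inside the ball that crosses the decision boundary:
# the condition fails, witnessing non-robustness at radius 3.
assert not is_consistent(F, x, np.array([0.0, 1.9]), p=2, eps=3.0)
```

The gap between this falsification check and exhaustive certification is exactly what the region-by-region boundary search described below is designed to close.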
Existing work on verifying local robustness for the ℓ2 norm falls into two primary categories: (1) expensive, but exact decision procedures, e.g., GeoCert (Jordan et al., 2019) and MIP (Tjeng & Tedrake, 2017), or (2) fast, but approximate techniques, e.g., FastLin/CROWN (Weng et al., 2018; Zhang et al., 2018). While approximate verification methods have shown promise in scaling to larger networks, they may introduce an additional penalty to robust accuracy by flagging non-adversarial points, thus limiting their application in practice. Exact methods impose no such penalty, but as they rely on expensive constraint-solving techniques, they often do not scale well even to networks with a few hundred neurons. In this paper, we focus on bridging the gap between these two approaches. In particular, we present a verification technique for Equation 1 that relies neither on expensive constraint solving nor on conservative over-approximation of the decision boundaries. Our algorithm (Section 2) leverages simple projections, rather than constraint solving, to exhaustively search the model's decision boundaries around a point. The performance benefits of this approach are substantial, especially in the case of ℓ2 robustness, where constraint solving is particularly expensive while Euclidean projections can be efficiently computed using the dot product and accelerated on GPU hardware. However, our approach is also applicable to other norms, including ℓ∞ (Section 3.3). Our algorithm is embarrassingly parallel, and straightforward to implement with facilities for batching that are available in many popular ML libraries. Additionally, we show how the algorithm can be easily modified to find certified lower bounds for ε, rather than verifying a given fixed value (Section 2.3).
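The projection primitive mentioned above is simple enough to sketch directly. Within a linear region, each activation or decision boundary is a hyperplane {z : w·z + b = 0}, and the ℓ2 distance from a point to it is |w·x + b| / ‖w‖2, computable with a dot product and trivially batched on a GPU. The function below is an illustrative stand-alone version, not the paper's implementation; the weight and point values are made up.

```python
import numpy as np

def project_to_hyperplane(x, w, b):
    """Euclidean projection of x onto the hyperplane {z : w.z + b = 0}.

    Returns the closest point on the hyperplane and the l2 distance
    to it. Both reduce to dot products, which is what makes this
    primitive cheap compared to solving a constrained program.
    """
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    offset = (w @ x + b) / (w @ w)   # signed distance in units of w
    proj = x - offset * w            # foot of the perpendicular
    dist = abs(w @ x + b) / np.linalg.norm(w)
    return proj, dist

# Example: project (3, 4) onto the plane y = 0 (w = (0, 1), b = 0).
proj, dist = project_to_hyperplane([3.0, 4.0], [0.0, 1.0], 0.0)
# proj = (3, 0), dist = 4
```

Comparing such distances against ε for every facet of a region gives exactly the "does the ball reach this boundary" test that drives the search, with no solver in the loop.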
Because our algorithm relies exclusively on projections, it may encounter scenarios in which there is evidence to suggest non-robust behavior, but the network's exact boundaries cannot be conclusively determined without accounting for global constraints (Section 2, Figure 1b). In such cases, the algorithm will return unknown (though it would be possible to fall back on constraint solving). However, we prove that if the algorithm terminates with a robust decision, then the model satisfies Equation 1, and likewise if it returns not_robust, then an adversarial example exists (Theorem 1). Note that unlike prior work on approximate verification, our approach can often separate not_robust cases from unknown, providing a concrete adversarial example in the former case. In this sense, the algorithm can be characterized as sound but incomplete, though our experiments show that in practice the algorithm typically comes to a decision. We show that our implementation outperforms existing exact techniques (Jordan et al., 2019; Tjeng & Tedrake, 2017) by multiple orders of magnitude (Section 3.1, Table 2a and Section 3.3), while rarely being inconclusive on instances for which other techniques do not time out. Moreover, we find our approach enables scaling to far deeper models than prior work, a key step toward verification of networks that are used in practice. Additionally, on models that have been regularized for efficient verification (Croce et al., 2019; Xiao et al., 2019), our technique performs even faster, and scales to much larger models, including convolutional networks, than could be verified using similar techniques (Section 3.1, Table 2a). Finally, we compare our work to approximate verification methods (Section 3.2). We find that while our implementation is not as fast as previous work on efficient lower-bound computation for large models (Weng et al., 2018), our certified lower bounds are consistently tighter, and in some cases minimal (Section 3.2, Table 2b).

2. ALGORITHM

In this section we give a high-level overview of our proposed algorithm, and present some implementation heuristics that improve its performance (Section 2.2). We also propose a variant (Section 2.3) to compute certified lower bounds of the robustness radius. Correctness proofs for all of the algorithms discussed in this section are provided in Appendix A. Because our algorithm applies to arbitrary ℓp

