PROVABLE DEFENSE AGAINST GEOMETRIC TRANSFORMATIONS

Abstract

Geometric image transformations that arise in the real world, such as scaling and rotation, have been shown to easily deceive deep neural networks (DNNs). Hence, training DNNs to be certifiably robust to these perturbations is critical. However, no prior work has been able to incorporate the objective of deterministic certified robustness against geometric transformations into the training procedure, as existing verifiers are exceedingly slow. To address these challenges, we propose the first provable defense for deterministic certified geometric robustness. Our framework leverages a novel GPU-optimized verifier that can certify images between 60× and 42,600× faster than existing geometric robustness verifiers, and thus, unlike existing works, is fast enough for use in training. Across multiple datasets, our results show that networks trained via our framework consistently achieve state-of-the-art deterministic certified geometric robustness and clean accuracy. Furthermore, for the first time, we verify the geometric robustness of a neural network for the challenging, real-world setting of autonomous driving.

1. INTRODUCTION

Despite the widespread success of deep neural networks (DNNs), they remain surprisingly susceptible to misclassification when small adversarial changes are applied to correctly classified inputs (Goodfellow et al., 2015; Kurakin et al., 2018). This phenomenon is especially concerning as DNNs are increasingly being deployed in many safety-critical domains, such as autonomous driving (Bojarski et al., 2016; Sitawarin et al., 2018) and medical imaging (Finlayson et al., 2019). As a result, there have been widespread efforts aimed at formally verifying the robustness of DNNs against norm-based adversarial perturbations (Gehr et al., 2018; Singh et al., 2019; Weng et al., 2018; Zhang et al., 2018) and designing novel mechanisms for incorporating feedback from the verifier to train provably robust networks with deterministic guarantees (Gowal et al., 2019; Mirman et al., 2018; Xu et al., 2020; Zhang et al., 2020). However, recent works (Dreossi et al., 2018; Engstrom et al., 2019; Hendrycks & Dietterich, 2019; Kanbak et al., 2018; Liu et al., 2019) have shown that geometric transformations, which capture real-world artifacts like scaling and changes in contrast, can also easily deceive DNNs. No prior work has formulated the construction of a deterministic provable defense needed to ensure DNN safety against geometric transformations. Further, existing deterministic verifiers for geometric perturbations (Balunovic et al., 2019; Mohapatra et al., 2020) are severely limited by their scalability and cannot be used during training for building provable defenses. Probabilistic geometric robustness verifiers (Fischer et al., 2020; Hao et al., 2022; Li et al., 2021) are more scalable but may be inadequate for safety-critical applications like autonomous driving, since they may falsely label an adversarial region as robust. These limitations have prevented the development of deterministic provable defenses against geometric transformations thus far.

Challenges.

Training networks to be certifiably robust against geometric transformations carries multiple challenges that do not arise with norm-based perturbations. First, geometric transformations are much more difficult to formally reason about than ℓ_p perturbations: unlike an ℓ_p-ball, the adversarial region of a geometric transformation is highly nonuniform and cannot be directly represented as a symbolic formula encoding a convex shape. Computing this adversarial input region for geometric perturbations is indeed the main computational bottleneck faced by existing geometric robustness verifiers (Balunovic et al., 2019; Mohapatra et al., 2020), making the overall verification too expensive for use during training. Hence, training DNNs for deterministic certified robustness against geometric perturbations requires not only formulating the construction of a provable defense, but also completely redesigning geometric robustness verifiers for scalability.

This Work.

To address the outlined challenges, we propose Certified Geometric Training (CGT), a framework for training neural networks that are deterministically certified robust to geometric transformations. The framework consists of (1) the Fast Geometric Verifier (FGV), a novel method to perform geometric robustness certification that is orders of magnitude faster than the state-of-the-art, and (2) computationally efficient loss functions that embed FGV into the training procedure. We empirically evaluate our method on the MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009), Tiny ImageNet (Le & Yang, 2015), and Udacity self-driving car (Udacity, 2016) datasets to demonstrate CGT's effectiveness. Our results show that CGT-trained networks consistently achieve state-of-the-art clean accuracy and certified robustness; furthermore, FGV is between 60× and 42,600× faster than the state-of-the-art verifier for certifying each image.
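To make the nonuniform adversarial region concrete, consider a rotation parameterized by an angle interval. The worst-case input set is the union of all rotated images over that interval, which (because of bilinear interpolation) is not a simple ℓ_p-ball around the original image. The following minimal NumPy sketch, which is *not* the paper's FGV algorithm, illustrates per-pixel lower/upper bounds over an angle range by evaluating a fine grid of angles. Note that sampling only under-approximates the true region; a sound verifier must enclose every angle within each subinterval via interval arithmetic.

```python
import numpy as np

def rotate_bilinear(img, theta):
    """Rotate a 2D image by angle theta (radians) about its center,
    using bilinear interpolation with zero padding outside the image."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros_like(img)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    for i in range(h):
        for j in range(w):
            # Inverse-rotate the output coordinate into the input image.
            y = cos_t * (i - cy) - sin_t * (j - cx) + cy
            x = sin_t * (i - cy) + cos_t * (j - cx) + cx
            y0, x0 = int(np.floor(y)), int(np.floor(x))
            dy, dx = y - y0, x - x0
            # Blend the four surrounding input pixels.
            for yy, xx, wgt in [(y0, x0, (1 - dy) * (1 - dx)),
                                (y0, x0 + 1, (1 - dy) * dx),
                                (y0 + 1, x0, dy * (1 - dx)),
                                (y0 + 1, x0 + 1, dy * dx)]:
                if 0 <= yy < h and 0 <= xx < w:
                    out[i, j] += wgt * img[yy, xx]
    return out

def pixel_interval_bounds(img, theta_lo, theta_hi, n_splits=64):
    """Approximate per-pixel bounds of the rotated image over
    [theta_lo, theta_hi] by sampling a grid of angles.  Sampling
    under-approximates the true adversarial region; a sound verifier
    must instead bound every angle in each subinterval."""
    thetas = np.linspace(theta_lo, theta_hi, n_splits)
    stack = np.stack([rotate_bilinear(img, t) for t in thetas])
    return stack.min(axis=0), stack.max(axis=0)
```

The resulting per-pixel box [lb, ub] is exactly the kind of nonuniform region (different widths at different pixels) that a geometric verifier must then propagate through the network; computing it tightly and soundly is the bottleneck that FGV is designed to remove.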
We also achieve several breakthroughs: (1) FGV enables us to certify deterministic robustness against geometric transformations on entire test sets of 10,000 images, more than 50× as many images as existing works (100 in Balunovic et al. (2019) and 200 in Mohapatra et al. (2020)); (2) we are the first to scale deterministic geometric verification beyond CIFAR10; and (3) we are the first to verify a neural network for autonomous driving under realistic geometric perturbations. Our code is publicly available at https://github.com/uiuc-arc/CGT.

2. RELATED WORK

Geometric Robustness Certification. Certification against geometric perturbations (such as image rotations or scaling), going beyond ℓ_p-norm attacks, has recently begun to be studied in the literature. There are two main approaches to formally verifying the geometric robustness of neural networks: deterministic and probabilistic. Most recent works on geometric robustness verification use randomized smoothing-based techniques to obtain probabilistic guarantees of robustness (Fischer et al., 2020; Hao et al., 2022; Li et al., 2021). While these approaches can scale to larger datasets, their analyses are inherently unsound, i.e., they may falsely label an adversarial region as robust. For safety-critical domains, this uncertainty may be undesirable. Furthermore, such guarantees are obtained over a smoothed version of a base network, which at inference time requires sampling (i.e., repeatedly evaluating) the network up to 10,000 times per image, thereby introducing significant runtime or memory overhead (Fischer et al., 2020). Our work focuses on deterministic verification, whose analysis is always guaranteed to be correct and whose certified network incurs no overhead during inference. In particular, this type of certificate always holds against any adaptive attacker. However, the existing deterministic geometric robustness verifiers (Balunovic et al., 2019; Mohapatra et al., 2020) are exceedingly slow and thus unsuitable for training, due to the high cost of abstracting the geometric input region (even before propagating it through the network). Singh et al. (2019) study deterministic robustness against rotations, but their results are subsumed by Balunovic et al. (2019). While some works have focused on scaling verifiers (Müller et al., 2021; Xu et al., 2020), they have not targeted geometric robustness as we do.

Provable Defenses Against Norm-based Perturbations.
Prior works (Gowal et al., 2019; Mirman et al., 2018; Xu et al., 2020; Zhang et al., 2020) incorporate verification explicitly into the training loop so that the trained networks become easier to verify. Typically, this task is accomplished by formulating a loss function that strikes a balance between the desired formal guarantees (i.e., high certified robustness) and clean accuracy (i.e., the network's accuracy on the original dataset). To ensure quick loss computation over a large set of training images, interval bound propagation (IBP) (Gowal et al., 2019; Mirman et al., 2018) is the most popular technique. However, previous IBP works mainly consider norm-based perturbations, which do not capture the highly nonuniform adversarial input regions that characterize geometric transformations.
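The core IBP primitive referenced above can be written in a few lines: intervals are propagated exactly through monotone activations, and through affine layers by splitting the weight matrix by sign. The sketch below, a simplified NumPy illustration rather than any specific paper's implementation, also shows the standard "worst-case logits" construction used to form a robust loss term.

```python
import numpy as np

def ibp_linear(lb, ub, W, b):
    """Propagate elementwise bounds [lb, ub] through x -> W @ x + b.
    Splitting W by sign yields the tightest interval for an affine map:
    positive weights take the matching bound, negative weights the opposite."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    new_lb = W_pos @ lb + W_neg @ ub + b
    new_ub = W_pos @ ub + W_neg @ lb + b
    return new_lb, new_ub

def ibp_relu(lb, ub):
    # ReLU is monotone, so interval bounds pass through directly.
    return np.maximum(lb, 0.0), np.maximum(ub, 0.0)

def worst_case_logits(lb, ub, y):
    """Pessimistic logit vector for true class y: the true class takes its
    lower bound while every other class takes its upper bound.  Feeding this
    vector into cross-entropy gives the usual IBP robust loss term, which is
    then traded off against the clean cross-entropy loss during training."""
    z = ub.copy()
    z[y] = lb[y]
    return z
```

For a norm-based perturbation the initial [lb, ub] is simply x ± ε; the key difference for geometric transformations is that this initial box must instead soundly enclose the nonuniform transformed-image region, which is what makes fast geometric verification nontrivial.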

3. BACKGROUND

We denote a neural network as a function f : R^{C×H×W} → R^{n_o} from an H × W image with C channels to n_o real values. We focus on feedforward networks, but our approach is general and equally applicable to other architectures. Let x ∈ R^{C×H×W} denote an input image and y denote its

