DISTRIBUTED ADVERSARIAL TRAINING TO ROBUSTIFY DEEP NEURAL NETWORKS AT SCALE

Abstract

Current deep neural networks are vulnerable to adversarial attacks, in which adversarial perturbations to the inputs can change or manipulate classification. To defend against such attacks, an effective and popular approach, known as adversarial training, has been shown to mitigate the negative impact of adversarial attacks through a min-max robust training method. While effective, this approach is generally difficult to scale to large models on large datasets (e.g., ImageNet). To address this challenge, we propose distributed adversarial training (DAT), a large-batch adversarial training framework implemented over multiple machines. DAT supports both one-shot and iterative attack generation methods, gradient quantization, and training over labeled and unlabeled data. Theoretically, under standard conditions in optimization theory, we establish the convergence rate of DAT to first-order stationary points in general non-convex settings. Empirically, with ResNet-18 and ResNet-50 on CIFAR-10 and ImageNet, we demonstrate that DAT matches or outperforms state-of-the-art robust accuracies while achieving a graceful training speedup.

1. INTRODUCTION

The rapid increase of research in deep neural networks (DNNs) and their adoption in practice is, in part, owed to the significant breakthroughs made with DNNs in computer vision (Alom et al., 2018). Yet, despite their apparent power, DNNs suffer from a serious weakness in robustness: they can easily be manipulated by an adversary to output drastically different classifications, and this manipulation can be carried out in a controlled and directed way. This process, known as an adversarial attack, is considered one of the major hurdles to using DNNs in security-critical and real-world applications (Goodfellow et al., 2015; Szegedy et al., 2013; Carlini & Wagner, 2017; Papernot et al., 2016; Kurakin et al., 2016; Eykholt et al., 2018; Xu et al., 2019b). Methods for training DNNs to be robust against adversarial attacks are now a major research focus (Xu et al., 2019a), but most are far from satisfactory (Athalye et al., 2018), with the notable exception of the adversarial training (AT) approach (Madry et al., 2017b). AT is a min-max robust training method that minimizes the worst-case training loss at adversarially perturbed examples. AT has inspired a wide range of state-of-the-art defenses (Kannan et al., 2018; Ross & Doshi-Velez, 2018; Moosavi-Dezfooli et al., 2019; Zhang et al., 2019b; Wang et al., 2019b; Sinha et al., 2018; Chen et al., 2019; Boopathy et al., 2020; Wong & Kolter, 2017; Dvijotham et al., 2018; Stanforth et al., 2019; Carmon et al., 2019; Shafahi et al., 2019; Zhang et al., 2019a; Wong et al., 2020), which ultimately resort to min-max optimization. However, these methods, together with AT, are generally difficult to scale to large networks on large datasets. While scaling AT is important, doing so effectively is non-trivial: we find that the direct solution of distributing the data batch across multiple machines may not work and leaves many unanswered questions.
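To make the min-max structure of AT concrete, the following is a minimal illustrative sketch (not the paper's DAT implementation): the inner maximization crafts a bounded perturbation via projected gradient ascent on the loss, and the outer minimization updates the model weights on the perturbed batch. A logistic-regression model stands in for a DNN so the example stays self-contained; all function and variable names here are ours, and the step sizes and epsilon are arbitrary illustrative choices.

```python
import numpy as np

def loss_and_grads(w, x, y):
    """Binary logistic loss; returns loss, grad w.r.t. weights, grad w.r.t. inputs."""
    p = 1.0 / (1.0 + np.exp(-(x @ w)))
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    g = (p - y) / len(y)                      # per-sample logit gradient
    return loss, x.T @ g, np.outer(g, w)

def pgd_perturb(w, x, y, eps=0.1, alpha=0.02, steps=10):
    """Inner maximization: gradient-sign ascent on the loss,
    projected back onto the L-infinity ball of radius eps."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        _, _, gx = loss_and_grads(w, x + delta, y)
        delta = np.clip(delta + alpha * np.sign(gx), -eps, eps)
    return delta

def adversarial_training_step(w, x, y, lr=0.5):
    """Outer minimization: one gradient step on the worst-case (perturbed) batch."""
    delta = pgd_perturb(w, x, y)
    loss, gw, _ = loss_and_grads(w, x + delta, y)
    return w - lr * gw, loss
```

In a DNN setting the same two-level loop applies, with the input gradients obtained by automatic differentiation; scaling this loop across machines (each worker attacking and differentiating its own shard of a large batch) is precisely where the difficulties discussed next arise.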
First, if the direct solution does not allow the batch size to scale with the number of machines, then it yields no speedup and incurs significant communication costs (since the number of training iterations is not reduced over a fixed number of epochs). Second, without proper design, naively applying a large batch size to distributed adversarial training introduces a significant loss in both standard accuracy and adversarial robustness (e.g., more than a 10% performance drop for ResNet-18 on CIFAR-10 in our experiments). Third, the

