JUST AVOID ROBUST INACCURACY: BOOSTING ROBUSTNESS WITHOUT SACRIFICING ACCURACY

Abstract

While current methods for training robust deep learning models optimize robust accuracy, they significantly reduce natural accuracy, hindering their adoption in practice. Further, the resulting models are often both robust and inaccurate on numerous samples, providing a false sense of safety for those samples. In this work, we extend prior works in three main directions. First, we explicitly train the models to jointly maximize robust accuracy and minimize robust inaccuracy. Second, since the resulting models are trained to be robust only if they are accurate, we leverage robustness as a principled abstain mechanism. Finally, this abstain mechanism allows us to combine models in a compositional architecture that significantly boosts overall robustness without sacrificing accuracy. We demonstrate the effectiveness of our approach for empirical and certified robustness on six recent state-of-the-art models and four datasets. For example, on CIFAR-10 with ε∞ = 1/255, we successfully enhance the robust accuracy of a pre-trained model from 26.2% to 87.8% while even slightly increasing its natural accuracy from 97.8% to 98.0%.

1. INTRODUCTION

In recent years, there has been a significant amount of work that studies and improves adversarial (Carlini & Wagner, 2017; Croce & Hein, 2020b; Goodfellow et al., 2014; Madry et al., 2018; Szegedy et al., 2013) and certified robustness (Balunovic & Vechev, 2019; Cohen et al., 2019; Salman et al., 2019; Xu et al., 2020; Zhai et al., 2020; Zhang et al., 2019b) of neural networks. However, there is currently a key limitation that hinders the wider adoption of robust models in practice.

Robustness vs. Accuracy Tradeoff

Despite substantial progress in training robust models, existing robust training methods typically improve model robustness at the cost of decreased standard accuracy. To address this limitation, a number of recent works study this issue in detail and propose new methods to mitigate it (Mueller et al., 2020; Raghunathan et al., 2020; Stutz et al., 2019; Yang et al., 2020).

Our Work

In this work, we advance the line of work that aims to boost robustness without sacrificing accuracy, but we approach the problem from a new perspective: by avoiding robust inaccuracy. Concretely, we propose a new training method that jointly maximizes robust accuracy while minimizing robust inaccuracy. We illustrate the effect of our training on a synthetic dataset (three classes sampled from Gaussian distributions) in Figure 1, showing the decision boundaries of three models, trained using standard training L_std, adversarial training L_TRADES (Zhang et al., 2019a), and our training L_ERA (Equation 4). First, observe that while the L_std-trained model achieves 100% accuracy, only 91.1% of these samples are robust (and accurate). When using L_TRADES, we can observe the

[Table 1: Improvement of applying our approach to models trained to optimize natural accuracy only. Here, R_rob denotes the robust accuracy and R_nat denotes the standard (non-adversarial) accuracy.]
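The compositional architecture described above can be made concrete with a small sketch: each model answers only when its prediction is robust at the given input (and, since the models are trained to be robust only where accurate, robustness serves as the abstain signal); otherwise it defers to the next model, with a final fallback model that always answers. This is an illustrative sketch only; `is_robust` stands in for whatever robustness check is used in practice (a certification procedure or an adversarial attack), and the function names are our own, not the paper's.

```python
def compositional_predict(models, x, is_robust):
    """Compositional inference with a robustness-based abstain mechanism.

    models:    list of classifiers; the last one is the accurate fallback
    x:         the input to classify
    is_robust: hypothetical check returning True iff the model's prediction
               at x is robust (e.g. certified, or surviving an attack)
    """
    for model in models[:-1]:
        pred = model(x)
        # Robust predictions are kept: the model was trained to be
        # robust only where it is accurate, so robustness implies trust.
        if is_robust(model, x, pred):
            return pred
        # Otherwise the model abstains and defers to the next one.
    # The final (standard, accuracy-optimized) model always answers.
    return models[-1](x)
```

A usage sketch: with a robust model `m_rob` and a standard model `m_std`, `compositional_predict([m_rob, m_std], x, is_robust)` returns `m_rob(x)` on inputs where that prediction is robust and falls back to `m_std(x)` elsewhere, which is how robustness can be boosted without lowering natural accuracy.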




