CERTIFY OR PREDICT: BOOSTING CERTIFIED ROBUSTNESS WITH COMPOSITIONAL ARCHITECTURES

Abstract

A core challenge with existing certified defense mechanisms is that while they improve certified robustness, they also tend to drastically decrease natural accuracy, making it difficult to use these methods in practice. In this work, we propose a new architecture which addresses this challenge and enables one to boost the certified robustness of any state-of-the-art deep network, while controlling the overall accuracy loss, without requiring retraining. The key idea is to combine such a network with a (smaller) certified network, where at inference time an adaptive selection mechanism decides which network processes the input sample. The approach is compositional: one can combine any pair of state-of-the-art (e.g., EfficientNet or ResNet) and certified networks, without restriction. The resulting architecture enables much higher natural accuracy than previously possible with certified defenses alone, while substantially boosting the certified robustness of deep networks. We demonstrate the effectiveness of this adaptive approach on a variety of datasets and architectures. For instance, on CIFAR-10 with an ℓ∞ perturbation of 2/255, we are the first to obtain a high natural accuracy (90.1%) with non-trivial certified robustness (27.5%). Notably, prior state-of-the-art methods incur a substantial drop in accuracy for a similar certified robustness.

1. INTRODUCTION

Most recent defenses against adversarial examples have been broken by stronger and more adaptive attacks (Athalye et al., 2018; Tramer et al., 2020), highlighting the importance of investigating certified defenses with suitable robustness guarantees (Raghunathan et al., 2018; Wong & Kolter, 2018; Zhang et al., 2020; Balunović & Vechev, 2020). While there has been much progress in developing new certified defenses, a fundamental roadblock to their practical adoption is that they tend to produce networks with unsatisfying natural accuracy. In this work we propose a novel architecture which brings certified defenses closer to practical use: it enables boosting the certified robustness of state-of-the-art deep neural networks without incurring significant accuracy loss and without requiring retraining.

Our proposed architecture is compositional and consists of three components: (i) a core-network with high natural accuracy, (ii) a certification-network with high certifiable robustness (which need not have high accuracy), and (iii) a selection mechanism that adaptively decides which of the two networks should process the input sample. The benefit of this architecture is that one can plug in any state-of-the-art deep neural network as the core-network and any certified defense for the certification-network, thus benefiting from future advances in both standard training and certified defenses.

A key challenge in certifying the robustness of a decision made by the composed architecture is obtaining a certifiable selection mechanism. To address it, we propose two different selection mechanisms, one based on an auxiliary selection-network and another based on entropy, and design effective ways to certify both. Experimentally, we demonstrate the promise of this architecture: we are able to train a model with much higher natural accuracy than models trained using prior certified defenses, while obtaining non-trivial certified robustness.
For example, on the challenging CIFAR-10 dataset with an ℓ∞ perturbation of 2/255, we obtain 90.1% natural accuracy and a certified robustness of 27.5%. On the same task, prior approaches cannot reach comparable natural accuracy while retaining any non-trivial certified robustness.
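To make the compositional architecture concrete, the entropy-based variant of the selection mechanism can be sketched as follows. This is a minimal illustration only: the function names, the threshold value, and the interface where each network maps an input to a class-probability vector are assumptions for exposition, not the paper's actual implementation, and the sketch ignores how selection itself is certified.

```python
import math

def entropy(probs):
    """Shannon entropy of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def compositional_predict(x, core_net, cert_net, threshold):
    """Route the input adaptively between the two networks.

    If the certification-network's prediction is confident
    (low entropy), use it, so its robustness certificate can
    apply; otherwise fall back to the accurate core-network.
    Returns the chosen probability vector and which branch fired.
    """
    cert_probs = cert_net(x)
    if entropy(cert_probs) <= threshold:
        return cert_probs, "certification-network"
    return core_net(x), "core-network"

# Toy stand-ins for the two networks over 4 classes.
confident_cert = lambda x: [0.97, 0.01, 0.01, 0.01]  # low entropy
unsure_cert = lambda x: [0.25, 0.25, 0.25, 0.25]     # max entropy
core = lambda x: [0.90, 0.05, 0.03, 0.02]

_, branch1 = compositional_predict(None, core, confident_cert, 0.5)
_, branch2 = compositional_predict(None, core, unsure_cert, 0.5)
# branch1 uses the certification-network; branch2 falls back to core.
```

The design choice here is that the certification-network gets the first look: only inputs it handles confidently stay on the certified path, which is how the composition trades a small amount of certified coverage for the core-network's much higher natural accuracy.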

