REVISITING ACTIVATION FUNCTION DESIGN FOR IMPROVING ADVERSARIAL ROBUSTNESS AT SCALE

Anonymous

Abstract

Modern ConvNets typically use the ReLU activation function. Recently, smooth activation functions have been used to improve their accuracy. Here we study the role of smooth activation functions from the perspective of adversarial robustness. We find that the ReLU activation function significantly weakens adversarial training due to its non-smooth nature. Replacing ReLU with its smooth alternatives allows adversarial training to find harder adversarial examples and to compute better gradient updates for network optimization. We focus our study on the large-scale ImageNet dataset. On ResNet-50, switching from ReLU to the smooth activation function SiLU improves adversarial robustness from 33.0% to 42.3%, while also improving accuracy by 0.9% on ImageNet. Smooth activation functions also scale well to larger networks: they help EfficientNet-L1 achieve 82.2% accuracy and 58.6% robustness, substantially outperforming the previous state-of-the-art defense by 9.5% in accuracy and 11.6% in robustness.

1. INTRODUCTION

It is known that convolutional neural networks can be easily fooled by adversarial examples (Szegedy et al., 2014). Many efforts have been made to improve robustness (Papernot et al., 2016; Guo et al., 2018; Xie et al., 2018; Liu et al., 2018; Pang et al., 2019; Schott et al., 2019); among them, adversarial training (Goodfellow et al., 2015; Kurakin et al., 2017; Madry et al., 2018), which trains networks with adversarial examples generated on-the-fly, stands as one of the most effective methods. Later studies further improve adversarial training by feeding networks harder adversarial examples (Wang et al., 2019b), maximizing the margin of networks (Ding et al., 2020), optimizing a regularized surrogate loss (Zhang et al., 2019), etc. While these methods achieve stronger adversarial robustness, they sacrifice accuracy on clean inputs. It is generally believed that such a trade-off between accuracy and robustness is inevitable (Tsipras et al., 2019), short of enlarging network capacity, e.g., making networks wider or deeper (Madry et al., 2018; Xie & Yuille, 2020).

Another popular direction for increasing robustness against adversarial attacks is gradient masking (Papernot et al., 2017; Xie et al., 2018; Samangouei et al., 2018; Song et al., 2018; Ma et al., 2018; Guo et al., 2018). With degraded gradient quality, attackers cannot successfully optimize the targeted loss and therefore fail to circumvent such defenses. Nonetheless, gradient masking fails to offer robustness once a differentiable approximation of the masking operation is used to generate adversarial examples (Athalye et al., 2018). To build robust models effectively, we hereby rethink the relationship between gradient quality and adversarial robustness, especially in the context of adversarial training, where gradients are applied more frequently than in standard training.
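The inner maximization that generates adversarial examples on-the-fly is typically done with projected gradient descent (PGD) in the style of Madry et al. (2018). The following is a minimal sketch on a toy logistic-regression loss; the model, the parameter values, and the helper names (`pgd_attack`, `loss_grad`) are illustrative assumptions for exposition, not the paper's actual implementation:

```python
import numpy as np

def pgd_attack(x, y, w, loss_grad, eps=0.3, alpha=0.1, steps=10):
    """PGD: repeatedly step in the direction of the loss gradient
    w.r.t. the input, then project back into the L-infinity ball
    of radius eps around the original input."""
    x_adv = x.copy()
    for _ in range(steps):
        g = loss_grad(x_adv, y, w)                # gradient of loss w.r.t. input
        x_adv = x_adv + alpha * np.sign(g)        # ascend the loss (signed step)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
    return x_adv

# Toy model: logistic regression with loss = log(1 + exp(-y * w.x)).
def loss(x, y, w):
    return np.log1p(np.exp(-y * np.dot(w, x)))

def loss_grad(x, y, w):
    s = 1.0 / (1.0 + np.exp(y * np.dot(w, x)))  # sigmoid(-y * w.x)
    return -y * s * w                           # d(loss)/dx
```

In adversarial training, each mini-batch is first perturbed by such an inner maximization, and the network parameters are then updated on the perturbed batch; both phases consume gradients, which is why gradient quality matters twice.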
In addition to computing gradients to update network parameters, adversarial training also requires gradient computation for generating adversarial training samples. Guided by this principle, we identify that ReLU, a widely used activation function in modern ConvNets, significantly weakens adversarial training due to its non-smooth nature: ReLU's gradient changes abruptly (from 0 to 1) when its input is close to zero (see Figure 1). In this paper, we revisit activation function design for improving adversarial robustness, with a special focus on the large-scale ImageNet dataset (Russakovsky et al., 2015). To fix the issue

