ADVERSARIAL AND NATURAL PERTURBATIONS FOR GENERAL ROBUSTNESS

Abstract

In this paper we explore the general robustness of neural network classifiers using adversarial as well as natural perturbations. Different from previous works, which mainly focus on the robustness of neural networks against adversarial perturbations, we also evaluate their robustness to natural perturbations before and after robustification. After standardizing the comparison between adversarial and natural perturbations, we demonstrate that although adversarial training improves the performance of networks against adversarial perturbations, it degrades their performance on naturally perturbed samples as well as on clean samples. In contrast, training with natural perturbations such as elastic deformations, occlusions, and wave distortions not only improves performance against natural perturbations, but also improves performance against adversarial perturbations. Additionally, it does not reduce accuracy on clean images.

1. INTRODUCTION

A large body of work in computer vision and machine learning research focuses on the robustness of neural networks against adversarial perturbations (Kurakin et al., 2016; Goodfellow et al., 2014; Carlini & Wagner, 2017). Various defense methods have also been proposed against these adversarial perturbations (Goodfellow et al., 2014; Madry et al., 2017; Zhang et al., 2019b; Song et al., 2019). Concurrently, research shows that deep neural networks are not robust even to small random perturbations, e.g. Gaussian noise, small rotations, and translations (Dodge & Karam, 2017; Fawzi & Frossard, 2015; Kanbak et al., 2018). While there is plenty of research in the domain of adversarial perturbations, there has been very little focus on robustifying networks against natural perturbations, as we do here. Furthermore, adversarial perturbations are rarely encountered in the real world, and naturally occurring perturbations are of a different nature than these pixel-based perturbations. Therefore, in this paper we consider natural perturbations of six different styles: elastic, occlusion, Gaussian noise, wave, saturation, and Gaussian blur. Here, "elastic" denotes a random shear transformation applied to the image, "occlusion" is a large, randomly located dot in the image, and "wave" is a random geometric distortion applied to the image. Additionally, there is no consensus on whether adversarial robustness helps against natural perturbations. Zhang & Zhu (2019) showed that adversarial training reduces texture bias, whereas Engstrom et al. (2019) demonstrated that ℓ∞-based robustness does not generalize to natural transformations like rotations and translations. Here we evaluate whether adversarial training helps against natural perturbations and vice versa.
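To make the perturbation styles concrete, the following is a minimal numpy sketch of three of them (occlusion, wave, and Gaussian noise) on a toy image. The patch size, wave amplitude and period, and noise level are illustrative assumptions, not the calibrated settings used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def occlude(img, size=8):
    """Occlusion: zero out a randomly located square patch."""
    h, w = img.shape[:2]
    y = rng.integers(0, h - size)
    x = rng.integers(0, w - size)
    out = img.copy()
    out[y:y + size, x:x + size] = 0.0
    return out

def wave(img, amplitude=2.0, period=16.0):
    """Wave: shift each row horizontally by a sinusoidal offset."""
    h = img.shape[0]
    out = np.empty_like(img)
    for y in range(h):
        shift = int(round(amplitude * np.sin(2 * np.pi * y / period)))
        out[y] = np.roll(img[y], shift, axis=0)
    return out

def gaussian_noise(img, sigma=0.05):
    """Additive Gaussian noise, clipped back to the valid [0, 1] range."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

img = rng.random((32, 32, 3))  # a toy CIFAR-sized image in [0, 1]
perturbed = {
    "occlusion": occlude(img),
    "wave": wave(img),
    "gaussian": gaussian_noise(img),
}
for name, p in perturbed.items():
    print(name, p.shape, float(np.abs(p - img).mean()))
```

The elastic (shear) and saturation styles follow the same pattern but need a geometric resampling step, e.g. via an image-processing library.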
Besides the robustness of neural networks against natural and adversarial perturbations, there is an open debate in the literature about the trade-off between robustness and accuracy (Tsipras et al., 2018; Zhang et al., 2019a; Su et al., 2018). In contrast with adversarial training, we find that partially training networks with naturally perturbed images does not degrade classification performance on clean images. On the CIFAR-10 dataset, partial training with naturally perturbed images even improves classification accuracy on clean images. Deep neural networks are on par with, or even surpass, humans on clean images, yet they fail to perform well under small natural perturbations (He et al., 2016; Dodge & Karam, 2017). Hendrycks & Dietterich (2019) introduced Imagenet-C, a variant of Imagenet (Deng et al., 2009) with corruptions applied to its images. Although each corruption in Imagenet-C has five severity levels, these levels are not standardized for comparison across corruptions. We standardize the effect of perturbations on training data to a fixed drop in classification accuracy on the test set; this allows a fair comparison between different styles of training for retaining robustness against perturbations. We also normalize the performance of the network across datasets to compare its robustness on different datasets. We conduct 320 experiments on five different datasets with adversarial and six different natural perturbations. General robustness, the most desirable case, is robustness against perturbations not seen during the training of the classifier. Hence, we evaluate the general robustness of networks by testing them on seen perturbations, i.e. when the training and testing perturbation types are the same, as well as on unseen perturbations, i.e. when the training and testing perturbation types differ.
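The standardization step above can be sketched as a simple calibration loop: bisect the perturbation severity until test accuracy falls by a fixed target amount relative to clean accuracy. The function and parameter names below are illustrative, and the closed-form accuracy curve merely stands in for a real evaluation loop; the sketch assumes accuracy is monotonically non-increasing in severity.

```python
import numpy as np

def calibrate_severity(accuracy_at, clean_acc, target_drop,
                       lo=0.0, hi=1.0, tol=1e-3):
    """Bisect the severity level so that test accuracy drops by
    target_drop relative to clean accuracy."""
    target = clean_acc - target_drop
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if accuracy_at(mid) > target:
            lo = mid  # too mild: increase severity
        else:
            hi = mid  # too harsh: decrease severity
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Toy accuracy curve standing in for evaluating a trained classifier.
acc = lambda s: 0.9 * np.exp(-3.0 * s)
s = calibrate_severity(acc, clean_acc=0.9, target_drop=0.3)
print(round(s, 3), round(acc(s), 3))
```

Running this calibration once per perturbation style puts all styles on an equal footing before comparing the robustness they confer.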
Among classifiers tested on both seen and unseen perturbations, the natural perturbations of elastic, wave, and occlusion come out on top compared to the other natural perturbations as well as to adversarial perturbations. Our contributions are as follows: i) we perform a fair evaluation of robustness; ii) we show that classifiers robust to natural perturbations generalize to clean images; iii) we show that classifiers trained on seen natural perturbations are more robust than those trained on seen adversarial perturbations; iv) our evaluation of general robustness shows that the natural elastic, wave, and occlusion perturbations yield the best robustness against unseen perturbations.

2. RELATED WORK

Robustness with Adversarial or Natural Perturbations. Goodfellow et al. (2014) demonstrated the vulnerability of neural networks by adding imperceptible, i.e. adversarial, perturbations to the input, to the degree that the network misclassifies the input into the wrong class. To address this problem, the adversarial training (AT) procedure was proposed: by training the network on adversarially perturbed images, it can be robustified against these perturbations (Goodfellow et al., 2014; Madry et al., 2017). In this work we employ a strong attack, the basic iterative method (BIM), to generate adversarial examples, and use projected gradient descent (PGD), a state-of-the-art adversarial training defense, to evaluate its effectiveness compared to other ways of robustification. Zhang et al. (2019a) and Tsipras et al. (2018) questioned the generalization capability of adversarially trained neural networks on clean images and showed that as adversarial robustness increases, the accuracy of the networks on clean images drops. Therefore, we evaluate the performance of adversarially trained networks on clean, adversarial, and naturally perturbed images, and compare them with networks trained with natural perturbations. Hendrycks & Dietterich (2019) focused on testing the robustness of vanilla neural networks on 15 different natural perturbations with varying perturbation levels. We observe that some of their perturbations are correlated, e.g. Gaussian noise, shot noise, and impulse noise (Laugros et al., 2019). In our work, we instead train and test on six different natural perturbations covering the breadth of natural perturbation styles. Furthermore, instead of selecting perturbation levels arbitrarily, we standardize their effect by dropping the accuracy of the network to a fixed level, allowing a fair comparison among them.
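The BIM attack mentioned above iterates small signed-gradient steps and projects back into an ℓ∞ ball around the input. The following is a minimal sketch on a logistic-regression model, where the input gradient is analytic; the step size, budget, and iteration count are illustrative assumptions, not the paper's attack settings (a deep-network version would obtain the gradient by backpropagation instead).

```python
import numpy as np

def bim_attack(x, y, w, b, eps=0.1, alpha=0.02, steps=10):
    """Basic Iterative Method against a logistic-regression model
    sigma(w.x + b): repeatedly step in the sign of the input gradient
    of the cross-entropy loss, projecting back into the L-inf ball."""
    x_adv = x.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))  # predicted prob of class 1
        grad = (p - y) * w                          # dLoss/dx for binary CE
        x_adv = x_adv + alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)    # L-inf projection
        x_adv = np.clip(x_adv, 0.0, 1.0)            # keep valid pixel range
    return x_adv

rng = np.random.default_rng(0)
w, b = rng.normal(size=16), 0.0
x, y = rng.random(16), 1.0                          # input with true label 1
x_adv = bim_attack(x, y, w, b)
p_clean = 1.0 / (1.0 + np.exp(-(x @ w + b)))
p_adv = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))
print(p_clean, p_adv)  # the attack lowers confidence in the true class
```

PGD-based adversarial training follows the same inner loop, with a random start inside the ℓ∞ ball, and trains on the resulting perturbed examples.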
Finally, rather than testing vanilla networks, we propose to robustify the networks with natural perturbations and test them on both adversarial and natural perturbations.

Robustness with Adversarial and Natural Perturbations. Ford et al. (2019) established a close connection between adversarial robustness and robustness to natural perturbations, suggesting that the two should go hand in hand and that networks should be robustified against both. In a similar line of work, Kang et al. (2019) and Engstrom et al. (2019) proposed adversarial attacks based on natural perturbations and showed that testing with only one type of adversarial perturbation does not capture the complete robustness of a network. We focus on determining the general robustness of neural network classifiers by testing them against unseen adversarial and natural perturbations. Rusak et al. (2020) focus on robustification against natural corruptions in addition to adversarial perturbations. They utilize Gaussian and speckle noise and show that augmenting training with properly tuned Gaussian noise makes a network generalize to unseen natural perturbations. In this work, however, we show that the elastic, wave, and occlusion perturbations surpass the robustness obtained with Gaussian noise. Laugros et al. (2019) study the relationship between robustness to adversarial and natural perturbations.
