ADVERSARIAL AND NATURAL PERTURBATIONS FOR GENERAL ROBUSTNESS

Abstract

In this paper we explore the general robustness of neural network classifiers using adversarial as well as natural perturbations. In contrast to previous works, which mainly focus on the robustness of neural networks against adversarial perturbations, we also evaluate their robustness to natural perturbations before and after robustification. After standardizing the comparison between adversarial and natural perturbations, we demonstrate that although adversarial training improves performance against adversarial perturbations, it leads to a drop in performance on naturally perturbed samples as well as on clean samples. In contrast, training with natural perturbations such as elastic deformations, occlusions and waves not only improves performance against natural perturbations, but also improves performance against adversarial perturbations, while leaving accuracy on clean images intact.

1. INTRODUCTION

A large body of work in computer vision and machine learning research focuses on studying the robustness of neural networks against adversarial perturbations (Kurakin et al., 2016; Goodfellow et al., 2014; Carlini & Wagner, 2017). Various defense methods have also been proposed against these adversarial perturbations (Goodfellow et al., 2014; Madry et al., 2017; Zhang et al., 2019b; Song et al., 2019). Concurrently, research shows that deep neural networks are not even robust to small random perturbations, e.g. Gaussian noise, small rotations and translations (Dodge & Karam, 2017; Fawzi & Frossard, 2015; Kanbak et al., 2018). While there is plenty of research in the domain of adversarial perturbations, there is very little focus on robustifying networks against natural perturbations, as we do here. Furthermore, adversarial perturbations are rarely encountered in the real world, and naturally occurring perturbations are of a different nature than these pixel-based perturbations. Therefore, in this paper we consider natural perturbations of six different styles: elastic, occlusion, Gaussian noise, wave, saturation, and Gaussian blur. Here, "elastic" denotes a random shear transformation applied to the image, "occlusion" is a large randomly located dot in the image, and "wave" is a random geometric distortion applied to the image. Additionally, there is no consensus about whether adversarial robustness helps against natural perturbations. Zhang & Zhu (2019) showed that adversarial training reduces texture bias. However, Engstrom et al. (2019) demonstrated that ℓ∞-based robustness does not generalize to natural transformations like rotations and translations. Here we evaluate whether adversarial training helps against natural perturbations and vice versa.
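Three of the perturbation styles defined above (Gaussian noise, occlusion, wave) can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions, not the implementation used in the experiments; all parameter values (noise level, dot radius, wave amplitude and period) are illustrative.

```python
import numpy as np

def gaussian_noise(img, sigma=0.05, rng=None):
    """Additive Gaussian pixel noise; sigma is an illustrative choice."""
    rng = rng or np.random.default_rng(0)
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def occlusion(img, radius=6, rng=None):
    """Zero out a large randomly located dot, as described above."""
    rng = rng or np.random.default_rng(0)
    h, w = img.shape[:2]
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    out = img.copy()
    out[mask] = 0.0
    return out

def wave(img, amplitude=2, period=16):
    """Shift each row horizontally by a sinusoidal offset,
    a simple instance of a random geometric distortion."""
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        shift = int(amplitude * np.sin(2 * np.pi * y / period))
        out[y] = np.roll(img[y], shift, axis=0)
    return out

# Apply each perturbation to a random HxWx3 image in [0, 1].
img = np.random.default_rng(1).random((32, 32, 3))
for perturb in (gaussian_noise, occlusion, wave):
    assert perturb(img).shape == img.shape
```

Saturation and Gaussian blur are standard photometric operations, and the elastic (shear) transform can be realized with any image-warping routine; they are omitted here for brevity.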
Besides the robustness of neural networks against natural and adversarial perturbations, there is an open debate in the literature about the trade-off between robustness and accuracy (Tsipras et al., 2018; Zhang et al., 2019a; Su et al., 2018). In contrast to adversarial training, we found that partially training networks with naturally perturbed images does not degrade classification performance on clean images. On the CIFAR-10 dataset, we even found that partial training with naturally perturbed images improves classification accuracy on clean images. Deep neural networks are on par with, or even surpass, human performance on clean images, yet they fail to perform well under small natural perturbations (He et al., 2016; Dodge & Karam, 2017). Hendrycks & Dietterich (2019) introduced ImageNet-C, a variant of ImageNet (Deng et al., 2009) with corruptions applied to its images. Although

