DOES ENHANCED SHAPE BIAS IMPROVE NEURAL NETWORK ROBUSTNESS TO COMMON CORRUPTIONS?

Abstract

Convolutional neural networks (CNNs) learn to extract representations of complex features, such as object shapes and textures, to solve image recognition tasks. Recent work indicates that CNNs trained on ImageNet are biased towards features that encode textures, and that these features alone are sufficient to generalize to unseen test data from the same distribution as the training data, but often fail to generalize to out-of-distribution data. It has been shown that augmenting the training data with different image styles decreases this texture bias in favor of an increased shape bias while at the same time improving robustness to common corruptions, such as noise and blur. Commonly, this is interpreted as shape bias increasing corruption robustness. However, this relationship is only hypothesized. We perform a systematic study of different ways of composing inputs based on natural images, explicit edge information, and stylization. While stylization is essential for achieving high corruption robustness, we do not find a clear correlation between shape bias and robustness. We conclude that the data augmentation caused by style variation accounts for the improved corruption robustness, and that the increased shape bias is only a byproduct.

1. INTRODUCTION

As deep learning is increasingly applied to open-world perception problems in safety-critical domains such as robotics and autonomous driving, its robustness properties become of paramount importance. Generally, a lack of robustness against adversarial examples has been observed (Szegedy et al., 2014; Goodfellow et al., 2015), making physical-world adversarial attacks on perception systems feasible (Kurakin et al., 2017; Eykholt et al., 2018; Lee & Kolter, 2019). In this work, we focus on a different kind of robustness: namely, robustness against naturally occurring common image corruptions. Robustness of image classifiers against such corruptions can be evaluated using the ImageNet-C benchmark (Hendrycks & Dietterich, 2019), in which corruptions such as noise, blur, weather effects, and digital image transformations are simulated. Hendrycks & Dietterich (2019) observed that recent advances in neural architectures increased performance on undistorted data without a significant increase in relative corruption robustness. One hypothesis for the lack of robustness is an over-reliance on non-robust features that generalize well within the distribution used for training but fail to generalize to out-of-distribution data. Ilyas
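To illustrate the kind of corruption the benchmark simulates, the following is a minimal sketch of an additive Gaussian noise corruption with graded severity levels. The sigma-per-severity mapping here is illustrative only; the actual ImageNet-C benchmark defines its own severity parameters for each corruption type.

```python
import numpy as np


def gaussian_noise(image: np.ndarray, severity: int = 1) -> np.ndarray:
    """Apply additive Gaussian noise to an image with values in [0, 1].

    Note: the sigma values below are illustrative placeholders, not the
    exact parameters used by the ImageNet-C benchmark.
    """
    sigmas = [0.04, 0.08, 0.12, 0.18, 0.26]  # one sigma per severity 1..5
    sigma = sigmas[severity - 1]
    noisy = image + np.random.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep pixel values in a valid range


# Corrupt a random stand-in "image" at moderate severity.
img = np.random.rand(224, 224, 3)
corrupted = gaussian_noise(img, severity=3)
```

A full evaluation would apply each corruption type at all five severities to the test set and report the classifier's averaged error, as in the ImageNet-C protocol.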

