DOES ENHANCED SHAPE BIAS IMPROVE NEURAL NETWORK ROBUSTNESS TO COMMON CORRUPTIONS?

Abstract

Convolutional neural networks (CNNs) learn to extract representations of complex features, such as object shapes and textures, to solve image recognition tasks. Recent work indicates that CNNs trained on ImageNet are biased towards features that encode textures, and that these alone suffice to generalize to unseen test data from the same distribution as the training data, but often fail to generalize to out-of-distribution data. It has been shown that augmenting the training data with different image styles decreases this texture bias in favor of an increased shape bias while at the same time improving robustness to common corruptions such as noise and blur. Commonly, this is interpreted as shape bias increasing corruption robustness. However, this relationship has only been hypothesized. We perform a systematic study of different ways of composing inputs based on natural images, explicit edge information, and stylization. While stylization is essential for achieving high corruption robustness, we do not find a clear correlation between shape bias and robustness. We conclude that the data augmentation caused by style variation accounts for the improved corruption robustness, and the increased shape bias is only a byproduct.

1. INTRODUCTION

As deep learning is increasingly applied to open-world perception problems in safety-critical domains such as robotics and autonomous driving, its robustness properties become of paramount importance. Generally, a lack of robustness against adversarial examples has been observed (Szegedy et al., 2014; Goodfellow et al., 2015), making physical-world adversarial attacks on perception systems feasible (Kurakin et al., 2017; Eykholt et al., 2018; Lee & Kolter, 2019). In this work, we focus on a different kind of robustness: namely, robustness against naturally occurring common image corruptions. Robustness of image classifiers against such corruptions can be evaluated using the ImageNet-C benchmark (Hendrycks & Dietterich, 2019), in which corruptions such as noise, blur, weather effects, and digital image transformations are simulated. Hendrycks & Dietterich (2019) observed that recent advances in neural architectures have increased performance on undistorted data without a significant increase in relative corruption robustness. One hypothesis for this lack of robustness is an over-reliance on non-robust features that generalize well within the distribution used for training but fail to generalize to out-of-distribution data. Ilyas et al. (2019) provide evidence for this hypothesis on adversarial examples. Similarly, it has been hypothesized that models which rely strongly on texture information are more vulnerable to common corruptions than models based on features encoding shape information (Geirhos et al., 2019; Hendrycks & Dietterich, 2019). Alternative methods for increasing corruption robustness that are not motivated by enhancing shape bias use more (potentially unlabeled) training data (Xie et al., 2019) or stronger data augmentation (Lopes et al., 2019; Hendrycks et al., 2020). Note that our usage of "shape" and "texture" builds on the definitions of Geirhos et al. (2019).
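Corruptions of the kind simulated in ImageNet-C can be reproduced with simple image operations applied at increasing severity. The following is a minimal sketch of one such corruption, additive Gaussian noise; the per-severity noise scales used here are illustrative assumptions, not the benchmark's official parameters.

```python
# Sketch of an ImageNet-C-style corruption: additive Gaussian noise with a
# severity level controlling the noise scale. The sigma values below are
# illustrative assumptions, not the official benchmark parameters.
import numpy as np

def gaussian_noise(image, severity=1):
    """Apply additive Gaussian noise to an image with values in [0, 1]."""
    sigma = [0.04, 0.06, 0.08, 0.09, 0.10][severity - 1]  # assumed scales
    noisy = image + np.random.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

# Example: corrupt a random "image" at medium severity.
image = np.random.rand(32, 32, 3)
corrupted = gaussian_noise(image, severity=3)
```

A benchmark in this style evaluates a fixed classifier on such corrupted copies of the test set, averaging the error over corruption types and severities.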
In this paper, we re-examine the question of whether increasing the shape bias of a model actually helps in terms of corruption robustness. While prior work has found training methods that increase both shape bias and corruption robustness (Geirhos et al., 2019; Hendrycks & Dietterich, 2019), this only establishes a correlation, not a causal relationship. To increase the shape bias, Geirhos et al. (2019) "stylize" images by imposing the style of a painting onto the image, leaving the shape-related structure of the image mostly unchanged while modifying texture cues so that they become largely uninformative of the class. Note that image stylization can be interpreted as a specific form of data augmentation, providing an alternative hypothesis for the increased corruption robustness, which would leave the enhanced shape bias as a mostly unrelated byproduct. In this work, we investigate the role of the shape bias for corruption robustness in more detail. We propose two novel methods for increasing the shape bias:

• Similar to Geirhos et al. (2019), we pre-train the CNN on an auxiliary dataset which encourages learning shape features. In contrast to Geirhos et al. (2019), who use stylized images, this dataset consists of edge maps of the training images, generated using the pre-trained edge-detection network of Liu et al. (2017). This method maintains global object shapes but removes texture-related information, thereby encouraging learning shape-based representations.

• In addition to pre-training on edge maps, we propose style randomization to further enhance the shape bias. Style randomization samples the parameters of the affine transformations of normalization layers for each input from a uniform distribution.

Our key finding is summarized in Figure 1.
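The core idea behind style randomization can be sketched in a few lines: the per-channel statistics of a feature map are closely tied to image style, so replacing the learned affine parameters of a normalization layer with freshly sampled ones varies the style of each input. The following numpy sketch uses instance normalization and sampling ranges of our own choosing; these ranges, and the use of instance normalization in particular, are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of style randomization: instance-normalize a feature map,
# then apply a per-channel affine transformation whose scale (gamma) and
# shift (beta) are sampled uniformly per input instead of being learned.
# The sampling ranges below are illustrative assumptions.
import numpy as np

def style_randomized_norm(x, eps=1e-5, rng=None):
    """Normalize a (C, H, W) feature map, then apply randomly sampled
    per-channel affine parameters."""
    if rng is None:
        rng = np.random.default_rng()
    c = x.shape[0]
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mean) / (std + eps)
    gamma = rng.uniform(0.5, 1.5, size=(c, 1, 1))   # random scale per channel
    beta = rng.uniform(-0.5, 0.5, size=(c, 1, 1))   # random shift per channel
    return gamma * x_norm + beta
```

Because the affine parameters change with every forward pass, the network cannot rely on style-specific channel statistics and is pushed towards style-invariant, shape-related features.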
While pre-training on stylized images increases both shape bias and corruption robustness, these two quantities are not necessarily correlated: pre-training on edge maps increases the shape bias without consistently helping in terms of corruption robustness. To explain this finding, we conduct a systematic study in which we create inputs based on natural images, explicit edge information, and different ways of stylization (see Figure 2 for an illustration). We find that the shape bias is maximized when combining edge information with stylization without including any texture information (Stylized Edges). For maximal corruption robustness, however, superimposing the image (and thus its textures) on these stylized edges is required, which in turn strongly reduces the shape bias. In summary, corruption robustness seems to benefit most from style variation in the vicinity of the image manifold, while the increased shape bias appears to be largely a byproduct rather than the cause of this robustness.

Figure 1: Illustration of the effect of different training augmentations. While both style-based (Geirhos et al., 2019) and edge-based augmentation (this paper) reach the same validation accuracy, edge-based augmentation shows a stronger increase in shape bias, as evidenced by lower accuracy on patch-shuffled images and a higher rate of classifying according to the shape category for texture-shape cue conflicts. Nevertheless, only style-based augmentation shows a considerable improvement against common corruptions such as Gaussian blur. This challenges the hypothesis that increased shape bias causes improved robustness to corruption.
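The patch-shuffle diagnostic referenced in Figure 1 can be sketched as follows: the image is cut into a grid of patches that are randomly permuted, destroying global shape while largely preserving local texture statistics, so a texture-biased model retains more accuracy on shuffled images than a shape-biased one. The grid size below is an illustrative choice, not necessarily the one used in the experiments.

```python
# Sketch of the patch-shuffle diagnostic: split a (H, W, C) image into a
# grid x grid set of patches, randomly permute them, and reassemble. Global
# shape is destroyed; local texture is largely preserved.
import numpy as np

def patch_shuffle(image, grid=4, rng=None):
    """Shuffle an image's grid x grid patches. H and W must be divisible
    by `grid`."""
    if rng is None:
        rng = np.random.default_rng()
    h, w, _ = image.shape
    ph, pw = h // grid, w // grid
    # Cut into patches in row-major order, then permute and reassemble.
    patches = [image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
               for i in range(grid) for j in range(grid)]
    order = rng.permutation(len(patches))
    rows = [np.concatenate([patches[order[i * grid + j]] for j in range(grid)], axis=1)
            for i in range(grid)]
    return np.concatenate(rows, axis=0)
```

Comparing a model's accuracy on original versus patch-shuffled images then gives a simple proxy for how much it relies on global shape.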

