BALANCING ROBUSTNESS AND SENSITIVITY USING FEATURE CONTRASTIVE LEARNING

Abstract

It is generally believed that robust training of extremely large networks is critical to their success in real-world applications. However, when taken to the extreme, methods that promote robustness can hurt the model's sensitivity to rare or underrepresented patterns. In this paper, we discuss the trade-off between robustness and sensitivity by introducing two notions: contextual feature utility and contextual feature sensitivity. We propose Feature Contrastive Learning (FCL), which encourages the model to be more sensitive to features that have higher contextual utility. Empirical results demonstrate that models trained with FCL achieve a better balance of robustness and sensitivity, leading to improved generalization in the presence of noise.

1. INTRODUCTION

Deep learning has shown unprecedented success in numerous domains (Krizhevsky et al., 2012; Szegedy et al., 2015; He et al., 2016; Hinton et al., 2012; Sutskever et al., 2014; Devlin et al., 2018), and robustness plays a key role in the success of neural networks. When we seek robustness, we want the model's prediction to remain unchanged under small perturbations of the input. However, such invariance to small perturbations can prove detrimental in some cases. As an extreme example, a small perturbation to the input can change the human-perceived class label while the model remains insensitive to the change (Tramèr et al., 2020). In this paper, we focus on balancing this trade-off between robustness and sensitivity by developing a contrastive learning method that promotes a change in model prediction for certain perturbations and inhibits it for others. Note that we refer only to non-adversarial robustness in this paper, i.e., we make no effort to improve robustness to carefully designed adversarial perturbations (Goodfellow et al., 2014).

To develop algorithms that balance robustness and sensitivity, we first formalize two measures: utility and sensitivity. Utility refers to the change in the loss function when we perturb a specific input feature; in other words, it captures whether the feature is useful for the model's prediction. Sensitivity, on the other hand, is the change in the learned embedding representation (before computing the loss) when we perturb the same input feature. In contrast to classical feature selection approaches (Guyon & Elisseeff, 2003; Yu & Liu, 2004) that identify relevant and important features once for an entire dataset, our notions of sensitivity and utility are context dependent and change from one image to another. Our goal is to ensure that if an input feature has high utility, the model is sensitive to it, and if it has low utility, the model is not.
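The two measures above can be made concrete with a small sketch. The snippet below is illustrative only and is not the paper's implementation: a linear map stands in for the learned embedding, a squared error stands in for the loss, and `flip_feature` is a hypothetical perturbation that zeroes one input coordinate. Utility is measured as the change in the loss under the perturbation, and sensitivity as the change in the embedding before the loss is computed.

```python
import numpy as np

# Toy stand-ins (assumptions, not the paper's model): a linear "embedding"
# followed by a squared-error loss in place of a deep network.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))                 # hypothetical embedding weights
embed = lambda x: W @ x
loss_fn = lambda z, y: float(np.sum((z - y) ** 2))

def contextual_utility(x, y, perturb):
    """Utility: change in the loss when a specific input feature is perturbed."""
    return abs(loss_fn(embed(perturb(x)), y) - loss_fn(embed(x), y))

def contextual_sensitivity(x, perturb):
    """Sensitivity: change in the embedding (before the loss) under the
    same perturbation."""
    return float(np.linalg.norm(embed(perturb(x)) - embed(x)))

def flip_feature(i):
    """Hypothetical perturbation: zero out input feature i."""
    def p(x):
        x2 = x.copy()
        x2[i] = 0.0
        return x2
    return p

x = rng.normal(size=8)
y = embed(x) + 0.1                          # target near the clean embedding
u = contextual_utility(x, y, flip_feature(3))
s = contextual_sensitivity(x, flip_feature(3))
```

Note that both quantities depend on the particular input `x`, which is what makes them contextual: the same feature can score differently on different examples.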
To explore and illustrate the notions of utility and sensitivity, we introduce a synthetic MNIST dataset, shown in Figure 1. In the standard MNIST, the goal is to classify 10 digits based on their appearance. We modify it by adding a small random digit in the corner of some of the images and increasing the number of classes by five. For digits 5-9 the class label never changes, even in the presence of a corner digit, whereas digits 0-4 move to the extended class labels 10-14 in the presence of any corner digit. The small corner digits can therefore have high or low utility depending on the context. If the digit in the center is in 5-9, the corner digit has no bearing on the class and thus has low utility. However, if the digit in the center is in 0-4, the presence of a corner digit is essential to determining the label and thus has high utility. We would like to promote model sensitivity to the small corner digits when they are informative, in order to improve predictions, and demote it when they are not, in order to improve robustness.
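The labeling rule for this synthetic dataset can be sketched as follows. This is only a reading of the rule described above; details of the actual dataset construction (corner position, corner-digit size, which images receive a corner digit) are not specified here.

```python
def extended_label(center_digit: int, has_corner_digit: bool) -> int:
    """Label rule for the synthetic MNIST variant described above:
    center digits 0-4 move to classes 10-14 when a corner digit is
    present; center digits 5-9 keep their label regardless."""
    if has_corner_digit and center_digit <= 4:
        return center_digit + 10
    return center_digit
```

Under this rule the corner digit's identity never matters, only its presence, and only when the center digit is in 0-4, which is exactly the context dependence the utility measure is meant to capture.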

