AGREE TO DISAGREE: DIVERSITY THROUGH DIS-AGREEMENT FOR BETTER TRANSFERABILITY

Abstract

Gradient-based learning algorithms have an implicit simplicity bias which in effect can limit the diversity of predictors sampled by the learning procedure. This behavior can hinder the transferability of trained models by (i) favoring the learning of simpler but spurious features, present in the training data but absent from the test data, and (ii) leveraging only a small subset of predictive features. This effect is especially magnified when the test distribution does not exactly match the train distribution, referred to as the Out-of-Distribution (OOD) generalization problem. However, given only the training data, it is not always possible to assess a priori whether a given feature is spurious or transferable. Instead, we advocate for learning an ensemble of models which capture a diverse set of predictive features. Towards this goal, we propose a new algorithm, D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data, but disagreement on the OOD data. We show how D-BAT naturally emerges from the notion of generalized discrepancy, and demonstrate in multiple experiments how the proposed method can mitigate shortcut learning, enhance uncertainty and OOD detection, and improve transferability.

1. INTRODUCTION

While gradient-based learning algorithms such as Stochastic Gradient Descent (SGD) are nowadays ubiquitous in the training of Deep Neural Networks (DNNs), it is well known that the resulting models (i) are brittle when exposed to small distribution shifts (Beery et al., 2018; Sun et al., 2016; Amodei et al., 2016), (ii) can easily be fooled by small adversarial perturbations (Szegedy et al., 2014), (iii) tend to pick up spurious correlations (McCoy et al., 2019; Oakden-Rayner et al., 2020; Geirhos et al., 2020), present in the training data but absent from the downstream task, and (iv) fail to provide adequate uncertainty estimates (Kim et al., 2016; van Amersfoort et al., 2020; Liu et al., 2021b). Recently, these learning algorithms have been investigated for their implicit bias toward simplicity, known as the Simplicity Bias (SB), which is seen as one of the reasons behind their strong generalization properties (Arpit et al., 2017; Dziugaite & Roy, 2017). While for deep neural networks simpler decision boundaries are often seen as less likely to overfit, Shah et al. (2020) and Pezeshki et al. (2021) demonstrated that the SB can still cause the aforementioned issues. In particular, they show how the SB can be extreme, compelling predictors to rely only on the simplest feature available, despite the presence of equally or even more predictive complex features. This effect is greatly amplified in the more realistic out-of-distribution (OOD) setting (Ben-Tal et al., 2009), in which the source and target distributions are different, a setting known to be challenging (Sagawa et al., 2020; Krueger et al., 2021). The difference between the two domains can be categorized either as a distribution shift, e.g. a lack of samples in certain parts of the data manifold due to limitations of the data collection pipeline, or as the two distributions being completely different.
In the first case, the SB in its extreme form increases the chances of learning to rely on spurious features, i.e. shortcuts that do not generalize to the target distribution. Classic manifestations of this in vision applications are models that rely mostly on textures or backgrounds instead of more complex, and likely more generalizable, semantic features such as shapes (Beery et al., 2018; Ilyas et al., 2019; Geirhos et al., 2020). In the second case, by relying only on the simplest feature and being invariant to more complex ones, the SB causes confident predictions (low uncertainty) on completely OOD samples, even when complex features contradict the simpler ones. This brings us to our goal of deriving a method which (i) learns more transferable features, better suited to generalize despite distribution shifts, and (ii) provides accurate uncertainty estimates also on OOD samples. We aim to achieve these two objectives by learning an ensemble of diverse predictors (h_1, . . . , h_K), with h : X → Y, and K being the ensemble size. Suppose that our training data is drawn from the distribution D, and that D_ood is the distribution of OOD data on which we will be tested. Importantly, D and D_ood may have non-overlapping support, and D_ood is not known during training. Our proposed method, D-BAT (Diversity-By-disAgreement Training), relies on the following idea: diverse hypotheses should agree on the source distribution D while disagreeing on the OOD distribution D_ood. Intuitively, a set of hypotheses should agree on what is known, i.e. on D, while formulating different interpretations of what is not known, i.e. on D_ood. Even if each individual predictor might be wrongly confident on OOD samples, because the predictors predict different outcomes, the resulting uncertainty of the ensemble on those samples is increased. Moreover, disagreement on D_ood can itself be enough to promote learning diverse representations of instances of D.
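One way to make the disagreement idea concrete is a penalty on OOD inputs that is small when two binary classifiers predict opposite classes and large when they agree. The following is a minimal sketch of such a term for a pair of probabilistic binary predictors; the exact regularizer used by D-BAT and its weighting against the standard training loss may differ, and the function name is ours.

```python
import numpy as np

def disagreement_term(p1, p2, eps=1e-12):
    """Penalty encouraging two binary classifiers to disagree on OOD inputs.

    p1, p2: arrays of predicted probabilities of class 1 on OOD samples.
    The term -log( p1*(1-p2) + (1-p1)*p2 ) is small when one model predicts
    class 1 and the other class 0, and large when both predict the same class.
    A small eps guards against log(0).
    """
    return float(-np.log(p1 * (1.0 - p2) + (1.0 - p1) * p2 + eps).mean())

# On an OOD sample, disagreeing predictions incur a lower penalty ...
loss_disagree = disagreement_term(np.array([0.9]), np.array([0.1]))
# ... than confident agreement:
loss_agree = disagreement_term(np.array([0.9]), np.array([0.9]))
```

In a full training loop, this term would be added (on unlabeled OOD samples) to the usual supervised loss on D, so that the second model keeps fitting the training data while being pushed toward a different hypothesis.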
In the context of object detection, if one model h_1 relies only on textures, it will generate predictions on D_ood based on textures. When enforcing disagreement on D_ood, a second model h_2 is therefore discouraged from using textures in order to disagree with h_1, and consequently looks for a different hypothesis to classify instances of D, e.g. using shapes. This process is illustrated in Fig. 2. A direct application of our algorithm to a 2D example can be seen in Fig. 1. Once trained, the ensemble can be used either by forming a weighted average of the probability distributions of the individual hypotheses, or, if some labeled data from the downstream task is available, by selecting one particular hypothesis.

Contributions. Our results can be summarized as follows:
• We introduce D-BAT, a simple yet efficient novel diversity-inducing regularizer which enables training ensembles of diverse predictors.
• We prove, in a simplified setting, that D-BAT promotes diversity, encouraging the models to utilize different predictive features.
• We show on several datasets of varying complexity how the induced diversity can help to (i) tackle shortcut learning, and (ii) improve uncertainty estimation and transferability.
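The weighted-average inference mode described above can be sketched in a few lines. The snippet below is an illustrative implementation, not taken from the paper: it averages per-model class-probability vectors (uniform weights by default) and shows how two confidently disagreeing hypotheses yield a near-uniform, high-entropy ensemble prediction on an OOD input.

```python
import numpy as np

def ensemble_predict(probs_list, weights=None):
    """Weighted average of per-model class-probability vectors.

    probs_list: list of K arrays, each of shape (n_classes,) summing to 1.
    weights: optional array of K non-negative weights summing to 1
             (uniform by default).
    """
    probs = np.stack(probs_list)  # shape (K, n_classes)
    if weights is None:
        weights = np.full(len(probs_list), 1.0 / len(probs_list))
    return np.tensordot(weights, probs, axes=1)

def entropy(p, eps=1e-12):
    """Shannon entropy (nats) of a probability vector, as an uncertainty proxy."""
    return float(-(p * np.log(p + eps)).sum())

# Two diverse hypotheses that disagree confidently on an OOD sample:
h1 = np.array([0.95, 0.05])
h2 = np.array([0.05, 0.95])
p_ens = ensemble_predict([h1, h2])  # close to [0.5, 0.5]
```

Each individual model is overconfident, but their average is nearly flat, so the ensemble's entropy is high exactly where the hypotheses disagree.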



Figure 1: Example of applying D-BAT to a simple 2D toy example similar to the LMS-5 dataset introduced by Shah et al. (2020). The two classes, red and blue, can easily be separated by a vertical decision boundary. Other ways to separate the two classes, with horizontal lines for instance, are more complex, i.e. they require more hyperplanes. The simplicity bias pushes models to systematically learn the simpler feature, as in the second column (b). Using D-BAT, we are able to learn the model in column (c), which relies on a more complex decision boundary, effectively overcoming the simplicity bias. The ensemble h_ens(x) = h_1(x) + h_2(x), in column (d), outputs a flat distribution at points where the two models disagree, effectively maximizing the uncertainty at those points. In this experiment, the samples from D_ood were obtained by computing adversarial perturbations; see App. D.2 for more details.

