OVERPARAMETERISATION AND WORST-CASE GENERALISATION: FRIEND OR FOE?

Abstract

Overparameterised neural networks have demonstrated the remarkable ability to perfectly fit training samples, while still generalising to unseen test samples. However, several recent works have revealed that such models' good average performance does not always translate to good worst-case performance: in particular, they may perform poorly on subgroups that are under-represented in the training set. In this paper, we show that in certain settings, overparameterised models' performance on under-represented subgroups may be improved via post-hoc processing. Specifically, such models' bias can be largely restricted to their classification layers, and manifests as structured prediction shifts for rare subgroups. We detail two post-hoc correction techniques to mitigate this bias, which operate purely on the outputs of standard model training. We empirically verify that with such post-hoc correction, overparameterisation can improve average and worst-case performance.

1. INTRODUCTION

Overparameterised neural networks have demonstrated the remarkable ability to perfectly fit training samples, while still generalising to unseen test samples (Zhang et al., 2017; Neyshabur et al., 2019; Nakkiran et al., 2020). However, several recent works have revealed that overparameterised models' good average performance does not translate to good worst-case performance (Buolamwini & Gebru, 2018; Hashimoto et al., 2018; Sagawa et al., 2020a;b). In particular, the test performance of such models may be poor on certain subgroups that are under-represented in the training data. Worse still, such degradation can be exacerbated as model complexity increases. This indicates the unsuitability of such models in ensuring fairness across subgroups, a topical concern given the growing societal uses of machine learning (Dwork et al., 2012; Hardt et al., 2016; Buolamwini & Gebru, 2018).

Why does overparameterisation induce such unfavourable bias, and how can one correct for it? Sagawa et al. (2020a) demonstrated how such models may fit to spurious correlations that explain under-represented samples, which can generalise poorly. Sagawa et al. (2020b) further posited that overparameterised models have an inductive bias towards memorising labels for as few samples as possible, which are invariably those from under-represented subgroups. To mitigate such bias, existing approaches include subsampling majority subgroups (Sagawa et al., 2020b), and modifying the training objective (Sagawa et al., 2020a; Nam et al., 2020; Zhang et al., 2020; Goel et al., 2020). This suggests two important points regarding overparameterised models' performance: (a) with standard training, increasing model complexity exacerbates degradation on rare subgroups; (b) controlling this degradation may require alternate training objectives or procedures.

In this paper, we establish that while overparameterised models are biased against under-represented examples, in certain settings, such bias may be easily corrected via post-hoc processing of the model outputs.
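To make the subgroup structure discussed above concrete, the following is a minimal sketch of a synthetic dataset in the spirit of Sagawa et al. (2020b): a spurious attribute a(x) agrees with the label y for most samples, so models that rely on it fail on the rare subgroups where a(x) = -y. The generator and its parameters (the agreement rate, feature scales, and dimensionality) are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def make_subgroup_data(n=1000, d=20, p_agree=0.95, seed=0):
    """Toy data with label y in {-1, +1} and attribute a in {-1, +1}.

    For a fraction p_agree of samples, a = y (the two dominant subgroups);
    for the rest, a = -y (the two rare subgroups). Coordinate 0 of x is a
    "core" feature driven by y, coordinate 1 a "spurious" feature driven by
    a; the remaining coordinates are pure noise.
    """
    rng = np.random.default_rng(seed)
    y = rng.choice([-1, 1], size=n)
    agree = rng.random(n) < p_agree
    a = np.where(agree, y, -y)
    x = rng.normal(size=(n, d))
    x[:, 0] += 2.0 * y  # core feature: truly predictive of the label
    x[:, 1] += 2.0 * a  # spurious feature: only reliable on dominant subgroups
    return x, y, a
```

A model that weights coordinate 1 heavily will classify the dominant subgroups well while systematically erring on the rare ones, which is the failure mode studied in this paper.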
Specifically, such models' bias can be largely restricted to their classification layers, and manifests as structured shifts in predictions for rare subgroups. We thus show how two simple techniques applied to the model outputs (classifier retraining based on the learned representations, and correction of the classification threshold) can help overparameterised models improve worst-subgroup performance over underparameterised counterparts. Consequently, even with standard training, overparameterised models can learn sufficient information to model rare subgroups.

To make the above concrete, Figure 1 plots a histogram of model predictions for a synthetic dataset from Sagawa et al. (2020b) (cf. §2). The data comprises four subgroups generated from combinations (y, a(x)) of labels y ∈ {±1} and a feature a(x) ∈ {±1}. Most samples (x, y) have y = a(x), and so these comprise two dominant subgroups within the positive and negative samples. We train an overparameterised linear model, yielding logits f_{±1}(x). We then plot the decision scores f_{+1}(x) − f_{−1}(x), which are expected to be > 0 iff y = +1. Strikingly, there is a distinct separation amongst the subgroup scores: e.g., samples with y = +1, a(x) = −1 have systematically lower scores than those with y = +1, a(x) = +1. Consequently, the model incurs a significant error rate on rare subgroups. The structured nature of this separation suggests post-hoc shifting the scores to align the distributions; this markedly improves performance on the rare subgroups (Figure 1b).

Scope and contributions. The primary aim of this work is furthering the understanding of the behaviour of overparameterised models, rather than proposing new techniques. Indeed, the post-hoc correction techniques we employ have been well-studied in the related problem setting of long-tail learning, or learning under class imbalance (He & Garcia, 2009; Buda et al., 2017; Van Horn & Perona, 2017).
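One way to realise the score-translation idea is to choose a separate decision threshold for each attribute value on held-out data; shifting the threshold of a group is equivalent to translating that group's scores before thresholding at zero. The sketch below is our illustration rather than the paper's exact procedure, and it assumes labels and attributes are observed on a held-out set (and the attribute a(x) at test time).

```python
import numpy as np

def per_attribute_thresholds(scores, y, a, num_grid=201):
    """For each attribute value, pick the decision threshold minimising
    balanced error on held-out data.

    scores : real-valued decision scores, expected > threshold iff y = +1
    y      : labels in {-1, +1}; a : observed attribute values
    """
    grid = np.linspace(scores.min(), scores.max(), num_grid)
    thresholds = {}
    for av in np.unique(a):
        m = a == av
        best_t, best_err = 0.0, np.inf
        for t in grid:
            pred = np.where(scores[m] > t, 1, -1)
            # balanced error: mean of the per-class error rates
            per_class = [np.mean(pred[y[m] == c] != c)
                         for c in (-1, 1) if np.any(y[m] == c)]
            err = float(np.mean(per_class))
            if err < best_err:
                best_t, best_err = t, err
        thresholds[av] = float(best_t)
    return thresholds
```

At test time one predicts sign(f(x) − t_{a(x)}), which requires only the attribute of a sample, not its label; groups whose score distribution sits systematically lower simply receive a lower threshold.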
Several works have demonstrated that the representations learned by standard networks contain sufficient information to distinguish between dominant and rare labels (Liu et al., 2019; Zhang et al., 2019; Kang et al., 2020; Menon et al., 2020). Similar techniques are also common in the fairness literature (Hardt et al., 2016; Chzhen et al., 2019). However, it is not a-priori clear whether such techniques are effective for overparameterised models, whose ability to perfectly fit the training labels can thwart otherwise effective approaches (Sagawa et al., 2020a).

Existing techniques for improving the worst-subgroup error of overparameterised models involve altering the inputs to the model (Sagawa et al., 2020b), or the training objective (Sagawa et al., 2020a). By contrast, the techniques we study alter the outputs of a standard network, trained to minimise the softmax cross-entropy on the entire data. Our findings illustrate that such models do not necessarily require bespoke training modifications to perform well on rare subgroups: even with standard training, overparameterised models can (in certain settings) learn useful information about rare subgroups.

In summary, our contributions are:

(i) we demonstrate that, in certain settings, overparameterised models' poor performance on under-represented subgroups is the result of a structured bias in the classification layer (cf. §3);

(ii) we show that two simple post-hoc correction procedures (cf. §4) can mitigate the above bias, and thus significantly reduce their worst-subgroup error (cf. §5).
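The other post-hoc technique, classifier retraining on learned representations, can be sketched as refitting only the final linear layer on frozen features, with samples reweighted inversely to subgroup frequency so that rare subgroups carry equal weight in the loss. The inverse-frequency reweighting and the plain gradient-descent optimiser below are our simplifying assumptions for illustration; the works cited above may differ in these details.

```python
import numpy as np

def retrain_classifier(feats, y, groups, steps=500, lr=0.1):
    """Refit a linear classifier on frozen features under group reweighting.

    feats  : frozen representations, shape (n, d)
    y      : labels in {-1, +1}; groups : subgroup identifier per sample
    Optimises a group-reweighted logistic loss by gradient descent.
    """
    n, d = feats.shape
    values, counts = np.unique(groups, return_counts=True)
    freq = dict(zip(values, counts))
    # inverse-frequency weights; they average to 1 over the sample
    wts = np.array([n / (len(values) * freq[g]) for g in groups])
    w = np.zeros(d)
    for _ in range(steps):
        margins = np.clip(y * (feats @ w), -30, 30)  # clip to avoid overflow
        sig = 1.0 / (1.0 + np.exp(margins))          # sigmoid(-margin)
        grad = -(feats * (wts * y * sig)[:, None]).mean(axis=0)
        w -= lr * grad
    return w
```

Because only a d-dimensional linear head is refit, this correction is cheap relative to retraining the network, and it leaves the learned representations untouched.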

2. BACKGROUND AND SETTING

Suppose we have a labelled training sample S = {(x_i, y_i)}_{i=1}^n ∈ (X × Y)^n, for instance space X ⊂ R^d and label space Y. One typically assumes S is an i.i.d. draw from some unknown distribution P(x, y). Further, suppose each (x, y) has an associated group membership g(x, y) ∈ G, with


(Figure 1, right panel title: "Effect of post-hoc correction.")

Figure 1: Distribution of model decision scores on test samples from a synthetic dataset of Sagawa et al. (2020b), comprising two labels with two subgroups each (left panel). The scores are expected to be > 0 iff the label is positive. We train a linear model with complexity controlled by its number of features m. For the overparameterised setting m = 10^4 (left), within each class, rare subgroups consistently appear on the wrong side of the decision boundary. Correcting this bias via post-hoc score translation improves the worst-subgroup error as the model complexity is increased (right panel).
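The complexity dial in this experiment can be emulated with a random-features linear model: lift x into m random ReLU features and fit a near-minimum-norm least-squares classifier, so that increasing m moves the model from under- to overparameterised. This is a hedged reconstruction of such a setup under our own assumptions (the feature map, ridge parameter, and the dual-form solve), not the paper's exact model.

```python
import numpy as np

def random_feature_scores(x_train, y_train, x_test, m=2000, l2=1e-6, seed=0):
    """Decision scores of a ridge fit on m random ReLU features.

    With m much larger than the number of samples n and a tiny ridge l2,
    the fit approximates the minimum-norm interpolant of the labels.
    """
    rng = np.random.default_rng(seed)
    proj = rng.normal(size=(x_train.shape[1], m)) / np.sqrt(m)
    phi_tr = np.maximum(x_train @ proj, 0.0)  # random ReLU features
    phi_te = np.maximum(x_test @ proj, 0.0)
    # dual (kernel) form of ridge regression: w = Phi^T (K + l2 I)^{-1} y,
    # which needs only an n x n solve and so is efficient when m >> n
    n = x_train.shape[0]
    alpha = np.linalg.solve(phi_tr @ phi_tr.T + l2 * np.eye(n), y_train.astype(float))
    return phi_te @ (phi_tr.T @ alpha)
```

Sweeping m in such a model lets one reproduce, qualitatively, the effect of model complexity on subgroup score distributions that the figure depicts.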

