THE RISKS OF INVARIANT RISK MINIMIZATION

Abstract

Invariant Causal Prediction (Peters et al., 2016) is a technique for out-of-distribution generalization which assumes that some aspects of the data distribution vary across the training set but that the underlying causal mechanisms remain constant. Recently, Arjovsky et al. (2019) proposed Invariant Risk Minimization (IRM), an objective based on this idea for learning deep, invariant features of data which are a complex function of latent variables; many alternatives have subsequently been suggested. However, formal guarantees for all of these works are severely lacking. In this paper, we present the first analysis of classification under the IRM objective, as well as these recently proposed alternatives, under a fairly natural and general model. In the linear case, we give simple conditions under which the optimal solution succeeds or, more often, fails to recover the optimal invariant predictor. We furthermore present the very first results in the non-linear regime: we demonstrate that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution, which is precisely the issue it was intended to solve. Thus, in this setting we find that IRM and its alternatives fundamentally do not improve over standard Empirical Risk Minimization.

1. INTRODUCTION

Prediction algorithms are evaluated by their performance on unseen test data. In classical machine learning, it is common to assume that such data are drawn i.i.d. from the same distribution as the data set on which the learning algorithm was trained; in the real world, however, this is often not the case. When this discrepancy occurs, algorithms with strong in-distribution generalization guarantees, such as Empirical Risk Minimization (ERM), can fail catastrophically. In particular, while deep neural networks achieve superhuman performance on many tasks, there is evidence that they rely on statistically informative but non-causal features in the data (Beery et al., 2018; Geirhos et al., 2018; Ilyas et al., 2019). As a result, such models are prone to errors under surprisingly minor distribution shift (Su et al., 2019; Recht et al., 2019).

To address this, researchers have investigated alternative objectives for training predictors which are robust to possibly egregious shifts in the test distribution. The task of generalizing under such shifts, known as Out-of-Distribution (OOD) Generalization, has led to many separate threads of research. One approach is Bayesian deep learning, accounting for a classifier's uncertainty at test time (Neal, 2012). Another technique that has shown promise is data augmentation: this includes both automated data modifications which help prevent overfitting (Shorten & Khoshgoftaar, 2019) and specific counterfactual augmentations to ensure invariance in the resulting features (Volpi et al., 2018; Kaushik et al., 2020).

A strategy which has recently gained particular traction is Invariant Causal Prediction (ICP; Peters et al. 2016), which views the task of OOD generalization through the lens of causality. This framework assumes that the data are generated according to a Structural Equation Model (SEM; Bollen 2005), which consists of a set of so-called mechanisms or structural equations that specify variables given their parents.
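To make this setup concrete, the following is a minimal, hypothetical sketch (not taken from the paper) of a linear SEM with two environments. The mechanism generating the target y from its causal parent x_c is held fixed, while the mechanism generating an anti-causal feature x_s from y varies with an environment-specific noise scale sigma_s; all names and coefficients here are illustrative assumptions.

```python
# Toy linear SEM across environments (illustrative assumptions only):
#   x_c ~ N(0, 1)                      causal parent of y
#   y   = 2 * x_c + N(0, 1)           invariant mechanism
#   x_s = y + N(0, sigma_s^2)         environment-dependent mechanism
import numpy as np

rng = np.random.default_rng(0)

def sample_env(sigma_s, n=100_000):
    """Draw one environment; only the x_s mechanism depends on sigma_s."""
    x_c = rng.normal(0.0, 1.0, n)              # causal feature
    y = 2.0 * x_c + rng.normal(0.0, 1.0, n)    # invariant mechanism for y
    x_s = y + rng.normal(0.0, sigma_s, n)      # spurious, varies by env
    return x_c, x_s, y

def ols_slope(x, y):
    """Least-squares slope of y regressed on a single feature x."""
    return float(np.dot(x, y) / np.dot(x, x))

for sigma_s in (0.1, 3.0):
    x_c, x_s, y = sample_env(sigma_s)
    print(f"sigma_s={sigma_s}: slope on x_c = {ols_slope(x_c, y):.2f}, "
          f"slope on x_s = {ols_slope(x_s, y):.2f}")
```

Regressing y on the causal feature recovers roughly the same coefficient in both environments, while the coefficient on the spurious feature changes with sigma_s. This is the kind of cross-environment instability that ICP-style methods exploit to identify the invariant mechanism.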
ICP further assumes that the data can be partitioned into environments, where each environment corresponds to interventions on the SEM (Pearl, 2009), but where the mechanism by which the target variable is generated from its direct parents is unaffected. Thus, the causal mechanism of the target variable is unchanging, but other aspects of the distribution can vary broadly. As a result, learning mechanisms that are the same across environments ensures recovery of the invariant features which generalize under arbitrary interventions. In this work, we consider objectives that attempt to

