EXCHANGING LESSONS BETWEEN ALGORITHMIC FAIRNESS AND DOMAIN GENERALIZATION

Anonymous

Abstract

Standard learning approaches are designed to perform well on average for the data distribution available at training time. Developing learning approaches that are not overly sensitive to the training distribution is central to research on domain or out-of-distribution generalization, robust optimization, and fairness. In this work we focus on links between research on domain generalization and algorithmic fairness, where performance under a distinct but related test distribution is studied, and show how the two fields can be mutually beneficial. While domain generalization methods typically rely on knowledge of disjoint "domains" or "environments", "sensitive" label information indicating which demographic groups are at risk of discrimination is often used in the fairness literature. Drawing inspiration from recent fairness approaches that improve worst-case performance without knowledge of sensitive groups, we propose a novel domain generalization method that handles the more realistic scenario where environment partitions are not provided. We then show theoretically and empirically how different partitioning schemes can lead to increased or decreased generalization performance, enabling us to outperform Invariant Risk Minimization (IRM) with handcrafted environments in multiple cases. We also show how a re-interpretation of IRMv1 allows us, for the first time, to directly optimize a common fairness criterion, group sufficiency, and thereby improve performance on a fair prediction task.

1. INTRODUCTION

Machine learning achieves super-human performance on many tasks when the test data is drawn from the same distribution as the training data. However, when the two distributions differ, model performance can degrade severely, even to below-chance predictions (Geirhos et al., 2020). Tiny perturbations can derail classifiers, as shown by adversarial examples (Szegedy et al., 2014) and common corruptions in image classification (Hendrycks & Dietterich, 2019). Even new test sets collected from the same data acquisition pipeline induce distribution shifts that significantly harm performance (Recht et al., 2019; Engstrom et al., 2020). Many approaches have been proposed to overcome model brittleness in the face of input distribution changes. Robust optimization aims to achieve good performance on any distribution close to the training distribution (Goodfellow et al., 2015; Duchi et al., 2016; Madry et al., 2018). Domain generalization, on the other hand, tries to go one step further: to generalize to distributions potentially far from the training distribution. The field of algorithmic fairness, meanwhile, primarily focuses on developing metrics to track and mitigate performance differences between sub-populations or across similar individuals (Dwork et al., 2012; Corbett-Davies & Goel, 2018; Chouldechova & Roth, 2018). As in domain generalization, evaluation on data related to but distinct from the training set is needed to characterize model failure. These evaluations are curated through the design of audits, which play a central role in revealing unfair algorithmic decision making (Buolamwini & Gebru, 2018; Obermeyer et al., 2019). While the ultimate goals of domain generalization and algorithmic fairness are closely aligned, little research has focused on their similarities and on how they can inform each other constructively.
One of their main common goals can be characterized as: learning algorithms robust to changes across domains or population groups. The following contributions show how lessons can be exchanged between the two fields:


• We draw several connections between the goals of domain generalization and those of algorithmic fairness, suggesting fruitful research directions in both fields (Section 2).
• Drawing inspiration from recent methods on inferring worst-case sensitive groups from data, we propose a novel domain generalization algorithm, Environment Inference for Invariant Learning (EIIL), for cases where training data does not include environment partition labels (Section 3). Our method outperforms IRM on the domain generalization benchmark ColorMNIST without access to environment labels (Section 4).
• We also show that IRM, originally developed for domain generalization tasks, affords a differentiable regularizer for the fairness notion of group sufficiency, which was previously hard to optimize for non-convex losses. On a variant of the UCI Adult dataset where confounding bias is introduced, we leverage this insight with our method EIIL to improve group sufficiency without knowledge of sensitive groups, ultimately improving generalization performance for large distribution shifts compared with a baseline robust optimization method (Section 4).
• We characterize both theoretically and empirically the limitations of our proposed method, concluding that while EIIL can correct a baseline ERM solution that uses a spurious feature or "shortcut" for prediction, it is not suitable for all settings (Sections 3 and 4).

2. DOMAIN GENERALIZATION AND ALGORITHMIC FAIRNESS

Here we lay out some connections between the two fields. Table 2 provides a high-level comparison of the objectives and assumptions of several relevant methods. Loosely speaking, recent approaches from both areas share the goal of matching some chosen statistic across a conditioning variable e, which represents sensitive group membership in algorithmic fairness or an environment/domain indicator in domain generalization. The statistic in question informs the learning objective for the resulting model, and is motivated differently in each case. In domain generalization, learning is informed by the properties of the test distribution where good generalization should be achieved. In algorithmic fairness, the choice of statistic is motivated by a context-specific fairness notion that encourages a solution achieving "fair" outcomes (Chouldechova & Roth, 2018). Empty spaces in Table 2 suggest areas for future work, and bold-faced entries indicate connections we establish in this paper.
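As a concrete example of such a statistic, the group-sufficiency criterion discussed later asks that E[Y | f(X), e] be equal across groups/environments e. The sketch below (our own illustration, not code from this paper; the helper name and the simple score-binning scheme are assumptions) estimates a binned sufficiency gap between two groups:

```python
import numpy as np

def sufficiency_gap(scores, labels, groups, n_bins=10):
    """Illustrative group-sufficiency gap: average, over score bins, of the
    absolute difference in E[Y | score bin, group] between two groups.
    A perfectly group-sufficient predictor has a gap of zero."""
    bins = np.clip((scores * n_bins).astype(int), 0, n_bins - 1)
    gaps = []
    for b in range(n_bins):
        rates = []
        for g in (0, 1):
            mask = (bins == b) & (groups == g)
            if mask.any():
                rates.append(labels[mask].mean())
        if len(rates) == 2:  # both groups represented in this bin
            gaps.append(abs(rates[0] - rates[1]))
    return float(np.mean(gaps)) if gaps else 0.0
```

A predictor whose scores are calibrated within each group drives this gap toward zero, while scores that systematically understate the positive rate for one group inflate it, which is the kind of per-group statistic mismatch the methods in Table 2 regularize against.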



[Figure: Results on CMNIST, a digit classification task where color is a spurious feature correlated with the label during training but anti-correlated at test time. Our method, Environment Inference for Invariant Learning (EIIL), taking inspiration from recent themes in the fairness literature, augments IRM to improve test-set performance without knowledge of pre-specified environment labels, instead finding worst-case environments using aggregated data and a reference classifier.]

Achieving this not only allows models to generalize to different, unobserved but related distributions; it also mitigates unequal treatment of individuals solely based on group membership. In this work we explore independently developed concepts from the domain generalization and fairness literatures and exchange lessons between them to motivate new methodology for both fields. Inspired by fairness approaches for unknown group memberships (Kim et al., 2019; Hébert-Johnson et al., 2018; Lahoti et al., 2020), we develop a new domain generalization method that does not require domain identifiers and yet can outperform manual specification of domains (Table 1). Leveraging domain generalization insights in a fairness context, we show that the regularizer from IRMv1 (Arjovsky et al., 2019) optimizes a fairness criterion termed "group sufficiency", which for the first time enables us to explicitly optimize this criterion for non-convex losses in fair classification.
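The environment-inference idea of finding worst-case environments from aggregated data and a reference classifier can be sketched as an inner maximization of the IRMv1 penalty over soft environment assignments. The following is a minimal numpy illustration of that idea under our own simplifying assumptions (a binary logistic setting with a fixed scalar reference classifier, so the IRMv1 gradient has a closed form; function names, hyperparameters, and the plain gradient-ascent loop are our choices, not the paper's implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def irmv1_penalty(logits, labels, weights):
    """IRMv1-style penalty for one soft environment: squared gradient of the
    weighted logistic risk w.r.t. a scalar dummy scale, evaluated at 1.0.
    labels are in {+1, -1}; weights are soft membership probabilities."""
    # d/dw softplus(-y * w * z) at w = 1 is -y * z * sigmoid(-y * z)
    g = -labels * logits * sigmoid(-labels * logits)
    return (np.sum(weights * g) / (np.sum(weights) + 1e-12)) ** 2

def infer_environments(logits, labels, steps=2000, lr=100.0, seed=0):
    """Sketch of EIIL-style environment inference: gradient ascent on soft
    two-environment assignments q to maximize the total IRMv1 penalty
    incurred by a fixed reference classifier's logits."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=len(logits))                  # log-odds of env-1 membership
    g = -labels * logits * sigmoid(-labels * logits)  # per-example risk gradient
    for _ in range(steps):
        q = sigmoid(w)
        s1, s2 = q.sum() + 1e-12, (1.0 - q).sum() + 1e-12
        a = np.sum(q * g) / s1                # env-1 mean risk gradient
        b = np.sum((1.0 - q) * g) / s2        # env-2 mean risk gradient
        # d(a^2 + b^2)/dq_j, chained through q = sigmoid(w)
        dq = 2 * a * (g - a) / s1 - 2 * b * (g - b) / s2
        w = np.clip(w + lr * dq * q * (1.0 - q), -10, 10)  # ascent step
    return sigmoid(w)
```

On a toy dataset where a reference classifier relies on a spurious feature that agrees with the label for most examples, this maximization separates the majority (feature agrees) from the minority (feature disagrees), recovering a partition under which the reference classifier's risk gradients point in opposite directions, exactly the kind of worst-case split that makes the invariance penalty bite.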

