FAIR MIXUP: FAIRNESS VIA INTERPOLATION

Abstract

Training classifiers under fairness constraints such as group fairness regularizes the disparity of predictions between groups. Nevertheless, even though the constraints are satisfied during training, they might not generalize at evaluation time. To improve the generalizability of fair classifiers, we propose fair mixup, a new data augmentation strategy for imposing the fairness constraint. In particular, we show that fairness can be achieved by regularizing the model on paths of interpolated samples between the groups. We use mixup, a powerful data augmentation strategy, to generate these interpolates. We analyze fair mixup and empirically show that it ensures better generalization for both accuracy and fairness measures on tabular, vision, and language benchmarks. The code is available at https://github.com/chingyaoc/fair-mixup.

1. INTRODUCTION

Fairness has received increasing attention in machine learning, with the aim of mitigating unjustified bias in learned models. Various statistical metrics have been proposed to measure the disparities of model outputs and performance when conditioned on sensitive attributes such as gender or race. Equipped with these metrics, one can formulate constrained optimization problems that impose fairness as a constraint. Nevertheless, these constraints do not necessarily generalize since they are data-dependent, i.e., they are estimated from finite samples. In particular, models that minimize the disparities on training sets do not necessarily achieve the same fairness metric on test sets (Cotter et al., 2019).

Conventionally, regularization is used to improve the generalization ability of a model (Zhang et al., 2016). On one hand, explicit regularization such as weight decay and dropout constrains the model capacity. On the other hand, implicit regularization such as data augmentation enlarges the support of the training distribution via prior knowledge (Hernández-García & König, 2018). In this work, we propose a data augmentation strategy for optimizing group fairness constraints such as demographic parity (DP) and equalized odds (EO) (Barocas et al., 2019). Given two sensitive groups such as male and female, instead of directly restricting the disparity, we propose to regularize the model on interpolated distributions between them. These augmented distributions form a path connecting the two sensitive groups. Figure 1 provides an illustrative example of the idea. The path simulates how the distribution transitions from one group to another via interpolation. Ideally, if the model is invariant to the sensitive attribute, the expected prediction of the model along the path should behave smoothly. Therefore, we propose a regularization that favors smooth transitions along the path, which provides a stronger prior on the model class.
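To make the group-fairness metric concrete, the demographic parity gap is simply the difference in mean model output between the two sensitive groups. A minimal sketch (the function name `dp_gap` is our own, not from the paper's code):

```python
import numpy as np

def dp_gap(preds, group):
    """Demographic parity gap: |E[f(x) | a=0] - E[f(x) | a=1]|.

    preds: model outputs in [0, 1]; group: binary sensitive attribute.
    This is the disparity a DP constraint drives toward zero.
    """
    preds, group = np.asarray(preds, float), np.asarray(group)
    return abs(preds[group == 0].mean() - preds[group == 1].mean())

# usage: predictions that differ systematically across groups
preds = np.array([0.9, 0.8, 0.2, 0.3])
group = np.array([0, 0, 1, 1])
gap = dp_gap(preds, group)  # ≈ 0.6, a large disparity
```

Estimating this gap on a finite training set is exactly what makes the constraint data-dependent and prone to poor generalization.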
We adopt mixup (Zhang et al., 2018b), a powerful data augmentation strategy, to construct the interpolated samples. Owing to mixup's simple form, the smoothness regularization we introduce has a closed-form expression that can be easily optimized. One disadvantage of mixup is that the interpolated samples might not lie on the natural data manifold. Verma et al. (2019) propose Manifold Mixup, which generates the mixup samples in a latent space. Previous works (Bojanowski et al., 2018; Berthelot et al., 2018) have shown that interpolations between a pair of latent features correspond to semantically meaningful, smooth interpolations in the input space. By constructing the path in the latent space, we can better capture the semantic changes while traveling between the sensitive groups, and hence obtain a better fairness regularizer that we coin fair mixup. Empirically, fair
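The path-smoothness idea above can be sketched as follows: interpolate batches from the two groups with mixup and penalize the finite-difference derivative of the mean prediction along the path. This is a simplified illustration under our own assumptions (a toy linear model and a discretized path; the helper names are hypothetical), not the paper's exact objective:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, w):
    # toy linear scorer: f(x) = sigmoid(x @ w)
    return 1.0 / (1.0 + np.exp(-x @ w))

def fair_mixup_penalty(x0, x1, w, ts=np.linspace(0.0, 1.0, 11)):
    """Smoothness of the mean prediction along the mixup path.

    x0, x1: equally sized batches from the two sensitive groups,
            paired arbitrarily as in standard mixup.
    Returns the mean squared finite difference of E[f(x_t)] over t,
    where x_t = (1 - t) * x0 + t * x1.
    """
    means = []
    for t in ts:
        xt = (1.0 - t) * x0 + t * x1      # mixup interpolation of the batches
        means.append(model(xt, w).mean())  # expected prediction at step t
    diffs = np.diff(means) / np.diff(ts)   # finite-difference d/dt E[f(x_t)]
    return float(np.mean(np.asarray(diffs) ** 2))

# usage: a model that ignores the input is perfectly smooth along the path
x0 = rng.normal(loc=0.0, size=(64, 5))  # samples from group 0
x1 = rng.normal(loc=1.0, size=(64, 5))  # samples from group 1
w = rng.normal(size=5)
penalty = fair_mixup_penalty(x0, x1, w)          # nonzero for a generic model
flat = fair_mixup_penalty(x0, x1, np.zeros(5))   # constant output 0.5 -> 0
```

In training, a penalty of this form would be added to the task loss, so that gradient descent trades prediction accuracy against smooth (and hence group-insensitive) behavior along the interpolation path.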

