LONG TERM FAIRNESS VIA PERFORMATIVE DISTRIBUTIONALLY ROBUST OPTIMIZATION

Abstract

Fairness researchers in machine learning (ML) have coalesced around several fairness criteria which provide formal definitions of what it means for an ML model to be fair. However, these criteria have some serious limitations. We identify four key shortcomings of these formal fairness criteria and address them by extending performative prediction to include a distributionally robust objective. Performative prediction is a recent framework developed to understand settings in which deploying a model influences the distribution on which it makes predictions. We prove a convergence result for our proposed repeated distributionally robust optimization (RDRO). We further verify our results empirically and develop experiments to demonstrate the impact of using RDRO on learning fair ML models.

1. INTRODUCTION

In the past two decades, machine learning (ML) has moved from the confines of research institutes and university laboratories to become a core element of the global economy. ML models are now deployed at enormous scales in complex environments, often making high-stakes decisions. Too often, however, this is done without adequate concern for the fairness and robustness of these ML models. Fairness in ML is a burgeoning research area, but much of the work in fairness, particularly in defining formal fairness criteria, has been limited to the static classification setting. These formal fairness criteria assume a sensitive characteristic or protected demographic group for whom we want to ensure our model is non-discriminatory. The fairness criteria are then properties of the joint distribution of this characteristic, the output of the classifier, and the true labels of the data. While these fairness definitions have been a useful starting point in the consideration of discrimination by ML models, they have several limitations:

1. They are not equivalent and, in most scenarios, they are mutually incompatible.
2. They apply only to static supervised learning problems and ignore the dynamic environments characteristic of many real-world scenarios with fairness concerns.
3. They rely on having access to demographic information. The definitions can only be used if one has access to the sensitive characteristic, which is often not the case.
4. They ignore intersectionality. The criteria do not account for individuals who may sit at the intersection of several sensitive demographic groups.

Fairness is fundamentally a philosophical and political question, and the notion of a single, universal formal definition of fairness for ML is likely naïve. For this reason, this work does not attempt to formally define fairness and largely sets aside the first problem noted above.
We do, however, attempt to address issues 2, 3, and 4 by drawing upon two recent areas of research with implications for fairness in ML: performative prediction and distributionally robust optimization (DRO). DRO offers a compelling and flexible method for training non-discriminatory algorithms without needing access to demographic information. Performative prediction, on the other hand, outlines a theoretical framework through which we can reason about ML models in dynamic environments, where the act of deploying a model influences the distribution on which it makes decisions.
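To make the combination concrete, the following is a minimal toy sketch, not the paper's exact formulation: the names `shifted_data`, `cvar_loss`, and `repeated_dro`, the location-shift map `D(theta)` with sensitivity 0.5, and the choice of a CVaR-style objective as the DRO surrogate are all illustrative assumptions. In each round, a model is fit to a distributionally robust objective on data drawn from the distribution induced by the previously deployed model, the "repeated DRO" pattern.

```python
import random

random.seed(0)

def shifted_data(theta, n=1000):
    # Hypothetical performative response D(theta): deploying parameter theta
    # shifts the mean of the data distribution to 0.5 * theta (a toy
    # location-shift map with sensitivity 0.5).
    return [random.gauss(0.5 * theta, 1.0) for _ in range(n)]

def cvar_loss(theta, xs, alpha=0.2):
    # One concrete DRO objective: the average of the worst alpha-fraction of
    # squared losses (CVaR), i.e. the worst case over all reweightings of the
    # sample whose density relative to the empirical measure is <= 1/alpha.
    losses = sorted((x - theta) ** 2 for x in xs)
    k = max(1, int(alpha * len(losses)))
    return sum(losses[-k:]) / k

def repeated_dro(steps=12):
    # Repeated DRO (RDRO): at each round, minimize the DRO objective on data
    # drawn from the distribution induced by the previous deployment.
    theta = 2.0
    grid = [i / 50 for i in range(-150, 151)]  # crude grid search over theta
    for _ in range(steps):
        xs = shifted_data(theta)
        theta = min(grid, key=lambda t: cvar_loss(t, xs))
    return theta
```

Because the toy shift map has sensitivity 0.5 < 1, the retrain-deploy iterates contract toward a stable point near 0, mirroring the kind of convergence guarantee the paper establishes for RDRO under Lipschitz distribution maps.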



Efforts to define fairness in ML have resulted in myriad criteria being proposed, many of which are equivalent to, or relaxations of, three core definitions of fairness: independence, separation, and sufficiency Barocas et al. (2019); Chouldechova (2017); Corbett-Davies et al. (2017); Dwork et al. (2012); Hardt et al. (2016b); Berk et al. (2021); Zafar et al. (2017); Kleinberg et al. (2017); Woodworth et al. (2017).

