ENFORCING DELAYED-IMPACT FAIRNESS GUARANTEES

Abstract

Recent research has shown that seemingly fair machine learning models, when used to inform decisions that have an impact on people's lives or well-being (e.g., applications involving education, employment, and lending), can inadvertently increase social inequality in the long term. Existing fairness-aware algorithms consider static fairness constraints, such as equal opportunity or demographic parity, but enforcing constraints of this type may result in models that have a negative long-term impact on disadvantaged individuals and communities. We introduce ELF (Enforcing Long-term Fairness), the first classification algorithm that provides high-confidence fairness guarantees in terms of long-term, or delayed, impact. Importantly, ELF solves the open problem of providing such guarantees based only on historical data that includes observations of delayed impact. Existing methods, by contrast, require a priori knowledge (or an estimate) of analytical models describing the relationship between a classifier's predictions and their corresponding delayed impact. We prove that ELF satisfies delayed-impact fairness constraints with high confidence and that it is guaranteed to identify a fair solution, if one exists, given sufficient data. We show empirically, using real-life data, that ELF can successfully mitigate long-term unfairness with high confidence.

1. INTRODUCTION

Using machine learning (ML) for high-stakes applications, such as lending, hiring, and criminal sentencing, may potentially harm historically disadvantaged communities (Flage, 2018; Blass, 2019; Bartlett et al., 2021). For example, software meant to guide lending decisions has been shown to exhibit racial bias (Bartlett et al., 2021). Extensive research has been devoted to algorithmic approaches that promote fairness and ameliorate concerns of bias and discrimination in socially impactful applications. Most of this research has focused on the classification setting, in which an ML model must make predictions given information about a person or community. Most fairness definitions studied in the classification setting are static: they do not consider how a classifier's predictions impact the long-term well-being of a community (Liu et al., 2018). In their seminal paper, Liu et al. show that predictions that appear fair with respect to static fairness criteria can nevertheless negatively impact the long-term well-being of the very communities they aim to protect. Importantly, Liu et al., and others investigating long-term fairness (see Section 6), assume that the precise analytical relationship between a classifier's predictions and their long-term impact, or delayed impact (DI), is known. Designing classification algorithms that mitigate negative delayed impact when this relationship is not known, or cannot be computed analytically, has remained an open problem.

We introduce ELF (Enforcing Long-term Fairness), the first classification algorithm that solves this open problem. Unlike existing methods, ELF does not require access to an analytic model of the delayed impact of a classifier's predictions. Instead, it works under the weaker assumption that the method has access only to historical data containing observations of the delayed impact that resulted from the predictions of a previously deployed classifier. Below, we illustrate this setting with an example.
Loan repayment example. As a running example, consider a bank that wishes to increase its profit by maximizing successful loan repayments. The bank's decisions are informed by a classifier that predicts repayment success. These decisions may have a delayed impact on the financial well-being of loan applicants, such as their savings rate or debt-to-income ratio, two years after a lending decision is made. Taking this delayed impact into account is important: when a subset of the population is disadvantaged, the bank may want (or be required by law) to maximize profit subject to a fairness constraint that considers the disadvantaged group's long-term well-being, e.g., a constraint requiring improvement in savings rates two years after a lending decision. Unfortunately, existing methods that address this problem can only be used if analytical models of how repayment predictions affect long-term financial well-being are available. Constructing such models is challenging: many complex factors (e.g., social and economic) influence how different demographic groups may be affected by financial decisions. ELF, by contrast, can ensure delayed-impact fairness with high confidence as long as the bank can collect data about the delayed impact resulting from predictions made by a previously deployed classifier. Suppose the bank deployed a classifier, which informed lending decisions, and logged the real-valued savings rate of each client two years later (i.e., logged the observed delayed impact associated with that client). ELF uses this historical data to train a new classifier with high-confidence fairness guarantees in terms of long-term, or delayed, impact.¹ Importantly, ELF works with all measures of delayed impact that can be empirically observed or quantified. Appendix A describes other real-life problems where ELF could be applied.

Contributions. We present ELF, the first method capable of enforcing DI fairness when the analytical relationship between predictions and DI is not known. To accomplish this, we formulate the fair classification problem simultaneously as a classification problem and as a reinforcement learning problem: classification for optimizing the primary objective (a measure of classification loss), and reinforcement learning for reasoning about DI. We prove that 1) the probability that ELF returns a fair model (in terms of DI) is at least 1 − δ, where δ is a user-specified confidence level; and 2) given sufficient training data, ELF is guaranteed to identify and return a fair solution if one exists. We empirically analyze ELF's performance on data from the National Data Archive on Child Abuse and Neglect (NDACAN, 2021), while varying both the amount of training data and the influence that a classifier's predictions have on delayed impact.

Limitations and future work. ELF requires access to representative historical data, i.e., a dataset with observations of the delayed impact resulting from different predictions. For example, consider a college making admission decisions informed by predictions of students' academic performance. Such predictions may have a delayed impact, e.g., on whether a student will be employed after graduation. The college would only have access to this information for admitted students; in this case, ELF would not be applicable. However, in many important real-life settings it is possible to observe the delayed impact of different decisions (see Appendix A). In our lending example, for instance, the bank can observe the delayed impact of lending decisions both for clients who received a loan and for those who did not. In Appendix B, we further discuss ELF's limitations. ELF can be extended in many ways. Section 6 discusses existing methods that address alternative long-term fairness settings, e.g., settings in which multiple prediction steps are involved. Importantly, all of these methods require an analytic expression for the relationship between predictions and delayed impact, while ELF requires access only to historical data. As future work, ELF could be extended to tackle these alternative settings.
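To make the flavor of such a high-confidence guarantee concrete, the sketch below shows one standard way to test, from logged data alone, whether a candidate classifier satisfies a delayed-impact constraint with confidence 1 − δ: importance-weight the logged delayed-impact observations by the ratio of the candidate's prediction probabilities to the behavior model's, then compare a Hoeffding-style lower bound on the resulting estimate against a threshold. This is a simplified, illustrative stand-in for a Seldonian-style safety test, not a reproduction of ELF's actual procedure; all function names, arguments, and bounds are assumptions.

```python
import numpy as np

def hoeffding_lower_bound(z, delta, z_min, z_max):
    """(1 - delta)-confidence lower bound on E[z] for i.i.d. samples
    bounded in [z_min, z_max], via Hoeffding's inequality."""
    n = len(z)
    return np.mean(z) - (z_max - z_min) * np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def safety_test(p_new, p_beta, impact, tau, delta, z_min, z_max):
    """Illustrative safety test: accept the candidate classifier only if its
    importance-weighted delayed impact is, with confidence 1 - delta, at
    least the threshold tau.

    p_new  : probability the candidate assigns to each logged prediction
    p_beta : probability the behavior model assigned to the same prediction
    impact : logged delayed-impact observations I_i^beta
    """
    w = p_new / p_beta          # importance weights
    z = w * impact              # per-sample off-policy estimate of DI
    return hoeffding_lower_bound(z, delta, z_min, z_max) >= tau
```

With this structure, returning "no solution found" whenever the test fails is what yields a guarantee of the form Pr(unfair model returned) ≤ δ; the Hoeffding bound could be swapped for tighter concentration inequalities without changing the overall logic.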

2. PROBLEM STATEMENT

We now formalize the problem of classification with delayed-impact fairness guarantees. As in the standard classification setting, a dataset consists of $n$ data points. The $i$th data point contains a feature vector $X_i$ describing, e.g., a person, and a label $Y_i$. It also contains a set of sensitive attributes, such as race and gender. ELF supports an arbitrary number of sensitive attributes, but for brevity our notation uses a single attribute, $T_i$. We assume each data point also contains a prediction, $\hat{Y}^\beta_i$, made by a stochastic classifier, $\beta$. We call $\beta$ the behavior model, defined as $\beta(x, \hat{y}) := \Pr(\hat{Y}^\beta_i = \hat{y} \mid X_i = x)$. Let $I^\beta_i$ be a real-valued delayed-impact observation (DIO) resulting from deploying $\beta$ for the person described by the $i$th data point. In our running example, $I^\beta_i$ corresponds to the empirically observed savings rate two years after the prediction $\hat{Y}^\beta_i$ was used to decide whether the $i$th client should receive a loan. We assume that larger values of $I^\beta_i$ correspond to better DI. We append $I^\beta_i$ to each data point and

¹ In Appendix E we show how ELF can be easily extended to enforce long-term and static fairness constraints simultaneously.
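As a concrete illustration of the data format defined in Section 2, the following sketch builds a toy logged dataset containing, for each data point, features $X_i$, a label $Y_i$, a sensitive attribute $T_i$, a prediction $\hat{Y}^\beta_i$ sampled from a stochastic behavior model $\beta$ together with the probability $\beta(X_i, \hat{Y}^\beta_i)$ of that sampled prediction, and a simulated delayed-impact observation $I^\beta_i$. The sizes, distributions, and toy behavior model here are assumptions for illustration only, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical logged dataset; all shapes and distributions are illustrative.
n, d = 1000, 5
X = rng.normal(size=(n, d))        # feature vectors X_i
Y = rng.integers(0, 2, size=n)     # labels Y_i
T = rng.integers(0, 2, size=n)     # a single sensitive attribute T_i

def behavior_model(x, rng):
    """Stochastic behavior classifier beta: samples a prediction y_hat and
    returns it with beta(x, y_hat), the probability of having sampled it."""
    p1 = 1.0 / (1.0 + np.exp(-x.sum()))  # toy logistic score for class 1
    y_hat = int(rng.random() < p1)
    return y_hat, (p1 if y_hat == 1 else 1.0 - p1)

Y_beta = np.empty(n, dtype=int)  # logged predictions Y_i^beta
B = np.empty(n)                  # logged probabilities beta(X_i, Y_i^beta)
for i in range(n):
    Y_beta[i], B[i] = behavior_model(X[i], rng)

# Delayed-impact observation I_i^beta (e.g., savings rate two years after
# the lending decision), simulated here; larger values mean better DI.
I = 0.1 * Y_beta + 0.05 * rng.normal(size=n)
```

Logging the sampling probabilities `B` alongside the predictions is what makes purely data-driven, off-policy reasoning about the delayed impact of a new classifier possible.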

