EQUAL IMPROVABILITY: A NEW FAIRNESS NOTION CONSIDERING THE LONG-TERM IMPACT

Abstract

Devising a fair classifier that does not discriminate against different groups is an important problem in machine learning. Recently, effort-based fairness notions have been gaining attention; these consider scenarios in which each individual makes effort to improve its features over time. Such scenarios arise in the real world, e.g., college admission and credit lending, where each rejected sample makes effort to change its features so as to get accepted afterward. In this paper, we propose a new effort-based fairness notion called Equal Improvability (EI), which equalizes the potential acceptance rate of the rejected samples across different groups, assuming a bounded level of effort will be spent by each rejected sample. We also propose and study three different approaches for finding a classifier that satisfies the EI requirement. Through experiments on both synthetic and real datasets, we demonstrate that the proposed EI-regularized algorithms find classifiers that are fair in terms of EI. Additionally, we ran experiments on dynamic scenarios, which highlight the advantage of our EI metric in equalizing the feature distributions of different groups after the rejected samples make some effort to improve. Finally, we provide mathematical analyses of several aspects of EI: the relationship between EI and existing fairness notions, and the effect of EI in dynamic scenarios. Code is available in a GitHub repository.

1. INTRODUCTION

Over the past decade, machine learning has been used in a wide variety of applications. However, these approaches have been observed to be unfair to individuals of different ethnicities, races, and genders. As implicit bias in artificial intelligence tools has raised concerns over potential discrimination and equity issues, researchers have proposed fairness notions and developed classifiers that achieve them. One popular fairness notion is demographic parity (DP), which requires the decision-making system to produce outputs such that the groups are equally likely to be assigned to the desired prediction classes, e.g., acceptance in an admission procedure. DP and related fairness notions are widely employed to mitigate bias in realistic problems such as recruitment, credit lending, and university admissions (Zafar et al., 2017b; Hardt et al., 2016; Dwork et al., 2012; Zafar et al., 2017a). However, most existing fairness notions focus only on immediate fairness, without taking potential follow-up inequity risks into consideration. In Fig. 1, we provide an example scenario in which a classifier satisfying DP still suffers from a long-term fairness issue, in a simple loan approval setting. Consider two groups (group 0 and group 1) with different distributions, where each individual has one label (approve the loan or not) and two features (credit score, income) that can be improved over time. Suppose each group consists of two clusters (with three samples each), and the distance between the clusters differs between the two groups. Fig. 1 visualizes the distributions of the two groups and the decision boundary of a classifier f that achieves DP between the groups. We observe that the rejected samples (on the left-hand side of the decision boundary) in group 1 are located further away from the decision boundary than the rejected samples in group 0.
As a result, the rejected applicants in group 1 need to make more effort to cross the decision boundary and get approval. This improvability gap between the two groups can make the rejected applicants in group 1 less motivated to improve their features, which may widen the gap between the groups in the future.
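To make the DP criterion discussed above concrete, the following sketch (with hypothetical toy scores and a 0.5 threshold of our own choosing, not the paper's code) computes the demographic-parity gap as the absolute difference in acceptance rates between two groups:

```python
import numpy as np

def dp_gap(scores, groups, threshold=0.5):
    """Demographic-parity gap: |P(yhat=1 | z=0) - P(yhat=1 | z=1)|."""
    yhat = (scores >= threshold).astype(int)
    rate0 = yhat[groups == 0].mean()
    rate1 = yhat[groups == 1].mean()
    return abs(rate0 - rate1)

# Toy scores: the classifier accepts half of each group, so DP is satisfied,
# even though group 1's rejected samples sit much farther below the threshold
# (mirroring the improvability gap in the Fig. 1 example).
scores = np.array([0.2, 0.3, 0.4, 0.6, 0.7, 0.8,    # group 0
                   0.05, 0.1, 0.15, 0.6, 0.7, 0.8])  # group 1
groups = np.array([0]*6 + [1]*6)
print(dp_gap(scores, groups))  # -> 0.0: perfectly fair in the DP sense
```

The point of the toy numbers is that a zero DP gap says nothing about how far the rejected samples of each group are from acceptance, which is exactly the gap EI targets.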


Figure 1: Toy example showing the insufficiency of a fairness notion that does not consider improvability. We consider binary classification (accept/reject) on 12 samples (dots), where x is the feature of a sample and the color of the dot indicates its group (group 0 or group 1); the decision boundary of the classifier f separates {x : f(x) ≥ 0.5} from the rest. The given classifier f is fair in terms of a popular notion called demographic parity (DP), but does not provide equal improvability of the rejected samples (f(x) < 0.5) in the two groups: the rejected samples in group 1 need more effort ∆x to be accepted, i.e., to achieve f(x + ∆x) ≥ 0.5, than the rejected samples in group 0.

This motivated the advent of fairness notions that consider dynamic scenarios in which each rejected sample makes effort to improve its features, and that measure group fairness after such effort is made (Gupta et al., 2019; Heidari et al., 2019; Von Kügelgen et al., 2022). However, as shown in Table 1, these notions have various limitations, e.g., they are vulnerable to imbalanced group negative rates or to outliers. In this paper, we introduce another fairness notion designed for dynamic scenarios, dubbed Equal Improvability (EI), which does not suffer from these limitations. Let x be the feature vector of a sample and f be a score-based classifier, e.g., one predicting a sample as accepted if f(x) ≥ 0.5 holds and as rejected otherwise. We assume that each rejected individual wants to get accepted in the future and thus improves its features, within a certain effort budget, in the direction that maximizes its score f(x). Under this setting, we define EI fairness as the equity of the potential acceptance rates of the rejected samples in different groups, once each individual makes the best effort within the predefined budget. This prevents the risk of exacerbating the gap between different groups in the long run. Our key contributions are as follows:

• We propose a new group fairness notion called Equal Improvability (EI), which aims to equalize, across groups, the probability of rejected samples becoming qualified after a bounded amount of feature improvement. EI encourages rejected individuals in different groups to have an equal amount of motivation to improve their features and get accepted in the future. We analyze the properties of EI and its connections with other existing fairness notions.

• We provide three methods for finding a classifier that is fair in terms of EI, each of which uses a unique way of measuring the inequity in improvability. Each method solves a min-max problem in which the inner maximization finds the best effort so as to measure the EI unfairness, and the outer minimization finds the classifier with the smallest fairness-regularized loss. Experiments on synthetic and real datasets demonstrate that our algorithms find classifiers having low EI unfairness.

• We run experiments on dynamic scenarios where the data and the classifier evolve over multiple rounds, and show that training a classifier with EI constraints helps make the feature distributions of different groups identical in the long run.

Before defining our new fairness notion, we first introduce the necessary notation. For an integer n, let [n] = {0, . . . , n − 1}. We consider a binary classification setting where each data sample has an input feature vector x ∈ X ⊆ R^d and a label y ∈ Y = {0, 1}. In addition, each sample has a sensitive attribute z ∈ Z = [Z], where Z is the number of sensitive groups. As suggested by Chen et al. (2021), we sort the d features of x ∈ R^d into three categories: improvable features x_I ∈ R^{d_I}, manipulable features x_M ∈ R^{d_M}, and immutable features x_IM ∈ R^{d_IM}, where d_I + d_M + d_IM = d holds. Here, improvable features x_I are those that can be improved and directly affect the outcome, e.g., salary in the credit lending problem and GPA in the school admission problem. In contrast, manipulable features x_M can be altered but are not directly related to the outcome, e.g., marital status in the admission problem and communication type in the credit lending problem. Although individuals may manipulate these features to obtain the desired outcome, we do not consider such manipulation a way of making effort, as it does not affect the individual's true qualification status. Immutable features x_IM are features that cannot be altered, such as race, age, or date of birth. Note that if the sensitive attribute z is included in the feature vector, it belongs to the immutable features. For ease of notation, we write x = (x_I, x_M, x_IM). Let F = {f : X → [0, 1]} be the set of classifiers, where each classifier is parameterized by w, i.e., f = f_w. Given f ∈ F, we consider the following deterministic prediction: ŷ | x = 1{f(x) ≥ 0.5}, where 1{A} = 1 if condition A holds and 1{A} = 0 otherwise. We now introduce our new fairness notion.
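As a preview of how improvability under a bounded effort budget can be quantified, consider a linear score f(x) = σ(w·x + b) with an L2 effort budget δ; for such a model, the score-maximizing effort for every rejected sample has the closed form δ·w/‖w‖, so each group's potential acceptance rate can be computed directly. The sketch below (numpy; the variable names and budget are our own illustrative choices, and all features are treated as improvable for simplicity, unlike the paper's x = (x_I, x_M, x_IM) split) compares these rates across two groups:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def ei_disparity(X, z, w, b, delta):
    """EI unfairness for a linear classifier f(x) = sigmoid(w.x + b).

    For a linear score, the best L2-bounded effort of every rejected sample
    is the same step delta * w / ||w||. Returns the gap between the groups'
    potential acceptance rates, computed over rejected samples only.
    """
    scores = sigmoid(X @ w + b)
    rejected = scores < 0.5
    step = delta * w / np.linalg.norm(w)            # optimal bounded effort
    improved = sigmoid((X + step) @ w + b) >= 0.5   # acceptance after effort
    rates = [improved[rejected & (z == g)].mean() for g in (0, 1)]
    return abs(rates[0] - rates[1])

# Toy data echoing Figure 1: both groups have 3 rejected samples, but
# group 1's rejected samples sit farther from the boundary x1 + x2 = 1.
X = np.array([[0.4, 0.4], [0.3, 0.5], [0.5, 0.3],     # group 0, rejected
              [1.2, 0.6], [0.7, 0.9], [1.0, 0.8],     # group 0, accepted
              [0.0, 0.0], [0.1, -0.1], [-0.1, 0.1],   # group 1, rejected
              [1.2, 0.6], [0.7, 0.9], [1.0, 0.8]])    # group 1, accepted
z = np.array([0]*6 + [1]*6)
w, b = np.array([1.0, 1.0]), -1.0
print(ei_disparity(X, z, w, b, delta=0.3))  # -> 1.0 (maximally EI-unfair)
```

With budget δ = 0.3, every rejected sample in group 0 can cross the boundary while none in group 1 can, so the disparity is 1.0 even though this classifier could satisfy DP.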


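The contributions above describe solving a min-max problem: the inner maximization finds the best effort (measuring the EI unfairness) and the outer minimization fits a classifier under a fairness-regularized loss. A minimal numpy sketch of this idea for a logistic model might look as follows; the smooth surrogate penalty, the closed-form inner solution for a linear score, the numerical gradients, and all hyperparameters are our own illustrative choices, not the paper's exact algorithms:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def loss(params, X, y, z, delta, lam):
    """Binary cross-entropy plus a smooth EI-gap penalty."""
    w, b = params[:-1], params[-1]
    s = sigmoid(X @ w + b)
    bce = -np.mean(y * np.log(s + 1e-12) + (1 - y) * np.log(1 - s + 1e-12))
    # Inner maximization: for a linear score, the best L2-bounded effort
    # is the fixed step delta * w / ||w||.
    step = delta * w / (np.linalg.norm(w) + 1e-12)
    s_imp = sigmoid((X + step) @ w + b)
    rej = s < 0.5
    # Smooth surrogate of the EI gap: difference of the mean improved
    # scores of the rejected samples in the two groups.
    g = [s_imp[rej & (z == k)].mean() if np.any(rej & (z == k)) else 0.0
         for k in (0, 1)]
    return bce + lam * (g[0] - g[1]) ** 2

def train(X, y, z, delta=0.3, lam=2.0, lr=0.3, iters=300):
    """Outer minimization by gradient descent (numerical gradients, for brevity)."""
    rng = np.random.default_rng(0)
    params = rng.normal(size=X.shape[1] + 1) * 0.1
    for _ in range(iters):
        grad = np.zeros_like(params)
        for i in range(len(params)):
            e = np.zeros_like(params); e[i] = 1e-5
            grad[i] = (loss(params + e, X, y, z, delta, lam)
                       - loss(params - e, X, y, z, delta, lam)) / 2e-5
        params -= lr * grad
    return params

# Toy usage on hypothetical data: 2-d features, random groups, linear labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
z = rng.integers(0, 2, size=40)
y = (X.sum(axis=1) > 0).astype(float)
params = train(X, y, z)
```

Larger values of the penalty weight `lam` trade classification accuracy for a smaller EI gap, which is the same trade-off the paper's EI-regularized algorithms navigate.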