FAIREE: FAIR CLASSIFICATION WITH FINITE-SAMPLE AND DISTRIBUTION-FREE GUARANTEE

Abstract

Algorithmic fairness plays an increasingly critical role in machine learning research. Several group fairness notions and algorithms have been proposed. However, the fairness guarantees of existing fair classification methods mainly rely on specific distributional assumptions on the data and often require large sample sizes; fairness can be violated when the number of samples is modest, which is often the case in practice. In this paper, we propose FaiREE, a fair classification algorithm that satisfies group fairness constraints with finite-sample and distribution-free theoretical guarantees. FaiREE can be adapted to various group fairness notions (e.g., Equality of Opportunity, Equalized Odds, and Demographic Parity) and achieves the optimal accuracy under these constraints. The theoretical guarantees are further supported by experiments on both synthetic and real data, where FaiREE shows favorable performance over state-of-the-art algorithms.

1. INTRODUCTION

As machine learning algorithms are increasingly used in consequential domains such as college admission (Chouldechova & Roth, 2018), loan application (Ma et al., 2018), and disease diagnosis (Fatima et al., 2017), concerns about algorithmic fairness have emerged in recent years. When standard machine learning algorithms are trained directly on biased data provided by humans, their outputs are sometimes found to be biased against certain sensitive attributes that we want to protect (race, gender, etc.). To quantify fairness in machine learning algorithms, many fairness notions have been proposed, including the individual fairness notion (Biega et al., 2018), group fairness notions such as Demographic Parity, Equality of Opportunity, Predictive Parity, and Equalized Odds (Dieterich et al., 2016; Hardt et al., 2016; Gajane & Pechenizkiy, 2017; Verma & Rubin, 2018), and multi-group fairness notions including multi-calibration (Hébert-Johnson et al., 2018) and multi-accuracy (Kim et al., 2019). Based on these fairness notions, corresponding algorithms have been designed to satisfy the fairness constraints (Hardt et al., 2016; Pleiss et al., 2017; Zafar et al., 2017b; Krishnaswamy et al., 2021; Valera et al., 2018; Chzhen et al., 2019; Zeng et al., 2022; Thomas et al., 2019). Among these, post-processing is a popular class of algorithms that modifies the output of a trained model to satisfy fairness constraints. However, recent post-processing algorithms have been found to lack the ability to realize an accuracy-fairness trade-off and to perform poorly when the sample size is limited (Hardt et al., 2016; Pleiss et al., 2017). In addition, since most fairness constraints are non-convex, some papers propose convex relaxation-based methods (Zafar et al., 2017b; Krishnaswamy et al., 2021). These algorithms generally have no theoretical guarantee that their output satisfies the exact original fairness constraint.
Another line of research recalibrates the Bayes classifier with a group-dependent threshold (Valera et al., 2018; Chzhen et al., 2019; Zeng et al., 2022). However, these results require either distributional assumptions or an infinite sample size, which is hard to verify or satisfy in practice. In this paper, we propose a post-processing algorithm, FaiREE, that provably achieves group fairness guarantees with only finitely many samples and free of distributional assumptions (a property also called "distribution-free" in the literature (Maritz, 1995; Clarke, 2007; Györfi et al., 2002)). To the best of our knowledge, this is the first fair classification algorithm with a finite-sample and distribution-free guarantee. In brief, FaiREE first scores the dataset with the given classifier and then selects, based on these scores, a candidate set of classifiers that satisfy the fairness constraint with a theoretical guarantee. As multiple classifiers may satisfy this constraint, we further develop a distribution-free estimate of the test misclassification error, resulting in an algorithm that attains the optimal misclassification error under the fairness constraints. As a motivating example, Figure 1 shows that applying the state-of-the-art FairBayes method of Zeng et al. (2022) to a dataset with 1000 samples results in substantial fairness violation on the test data and incorrect behavior of the fairness-accuracy trade-off due to a lack of fairness generalization. Our proposed FaiREE improves fairness generalization in these finite-sample settings.

Additional Related Works. The fairness algorithms in the literature can be roughly categorized into three types: 1) pre-processing algorithms that learn a fair representation to improve fairness (Zemel et al., 2013; Louizos et al., 2015; Lum, 2016; Adler et al., 2018; Calmon et al., 2017; Gordaliza et al., 2019; Madras et al., 2018; Kilbertus et al., 2020); 2) in-processing algorithms that optimize during training time (Calders et al., 2009; Woodworth et al., 2017; Zafar et al., 2017b;a; Agarwal et al., 2018; Russell et al., 2017; Zhang et al., 2018; Celis et al., 2019); 3) post-processing algorithms that modify the output of the original method to fit fairness constraints (Kamiran et al., 2012; Feldman, 2015; Hardt et al., 2016; Fish et al., 2016; Pleiss et al., 2017; Corbett-Davies et al., 2017; Menon & Williamson, 2018; Hébert-Johnson et al., 2018; Kim et al., 2019; Deng et al., 2023). The design of post-processing algorithms with distribution-free and finite-sample guarantees has gained much attention recently due to its flexibility in practice (Shafer & Vovk, 2008; Romano et al., 2019), as such an algorithm can be applied to any given model (e.g., a black-box neural network) and achieve the desired theoretical guarantee with almost no assumptions. One research area with this property is conformal prediction (Shafer & Vovk, 2008; Lei et al., 2018; Romano et al., 2019), which aims to construct prediction intervals that cover a future response with high probability. In this paper, we extend this line of research beyond prediction intervals by designing classification algorithms that satisfy certain group fairness notions with distribution-free and finite-sample guarantees.

Paper Organization. Section 2 provides the definitions and notation used in the paper. Section 3 presents the general pipeline of FaiREE. Section 4 extends the results to other fairness notions. Finally, Section 5 conducts experiments on both synthetic and real data and compares FaiREE with several state-of-the-art algorithms to show that it has desirable performance.¹

2. PRELIMINARY

In this paper, we consider two types of features in classification: the standard feature X ∈ X and the sensitive attribute A ∈ A = {0, 1}, on which we want the output to be fair. For simplicity of presentation, we consider the binary classification problem with labels in Y = {0, 1}; our analysis extends similarly to the multi-class and multi-attribute settings. To address the algorithmic fairness problem, several group fairness notions have been developed in the literature. In the following, we introduce two popular notions, Equality of Opportunity and Equalized Odds; other fairness notions are discussed in Section A.8 of the Appendix. Equality of Opportunity requires comparable true positive rates across different protected groups.
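For concreteness, the standard formulation of Equality of Opportunity (Hardt et al., 2016) and the violation measure DEOO referenced in Figure 1 can be written as follows; this is a sketch consistent with the standard definitions, as the exact form of the paper's Eq. (1) is not reproduced in this excerpt:

```latex
% Equality of Opportunity: equal true positive rates across the two groups,
%   P(\hat{Y} = 1 \mid A = 1, Y = 1) = P(\hat{Y} = 1 \mid A = 0, Y = 1).
% Its degree of violation is commonly measured by the gap in true positive rates:
\[
  \mathrm{DEOO}
    = \bigl|\,
        \mathbb{P}(\hat{Y} = 1 \mid A = 1, Y = 1)
        - \mathbb{P}(\hat{Y} = 1 \mid A = 0, Y = 1)
      \,\bigr|,
\]
% and an approximate fairness constraint at level \alpha requires
%   \mathrm{DEOO} \le \alpha.
\]
```

A classifier satisfies exact Equality of Opportunity when DEOO = 0; relaxed versions only require DEOO to stay below a pre-specified level α.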



¹ Code is available at https://github.com/lphLeo/FaiREE



Figure 1: Comparison of FairBayes and FaiREE on the synthetic data with sample size = 1000. See Table 2 for detailed numerical results. Left: DEOO vs. α. Right: DEOO vs. test accuracy. Here, DEOO is the degree of violation of the fairness constraint Equality of Opportunity, and α is the pre-specified desired level that upper bounds DEOO for both methods. See Eq. (1) in Section 2 for a more detailed definition.

Under the binary classification setting, we use a score-based classifier that outputs a prediction Ŷ = Ŷ(x, a) ∈ {0, 1} based on a score function f(x, a) ∈ [0, 1] that depends on X and A:

Definition 1 (Score-based classifier). A score-based classifier is an indicator function Ŷ = ϕ(x, a) = 1{f(x, a) > c} for a measurable score function f : X × {0, 1} → [0, 1] and some threshold c > 0.
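As a minimal illustration of Definition 1, the sketch below implements a score-based classifier with a group-dependent threshold (the recalibration idea discussed in the introduction) together with an empirical estimate of the true-positive-rate gap that DEOO measures. The function names and the toy data are hypothetical helpers for exposition, not the paper's FaiREE implementation:

```python
import numpy as np

def score_based_classifier(scores, groups, thresholds):
    """Definition 1 with a group-dependent threshold c_a:
    predict 1 iff f(x, a) > c_a. `thresholds` maps group -> c_a."""
    c = np.where(groups == 1, thresholds[1], thresholds[0])
    return (scores > c).astype(int)

def empirical_deoo(y_pred, y_true, groups):
    """Empirical gap in true positive rates between the two groups
    (the quantity that DEOO measures on a finite sample)."""
    tpr = []
    for a in (0, 1):
        positives = (groups == a) & (y_true == 1)
        tpr.append(y_pred[positives].mean())
    return abs(tpr[1] - tpr[0])

# Toy example: scores from some black-box model f(x, a) in [0, 1].
rng = np.random.default_rng(0)
scores = rng.uniform(size=200)
groups = rng.integers(0, 2, size=200)
y_true = rng.integers(0, 2, size=200)

y_pred = score_based_classifier(scores, groups, {0: 0.5, 1: 0.5})
print("empirical DEOO:", empirical_deoo(y_pred, y_true, groups))
```

With equal thresholds this reduces to Definition 1 exactly; choosing the pair (c₀, c₁) so that the empirical DEOO stays below α, while the misclassification error is minimized, is the kind of trade-off that FaiREE resolves with finite-sample guarantees.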

