FAIR DIFFERENTIAL PRIVACY CAN MITIGATE THE DISPARATE IMPACT ON MODEL ACCURACY

Anonymous authors
Paper under double-blind review

Abstract

Techniques based on the theory of differential privacy (DP) have become a standard building block in the machine learning community. DP training mechanisms offer strong guarantees that an adversary cannot determine with high confidence whether a particular instance was in the training data by analyzing the released model, let alone recover any details of that instance. However, DP may disproportionately affect underrepresented and relatively complicated classes; that is, the reduction in utility is unequal across classes. This paper proposes a fair differential privacy algorithm (FairDP) to mitigate the disparate impact on each class's model accuracy. We cast the learning procedure as a bilevel programming problem that integrates differential privacy with fairness. FairDP establishes a self-adaptive DP mechanism and dynamically adjusts instance influence in each class based on a theoretical bias-variance bound. Our experimental evaluation shows the effectiveness of FairDP in mitigating the disparate impact on model accuracy among classes on several benchmark datasets and scenarios ranging from text to vision.

1. INTRODUCTION

Protecting data privacy is a significant concern in many data-driven decision-making applications (Zhu et al., 2017), such as social networking services, recommender systems, and location-based services. For example, the United States Census Bureau applied differential privacy for the first time to the 2020 census data (Bureau, 2020). Differential privacy (DP) guarantees that an attacker cannot exploit the released model to determine whether one particular instance is present or absent in the training dataset (Dwork et al., 2006). However, DP intentionally restricts instance influence and introduces noise into the learning procedure. When we enforce DP on a model, it may amplify the discriminative effect on underrepresented and relatively complicated classes (Bagdasaryan et al., 2019; Du et al., 2020; Jaiswal & Provost, 2020); that is, the reduction in accuracy from non-private to private learning may be uneven across classes. Several empirical studies examine this utility reduction: (Bagdasaryan et al., 2019; Du et al., 2020) show that in private learning, model accuracy tends to decrease more on classes that already have lower accuracy in non-private learning, while (Jaiswal & Provost, 2020) reports a different observation, namely that the inequality in accuracy is not consistent for classes across multiple setups and datasets. We caution that although private learning improves individual participants' security, the model's performance should not harm one class more than others.

A machine learning model, specifically in supervised learning tasks, outputs a hypothesis f(x; θ) parameterized by θ, which predicts the label y given the unprotected attributes x. Each instance's label y belongs to a class k. The model aims to minimize the objective (loss) function L(θ; x, y), i.e.,

θ* := arg min_θ E[L(θ; x, y)].
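As a concrete (non-private) baseline for the objective above, the empirical risk can be minimized with plain SGD; below is a minimal sketch for binary logistic regression. The function name and hyper-parameters are ours, purely illustrative.

```python
import numpy as np

def sgd_logistic(X, y, lr=0.1, epochs=200, seed=0):
    """Plain (non-private) SGD for binary logistic regression:
    theta* = argmin_theta E[L(theta; x, y)] with the log-loss L."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        i = rng.integers(len(X))               # draw one training instance
        p = 1.0 / (1.0 + np.exp(-X[i] @ theta))
        theta -= lr * (p - y[i]) * X[i]        # per-example gradient step
    return theta
```

Private learning replaces the raw per-example step with a clipped and noised version, which is where the disparate impact studied in this paper enters.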
Our work builds on a recent advance in training machine learning models with a differentially private mechanism, namely DPSGD (Abadi et al., 2016), for releasing the model. The key idea can be extended to other DP mechanisms with a specialized noise form (generally a Laplacian or Gaussian distribution). The iterative update scheme of DPSGD at the (t+1)-th iteration is of the form

θ_{t+1} = θ_t − µ_t · (1/n) ( Σ_{i ∈ S_t} g_t(x_i) / max(1, ‖g_t(x_i)‖_2 / C) + ξ1 ),   (2)

where n and µ_t denote the batch size and step size (learning rate) respectively; S_t denotes the randomly chosen instance set; g_t(x_i) denotes the gradient of the loss at instance x_i; C denotes the clipping threshold; ξ denotes Gaussian noise with standard deviation σC; and 1 denotes the vector filled with the scalar value one. For an L-smooth loss f, the expected loss after this update satisfies

E[f(θ_{t+1})] ≤ E[f(θ_t)] + E⟨∇f(θ_t), θ_{t+1} − θ_t⟩ + (L/2) E‖θ_{t+1} − θ_t‖² + τ(C, σ; θ_t),   (3)

where the last term τ(C, σ; θ_t) denotes the gap in the expected loss compared with ideal SGD at this (t+1)-th iteration and depends on the parameters C and σ. This term, which we call the bias-variance term, can be calculated as

τ(C, σ; θ) = 2(1 + 1/(µ_t L)) ( ‖∇f(θ)‖ η + η² ) + (1/n²) σ² C² |1|,   (4)

where the first term is the clipping bias and the second is the noise variance; L denotes the Lipschitz constant of f; |1| denotes the vector dimension; and η := (1/n) Σ_i I{‖g_t(x_i)‖ > C} (‖g_t(x_i)‖ − C), where I{‖g_t(x_i)‖ > C} denotes the indicator of the event ‖g_t(x_i)‖ > C, so the sum runs only over clipped instances. The detailed proofs of (3) and (4) can be found in Appendix A.

τ(C, σ) consists of the clipping-bias and noise-variance terms: the first measures how much the private gradient deviates from the non-private gradient due to influence truncation, and the second depends on the scale of the added noise. As a result, we call τ(C, σ) the bias-variance term. Because instances from underrepresented classes, or otherwise complicated instances, produce gradients that differ from those of common instances, a uniform threshold parameter C may incur a significant accuracy disparity across classes. In Figure 1(a), we apply DPSGD (Abadi et al., 2016) to the unbalanced MNIST dataset (Bagdasaryan et al., 2019); the resulting per-class accuracy gap is consistent with the disparate impact reported in prior work (Bagdasaryan et al., 2019; Du et al., 2020; Jaiswal & Provost, 2020).
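One update of the form (2) can be sketched as follows: a simplified NumPy illustration of per-example clipping and Gaussian noise, not the authors' implementation; `dpsgd_step` and its argument names are hypothetical.

```python
import numpy as np

def dpsgd_step(theta, per_example_grads, lr, C, sigma, rng):
    """One DPSGD update: clip every per-example gradient to L2 norm C,
    sum, add N(0, (sigma*C)^2) noise per coordinate, average, and descend."""
    n = len(per_example_grads)
    clipped = [g / max(1.0, np.linalg.norm(g) / C) for g in per_example_grads]
    noise = rng.normal(0.0, sigma * C, size=theta.shape)
    return theta - lr * (np.sum(clipped, axis=0) + noise) / n
```

The clipping factor max(1, ‖g‖₂/C) leaves small gradients untouched and rescales large ones to norm exactly C, which is the source of the clipping bias discussed above.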
Both the theoretical analysis and the experimental discussion suggest that minimizing the clipping bias and noise variance simultaneously could learn "better" DP parameters, which mitigates the accuracy bias between different classes. This motivates us to pursue fairness with a self-adaptive differential privacy scheme. This paper proposes a fair differential privacy algorithm (FairDP) to mitigate the disparate impact problem. FairDP introduces a self-adaptive DP mechanism and automatically adjusts instance influence in each class. The main idea is to formulate the problem as bilevel programming, minimizing the bias-variance term as the upper-level objective subject to a lower-level differentially private machine learning model. The self-adaptive clipping threshold parameters are calculated by balancing the fairness bias-variance and per-class accuracy terms simultaneously. Our contributions can be summarized as follows:

• FairDP uses a self-adaptive clipping threshold to adjust the instance influence in each class, so the model accuracy for each class is calibrated based on its privacy cost through fairness balancing. The utility reduction is then similar across classes.

• To our knowledge, we are the first to introduce bilevel programming to private learning with the aim of mitigating the disparate impact on model accuracy. We further design an alternating scheme to learn the self-adaptive clipping thresholds and the private model simultaneously.

• Our experimental evaluation shows that FairDP strikes a balance among privacy, fairness, and accuracy by performing stratified clipping over different subclasses.

The road-map of this paper is as follows. Section 2 describes the proposed FairDP algorithm. Section 3 provides a brief but complete introduction to related work in privacy-aware learning, fairness-aware learning, and the intersection of differential privacy and fairness.
Extensive experiments are further presented in Section 4, and we finally conclude this paper and discuss some future work in Section 5.

2. FAIRDP: FAIR DIFFERENTIAL PRIVACY

2.1 THE BILEVEL FAIRDP FORMULATION

The intuition of our approach is to fairly balance the level of privacy (via the clipping threshold) for each class based on its bias-variance term, which is introduced by the associated DP mechanism. The bias-variance terms arise from capping instance influences to reduce the sensitivity of a machine learning algorithm. In detail, a self-adaptive DP mechanism is designed to balance the bias-variance differences among all groups, while the obtained DP mechanism must simultaneously adapt to the original machine learning problem.

Recalling the definition of the machine learning problem, we assume there are K classes; following the bias-variance term (4), the term for class k ∈ {1, …, K} can be written as

τ_k(C_k, σ; θ*) := 2(1 + 1/(µ_t L)) ( ‖∇f(θ*)‖ η_k + η_k² ) + (|G_k|²/n²) σ² C_k² |1|,

where C_k denotes the clipping parameter for class k and G_k denotes the set of data samples in class k. As motivated in Section 1, minimizing the associated bias-variance term yields a unified clipping parameter for the machine learning problem. To mitigate the disparate impact on model accuracy across classes, however, we instead minimize the summation of per-class bias-variance terms. This objective leads to self-adaptive clipping thresholds across classes, while the resulting class-specific DP schemes must still provide privacy protection for the machine learning model. The self-adaptive clipping threshold parameters are then used to learn the original machine learning model privately with the DP mechanism. A simple bilevel programming problem (Dempe et al., 2019; Liu et al., 2019) is introduced to model these two mutually influencing goals. The formulation is

min_{{C_k}, θ} Σ_{k=1}^{K} τ_k(C_k, σ; θ),   (6a)

s.t.
θ ∈ arg min_θ L(θ; {G_k}_{k=1}^{K}),   (6b)

where the upper-level problem (6a) fairly adjusts the clipping threshold parameters for all classes and is coupled to the classification model θ, while the lower-level problem (6b) learns the classification model under the differential privacy scheme with the self-adaptive clipping thresholds {C_k}. These two objectives are coupled, even though the lower-level solution is determined only by θ; the effect of clipping is reflected through the DP computation procedure. Guided by the bias-variance term in (6a), the parameters of DP learning can be finely updated simultaneously with the learning process of the classifier in (6b).

Algorithm 1: The FairDP Method
Input: instances {(x_1, y_1), …, (x_n, y_n)}, noise scale σ, step sizes {µ_t}, iterations T
1:  Initialize θ_0
2:  for t = 0, …, T − 1 do
3:    Sample a random batch S_t
4:    for x_i ∈ S_t do
5:      Compute g_t(x_i) ← ∇_{θ_t} L(y_i; θ_t, x_i)
6:    end for
7:    Minimize bias-variance: C_k^{t+1} ← arg min_{C_k} τ_k(C_k, σ; θ_t) for each class k
8:    for x_i ∈ S_t with y_i = k do
9:      Clip gradient: ḡ_t(x_i) ← g_t(x_i) / max(1, ‖g_t(x_i)‖_2 / C_k^{t+1})
10:   end for
11:   Add noise: g̃_t ← (1/n)(Σ_i ḡ_t(x_i) + ξ1)
12:   Noisy gradient descent: θ_{t+1} ← θ_t − µ_t g̃_t
13: end for
Output: θ_T, accumulated privacy cost (ε, δ)
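Steps 4-12 of Algorithm 1 can be sketched as follows, under simplifying assumptions: the arg-min in Step 7 is approximated by a 1-D search over a grid of candidate thresholds, a scalar proxy is used for τ_k, and the noise scale is tied to the largest per-class threshold. All names and constants are illustrative, not the authors' code.

```python
import numpy as np

def tau_k(C, norms_k, grad_norm, sigma, n, dim, lip=2.0):
    """Simplified per-class bias-variance proxy: clipping bias (mass of
    gradient norm above C) plus noise variance (grows with C^2)."""
    eta = np.maximum(norms_k - C, 0.0).sum() / n
    bias = lip * (grad_norm * eta + eta ** 2)
    variance = (len(norms_k) ** 2 / n ** 2) * sigma ** 2 * C ** 2 * dim
    return bias + variance

def fairdp_step(theta, grads, labels, lr, sigma, rng, C_grid):
    """One FairDP update (Algorithm 1, Steps 4-12): choose C_k per class by
    minimizing tau_k over candidate thresholds, clip each gradient with its
    class's threshold, add Gaussian noise, and take a descent step."""
    n, dim = len(grads), theta.size
    norms = np.array([np.linalg.norm(g) for g in grads])
    C = {k: min(C_grid, key=lambda c: tau_k(c, norms[labels == k],
                                            norms.mean(), sigma, n, dim))
         for k in np.unique(labels)}
    clipped = [g / max(1.0, np.linalg.norm(g) / C[y]) for g, y in zip(grads, labels)]
    noise = rng.normal(0.0, sigma * max(C.values()), size=theta.shape)
    return theta - lr * (np.sum(clipped, axis=0) + noise) / n
```

A class whose gradients are frequently clipped (large bias) is assigned a larger C_k, while a class with small gradients keeps a small C_k to avoid paying unnecessary noise variance.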

2.2. THE FAIRDP METHOD

Calculating the optimal θ* and {C_k*} requires two nested loops of optimization, so we adopt an alternating strategy that updates θ and {C_k} in turn. The FairDP method is summarized in Algorithm 1, and Figure 2 illustrates its main process (i.e., Steps 4-12 of Algorithm 1). For the self-adaptive clipping step (6a) at the upper level, given the currently obtained model parameter θ_t, we can calculate the optimal {C_k} directly by solving a quadratic programming problem. Based on a batch of training samples S_t (Step 3), the overall gradient ∇f(θ_t) is approximated by its estimate on the samples in S_t. For the private training step (6b) at the lower level, the classifier parameter is updated by moving the current θ_t along the descent direction of the objective loss in (6b) on the batch (Steps 4-6):

g_t(x_i) := ∇_{θ_t} L(y_i; θ_t, x_i).   (7)

After receiving the per-instance gradients g_t(x_i) from (7), the parameter θ of the classifier is updated in a DP way as in (2), using the clipping parameters {C_k} obtained in Step 7 (Steps 8-12). All details can be found in Algorithm 1.
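Because the set of clipped gradients is fixed between consecutive sorted gradient norms, the upper-level objective is piecewise quadratic in C_k, so Step 7 admits an exact solve: check each segment's stationary point and the breakpoints. The sketch below uses the same simplified scalar τ_k proxy as before (treating ‖∇f‖ as a known scalar); names are illustrative.

```python
import numpy as np

def argmin_tau_quadratic(norms_k, grad_norm, sigma, n, dim, lip=2.0):
    """Exact minimizer of the simplified tau_k(C): on each segment between
    sorted gradient norms, tau_k is a quadratic a*C^2 + b*C + c; collect
    each segment's stationary point plus all breakpoints, then evaluate
    the true tau_k at every candidate."""
    norms_k = np.sort(np.asarray(norms_k, dtype=float))
    v = (len(norms_k) ** 2 / n ** 2) * sigma ** 2 * dim  # noise coefficient
    cands = set(norms_k.tolist())                        # segment breakpoints
    for j in range(len(norms_k) + 1):
        S, m = norms_k[j:].sum(), len(norms_k) - j       # clipped mass on segment j
        # eta(C) = (S - m*C)/n, so tau(C) = a*C^2 + b*C + const on this segment
        a = lip * m ** 2 / n ** 2 + v
        b = -lip * (grad_norm * m / n + 2.0 * S * m / n ** 2)
        if a > 0:
            cands.add(max(0.0, -b / (2.0 * a)))          # stationary point
    def tau(C):
        eta = np.maximum(norms_k - C, 0.0).sum() / n
        return lip * (grad_norm * eta + eta ** 2) + v * C ** 2
    return min(cands, key=tau)
```

Since the global minimizer of a piecewise quadratic lies either at a breakpoint or at an interior stationary point, evaluating τ_k at every candidate recovers the exact arg-min without a grid.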

3.1. PRIVACY-AWARE LEARNING

Existing literature on differentially private machine learning can be divided into three main categories: input perturbation, output perturbation, and inner perturbation. Input perturbation adds noise to the input data under the differential privacy model. Output perturbation adds noise to the model after the training procedure finishes, i.e., without modifying the training algorithm. Inner perturbation modifies the learning algorithm so that the noise is injected during training: (Chaudhuri et al., 2011) modifies the objective of the training procedure, while DPSGD (Abadi et al., 2016) adds noise to the gradient at each training step without modifying the objective.

Limiting users to small influences keeps the noise level low at the cost of introducing bias. Several works study how to adaptively bound user influence and clip gradients to improve learning accuracy and robustness. DPSGD does not provide a detailed analysis of how to choose the truncation level of the gradient norm, instead suggesting the median of observed gradients; however, using the median (or any fixed quantile independent of the privacy parameter ε) as a cap can yield suboptimal estimations of a sum (Amin et al., 2019). (Andrew, 2018) performs a pre-processing step via a scaling operation. DP-GAN (Zhang et al., 2018) assumes access to a small amount of public data, which is used to monitor the change in gradient magnitudes and to set the clipping parameters based on the average magnitudes. (Amin et al., 2019) characterizes the trade-off between bias and variance and shows that a proper bound can be found depending on properties of the dataset: it does not matter how far the gradients lie above or below the cutoff, only that a fixed number of values are clipped. (Thakkar et al., 2019) sets an adaptive clipping norm based on a differentially private estimate of a targeted quantile of the distribution of unclipped norms.
AdaClip (Pichapati et al., 2019) uses coordinate-wise adaptive clipping of the gradient to achieve the same privacy guarantee with much less added noise. Previous work either ignores the bias-variance trade-off completely (DPSGD (Abadi et al., 2016) simply uses the empirical median; DP-FedAvg (McMahan et al., 2018) scatters the privacy budget evenly over the layers) or requires strong assumptions on the data ((Zhang et al., 2018) assumes the availability of public data).
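As a point of reference, the quantile-based clipping strategies described above can be sketched in two forms: a static choice at a fixed quantile, and an online geometric update toward a target quantile. This is a non-private illustration; the actual methods estimate the quantile, or update toward it, under differential privacy, and the function names are ours.

```python
import numpy as np

def quantile_clip_norm(grad_norms, q=0.5):
    """Static choice: clip at the q-quantile of observed per-example
    gradient norms (q = 0.5 recovers the median heuristic)."""
    return float(np.quantile(np.asarray(grad_norms, dtype=float), q))

def update_clip_norm(C, grad_norms, target_q=0.5, step=0.2):
    """Online variant: geometrically shrink C when more than a target_q
    fraction of norms already fall below it, and grow C otherwise."""
    frac_below = float(np.mean(np.asarray(grad_norms) <= C))
    return C * np.exp(-step * (frac_below - target_q))
```

The online form only needs the (privatizable) fraction of clipped examples per round, which is why quantile targeting composes well with DP accounting.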

3.2. FAIRNESS-AWARE LEARNING

Fairness is a broad topic that has received much attention in the machine learning community, although the goals often differ from those in this work. Most research on fairness-aware machine learning studies the discriminatory prediction problem: how to reduce discrimination against a protected attribute in the predictive decisions made by a machine learning model (Dwork et al., 2012; Hardt et al., 2016; Kusner et al., 2017). Three common approaches are to preprocess the data to remove information about the protected attribute (Zemel et al., 2013), to optimize the objective function under fairness constraints during training (Zafar et al., 2017), or to post-process the model by adjusting the prediction threshold after the classifier is trained (Hardt et al., 2016). Other works study the discriminatory impact problem (Kusner et al., 2019): how to reduce the discrimination arising from the impact of decisions. In federated learning, AFL (Mohri et al., 2019) takes a step towards accuracy parity by introducing good-intent fairness; the goal is to ensure that the training procedure does not overfit the model to any one class at another's expense. However, the proposed objective is rigid because it only maximizes the performance of the worst class, and it has only been applied at small scales (a handful of devices). q-FFL (Li et al., 2020) reweights the objective function in FedAvg to assign higher relative weight to classes with higher loss, which reduces the variance of model performance. Although accuracy parity enforces equal error rates among specific classes (Zafar et al., 2017), our goal is not to optimize for identical accuracy across all classes; we focus on the inequality introduced by differential privacy.
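For illustration, the q-FFL reweighting amounts to scaling each class's gradient contribution by its loss raised to the power q, since the gradient of the objective (1/(q+1)) Σ_k F_k^(q+1) weights class k by F_k^q. The sketch below (names are ours, not from q-FFL's code) shows the normalized weights.

```python
import numpy as np

def qffl_weights(class_losses, q=1.0):
    """Relative per-class gradient weights implied by the q-FFL objective
    (1/(q+1)) * sum_k F_k^(q+1): class k's gradient is weighted by F_k^q,
    so higher-loss classes receive proportionally more influence."""
    w = np.asarray(class_losses, dtype=float) ** q
    return w / w.sum()
```

Setting q = 0 recovers uniform (FedAvg-style) weighting, while larger q pushes the objective toward the worst-performing class, approaching AFL's min-max behavior.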

3.3. DIFFERENTIAL PRIVACY AND FAIRNESS

Recent works study the connection between achieving privacy protection and fairness. (Dwork et al., 2012) proposes a notion of fairness that is a generalization of differential privacy. ADFC (Ding et al., 2020), DP-POSTPROCESSING/DP-ORACLE-LEARNER (Jagielski et al., 2019), and PFLR* (Xu et al., 2019) are representative methods at this intersection.

4. EXPERIMENTS

This section reports our evaluation of fair differentially private learning on benchmark datasets ranging from text to vision. We implement all methods in PyTorch 1.6.0 on a single NVIDIA GeForce RTX 2080Ti.

Datasets: three datasets are used, including Adult (Dua & Graff, 2017), Dutch (Kamiran & Calders, 2011), and Unbalanced MNIST (LeCun et al., 1998). Details can be found in Appendix B.

Comparison methods: 1) SGD: non-private learning without clipping or noise addition; 2) DPSGD (Abadi et al., 2016): private learning with flat clipping; 3) DP-FedAvg (McMahan et al., 2018): private learning with per-layer clipping; 4) Opt-Q: private learning with (1 − 2/π · σ/e)-quantile clipping, adapted from (Amin et al., 2019), with details in Appendix C.1; 5) DPSGD-F (Xu et al., 2020): private learning with clipping proportional to the relative ratio of gradients exceeding the threshold. More details are in Appendix C.

Settings and hyper-parameters: without loss of generality, we assume the function f is 1-Lipschitz. For the Adult and Dutch datasets, we employ a logistic regression model; for logistic regression, DP-FedAvg degenerates to classical DPSGD. The noise scale σ, the clipping bound, and δ are set to 1, 0.5, and 10^-5 respectively. For the Unbalanced MNIST dataset, we employ a neural network with 2 convolutional layers and 2 fully-connected layers; the noise scale σ, the clipping bound, and δ are set to 1, 1, and 10^-5 respectively. More details can be found in Appendix D.3.

To evaluate the effectiveness of the proposed FairDP, we consider three tasks: 1) fairness performance: the utility loss should be small, and the utility loss across different classes should be fair; 2) privacy performance: the proposed FairDP method should preserve the privacy of the training data; 3) adaptive performance: the effect of hyper-parameters on the compared private methods.
In this experiment, we consider the learning rate (Figure 4), the batch size (Figure 5), and the noise (Figure 6) to evaluate the sensitivity of the compared private methods to these three hyper-parameters. Figures 4, 5, and 6 show that FairDP is insensitive to variations in the hyper-parameters. Because of space limitations, additional results on the Dutch dataset are deferred to Appendix E.

5. CONCLUSION

Gradient clipping and noise addition, the core techniques in DPSGD, disproportionately affect underrepresented and complex classes. As a consequence, a model trained with DPSGD tends to lose more accuracy on these classes relative to the original, non-private model. If the original model is unfair in that its accuracy differs across subgroups, DPSGD may exacerbate this unfairness. We propose FairDP, which aims to remove the potential disparate impact of differential privacy mechanisms on protected groups. FairDP adjusts the influence of samples in a group depending on the group's clipping bias, so that differential privacy has no disparate impact on group utility. In future work, we plan to extend our adaptive clipping from group-wise to element-wise clipping, from the user and/or parameter perspectives, so that the model can remain fair even to an unseen minority class.



Footnotes:
1. Note that we do not attempt to optimize the bias-variance bound in a differentially private way; we are most interested in understanding the forces at play.
2. "Simple bilevel programming" does not mean that the bilevel problem is easy to solve; it denotes a specific class of bilevel programming problems.
3. The original MNIST dataset is modified by reducing the number of training samples in Class 8 to 500.
4. r_fp and r_tp denote the false positive rate and true positive rate respectively.



Figure 2: Main flowchart of the FairDP Method (Step 4-12 in Algorithm 1)

Figure 3: Privacy vulnerability comparison

Figure 5: Effect of batch size on training procedure

Figure 1: Effect of clipping and noise in the differentially private mechanism and τ(C, σ; θ_t) on the MNIST dataset

These methods achieve fairness in addition to enforcing differential privacy in the private model. Most existing work focuses on preventing the extraction of private information while reaching acceptable fairness performance. Only a minority of works address the accuracy disparity among classes with different protected attributes caused by differential privacy. DPSGD-F (Xu et al., 2020) prevents the disparate impact of the private model on accuracy across different groups by scaling the clipping bound with a relative ratio. Unlike their restriction on the fraction of instances whose gradient norms exceed the clipping threshold, our analysis quantifies the bias-variance via the sufficient-decrease gap between non-private and private learning.

Table 1: Utility loss (change in accuracy relative to non-private SGD) on the total population, the well-represented group (Class 2 in Unbalanced MNIST; Male in Adult/Dutch), and the underrepresented group (Class 8 in Unbalanced MNIST; Female in Adult/Dutch). On Unbalanced MNIST, DPSGD loses 74.31 ±5.58 points of accuracy on the underrepresented class (versus 11.86 ±1.09 on the total population), and DP-FedAvg behaves similarly (-73.87 ±3.69 versus -10.85 ±.63), whereas FairDP's loss stays within roughly one point on every group and dataset (e.g., -0.65 ±.13 total, -1.20 ±.69 well-represented, -0.55 ±1.20 underrepresented on Unbalanced MNIST, and gains such as +0.03 ±.09 on Adult).

Table 2: Model fairness comparison

| Dataset | Index | SGD | DPSGD | DP-FedAvg | Opt-Q | DPSGD-F | FairDP |
|---|---|---|---|---|---|---|---|
| Unbalanced MNIST | Atkinson Index | — | 0.6567 ±0.1644 | 0.6382 ±0.1171 | 0.0007 ±0.0002 | 0.0167 ±0.0055 | 0.0007 ±0.0004 |
| | Gini Index | 0.0474 ±0.0087 | 0.8791 ±0.0724 | 0.8524 ±0.0463 | 0.0643 ±0.0092 | 0.2317 ±0.0326 | 0.0581 ±0.0114 |
| | MLD | 0.0006 ±0.0003 | 0.6807 ±0.1762 | 0.6602 ±0.1253 | 0.0007 ±0.0002 | 0.0167 ±0.0055 | 0.0007 ±0.0004 |
| | Theil Index | 0.0006 ±0.0003 | 0.4346 ±0.0826 | 0.4267 ±0.0588 | 0.0007 ±0.0002 | 0.0159 ±0.0051 | 0.0007 ±0.0004 |
| Adult | Atkinson Index | 0.0253 ±0.0002 | 0.0695 ±0.0001 | 0.0695 ±0.0001 | 0.0232 ±0.0007 | 0.0255 ±0.0005 | 0.0253 ±0.0005 |
| | Gini Index | 0.3389 ±0.0017 | 0.5678 ±0.0005 | 0.5678 ±0.0005 | 0.3244 ±0.0050 | 0.3407 ±0.0036 | 0.3394 ±0.0032 |
| | MLD | 0.0253 ±0.0002 | 0.0697 ±0.0001 | 0.0697 ±0.0001 | 0.0232 ±0.0007 | 0.0256 ±0.0005 | 0.0254 ±0.0005 |
| | Theil Index | 0.0257 ±0.0002 | 0.0714 ±0.0001 | 0.0714 ±0.0001 | 0.0236 ±0.0007 | 0.0260 ±0.0005 | 0.0258 ±0.0005 |
| Dutch | Atkinson Index | 0.0150 ±0.0029 | 0.0477 ±0.0096 | 0.0477 ±0.0096 | 0.0251 ±0.0027 | 0.0304 ±0.0035 | 0.0152 ±0.0022 |
| | Gini Index | 0.2726 ±0.0274 | 0.4855 ±0.0486 | 0.4855 ±0.0486 | 0.3536 ±0.0189 | 0.3886 ±0.0222 | 0.2750 ±0.0198 |
| | MLD | 0.0150 ±0.0029 | 0.0478 ±0.0096 | 0.0478 ±0.0096 | 0.0251 ±0.0027 | 0.0304 ±0.0035 | 0.0152 ±0.0022 |
| | Theil Index | 0.0150 ±0.0029 | 0.0477 ±0.0096 | 0.0477 ±0.0096 | 0.0251 ±0.0027 | 0.0303 ±0.0035 | 0.0152 ±0.0022 |

Table 1 presents the utility loss of the different private learning methods relative to classical SGD. Table 2 provides the comparison on model fairness after implementing the different DP methods, using four fairness indexes (Bureau, 2016): the Atkinson Index, Gini Index, MLD, and Theil Index (Appendix D.4). In most cases, FairDP incurs the smallest accuracy loss relative to SGD among the private methods and offers fairness statistics as good as SGD's (a lower value is better). Although Opt-Q slightly improves fairness on the Adult dataset, it reduces both per-class and overall accuracy. Overall, FairDP outperforms the other private learning methods on both model fairness and accuracy, and balances the two.
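For reference, the four inequality indexes in Table 2 have standard closed forms over the per-class accuracies u_k; a sketch is given below. The Atkinson inequality-aversion parameter ε = 0.5 is an assumption; the paper's setting may differ.

```python
import numpy as np

def gini(u):
    """Gini index of per-class accuracies u (0 = perfectly equal)."""
    u = np.sort(np.asarray(u, dtype=float))
    n = len(u)
    return float(((2 * np.arange(1, n + 1) - n - 1) * u).sum() / (n * u.sum()))

def theil(u):
    """Theil T index: mean of r*log(r), with r the ratio of each accuracy
    to the mean accuracy."""
    r = np.asarray(u, dtype=float) / np.mean(u)
    return float(np.mean(r * np.log(r)))

def mld(u):
    """Mean log deviation (Theil L index)."""
    u = np.asarray(u, dtype=float)
    return float(np.mean(np.log(u.mean() / u)))

def atkinson(u, eps=0.5):
    """Atkinson index with inequality-aversion parameter eps (assumed 0.5)."""
    u = np.asarray(u, dtype=float)
    ede = np.mean(u ** (1 - eps)) ** (1 / (1 - eps))  # equally-distributed equivalent
    return float(1 - ede / u.mean())
```

All four indexes are zero when every class attains the same accuracy and grow as the per-class accuracies spread out, which is why FairDP's near-SGD values in Table 2 indicate that it restores the non-private model's balance.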

