FAIR DIFFERENTIAL PRIVACY CAN MITIGATE THE DISPARATE IMPACT ON MODEL ACCURACY

Anonymous authors
Paper under double-blind review

Abstract

Techniques based on the theory of differential privacy (DP) have become a standard building block in the machine learning community. DP training mechanisms offer strong guarantees that an adversary analyzing the released model cannot determine with high confidence whether a particular instance was present in the training data, let alone any details of that instance. However, DP may disproportionately affect underrepresented and relatively complicated classes; that is, the reduction in utility is unequal across classes. This paper proposes a fair differential privacy algorithm (FairDP) to mitigate the disparate impact on each class's model accuracy. We cast the learning procedure as a bilevel programming problem that integrates differential privacy with fairness. FairDP establishes a self-adaptive DP mechanism and dynamically adjusts instance influence in each class according to a theoretical bias-variance bound. Our experimental evaluation shows the effectiveness of FairDP in mitigating the disparate impact on model accuracy among classes on several benchmark datasets and in scenarios ranging from text to vision.

1. INTRODUCTION

Protecting data privacy is a significant concern in many data-driven decision-making applications (Zhu et al., 2017), such as social networking services, recommender systems, and location-based services. For example, the United States Census Bureau will employ differential privacy for the first time on the 2020 census data (Bureau, 2020). Differential privacy (DP) guarantees that the released model cannot be exploited by attackers to determine whether one particular instance is present or absent in the training dataset (Dwork et al., 2006). However, DP intentionally restricts instance influence and introduces noise into the learning procedure. When we enforce DP on a model, DP may amplify the discriminative effect on underrepresented and relatively complicated classes (Bagdasaryan et al., 2019; Du et al., 2020; Jaiswal & Provost, 2020). That is, the reduction in accuracy from non-private learning to private learning may be uneven across classes. There are several empirical studies of this utility reduction: Bagdasaryan et al. (2019) and Du et al. (2020) show that the model accuracy in private learning tends to decrease more on classes that already have lower accuracy in non-private learning, while Jaiswal & Provost (2020) report a different observation, namely that the inequality in accuracy is not consistent across classes over multiple setups and datasets. Caution is warranted: although private learning improves individual participants' privacy, the model's performance should not harm one class more than others.

A machine learning model, specifically in supervised learning tasks, outputs a hypothesis f(x; θ) parameterized by θ, which predicts the label y given the unprotected attributes x. Each instance's label y belongs to a class k.
The model aims to minimize the objective (loss) function $L(\theta; x, y)$, i.e.,

$$\theta^* := \arg\min_{\theta} \mathbb{E}\left[L(\theta; x, y)\right]. \qquad (1)$$

Our work builds on a recent advance in training machine learning models with a differentially private mechanism, namely DPSGD (Abadi et al., 2016), for releasing the model. The key idea can be extended to other DP mechanisms with a specialized noise form (generally a Laplacian or Gaussian distribution). The iterative update scheme of DPSGD at the $(t+1)$-th iteration is of the form

$$\theta_{t+1} = \theta_t - \mu_t \cdot \frac{1}{n}\left(\sum_{i \in S_t} \frac{g_t(x_i)}{\max\left(1, \frac{\|g_t(x_i)\|_2}{C}\right)} + \xi\right),$$

where $S_t$ is the sampled batch of size $n$, $\mu_t$ is the learning rate, $g_t(x_i)$ is the gradient of the loss on instance $x_i$, $C$ is the clipping bound, and $\xi$ is the injected noise.
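To make the update rule concrete, the following is a minimal NumPy sketch of one DPSGD step: each per-example gradient is clipped to norm at most C, the clipped gradients are summed, Gaussian noise is added, and the noisy average drives the parameter update. The function name `dpsgd_step` and the flat-vector gradient representation are illustrative simplifications, not the paper's implementation.

```python
import numpy as np

def dpsgd_step(theta, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DPSGD update (Abadi et al., 2016, simplified sketch).

    Clips each per-example gradient g to norm at most clip_norm via
    g / max(1, ||g||_2 / C), sums the clipped gradients, adds Gaussian
    noise scaled by noise_multiplier * clip_norm, averages, and takes
    a gradient step.
    """
    n = len(per_example_grads)
    clipped = [
        g / max(1.0, np.linalg.norm(g) / clip_norm)
        for g in per_example_grads
    ]
    # Noise standard deviation is proportional to the clipping bound C,
    # so the guarantee holds regardless of the true gradient magnitudes.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=theta.shape)
    noisy_mean = (np.sum(clipped, axis=0) + noise) / n
    return theta - lr * noisy_mean
```

With `noise_multiplier=0` this reduces to SGD with per-example gradient clipping, which is a convenient sanity check: a gradient of norm 5 with `clip_norm=1` is scaled down to unit norm before the step.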