LOWER BOUNDS FOR DIFFERENTIALLY PRIVATE ERM: UNCONSTRAINED AND NON-EUCLIDEAN

Abstract

We study lower bounds for differentially private empirical risk minimization (DP-ERM) of convex functions in both the constrained and unconstrained settings, with respect to the general ℓ_p norm rather than the ℓ_2 norm considered by most previous work. We give a simple black-box reduction that lifts lower bounds from the constrained case to the unconstrained case. Moreover, for (ε, δ)-DP, we achieve the optimal Ω(√d/(nε)) lower bound in both the constrained and unconstrained cases and for any ℓ_p geometry with p ≥ 1, by considering the ℓ_1 loss over the ℓ_∞ ball.

1. INTRODUCTION

Since the seminal work of Dwork et al. (2006), differential privacy (DP), defined below, has become the standard rigorous notion of privacy guarantee for machine learning algorithms. In DP-ERM, we are given a family of convex loss functions, where each function ℓ(·; z) is defined on a convex set K ⊆ R^d, and a dataset D = {z_1, ..., z_n}; the goal is to design a differentially private algorithm that minimizes the empirical loss L(θ; D) = (1/n) Σ_{i=1}^n ℓ(θ; z_i). The quantity L(θ; D) - min_{θ' ∈ K} L(θ'; D) is called the excess empirical loss of the solution θ, and measures how θ compares with the best solution in K. DP-ERM was first studied in the constrained case with Euclidean geometry (i.e., with respect to the ℓ_2 norm), and most of the previous literature belongs to this setting. More specifically, the Euclidean constrained case considers convex loss functions defined on a bounded convex set C ⊆ R^d, assuming the functions are 1-Lipschitz over a convex set of diameter 1 with respect to the ℓ_2 norm.
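To make these definitions concrete, the following sketch (an illustrative Python example we add here, not code from the paper; the helper names and the grid-based minimization are our own choices) computes the empirical loss and an approximate excess empirical loss for the ℓ_1 loss ℓ(θ; z) = ||θ - z||_1 over an ℓ_∞ ball, the instance appearing in the abstract's lower-bound construction:

```python
import numpy as np

def empirical_loss(theta, data):
    """Average per-example loss L(theta; D) = (1/n) * sum_i loss(theta; z_i),
    where the per-example loss is the l_1 loss |theta - z|_1."""
    return np.mean([np.sum(np.abs(theta - z)) for z in data])

def excess_empirical_loss(theta, data, radius=1.0, grid=201):
    """Approximate excess loss L(theta; D) - min_{theta' in K} L(theta'; D),
    where K is the l_inf ball of the given radius.

    Because the l_1 loss is separable across coordinates and K is a product
    of intervals, the minimum decomposes: we minimize each coordinate
    independently over a grid of candidate values in [-radius, radius].
    """
    candidates = np.linspace(-radius, radius, grid)
    best = 0.0
    for j in range(data.shape[1]):
        col = data[:, j]
        best += min(np.mean(np.abs(c - col)) for c in candidates)
    return empirical_loss(theta, data) - best
```

For this separable instance the coordinate-wise minimizer is the per-coordinate median of the data, so the excess loss vanishes at that point; a general solver would of course be needed for non-separable losses.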



When δ > 0, we refer to (ε, δ)-DP as approximate-DP; the particular case δ = 0 is called pure-DP.



Definition 1.1 (Differential privacy). A randomized mechanism M is (ε, δ)-differentially private 1 if for any event O ⊆ Range(M) and any neighboring datasets D and D' that differ in a single data element, one has Pr[M(D) ∈ O] ≤ exp(ε) · Pr[M(D') ∈ O] + δ.

Among the rich literature on DP, many fundamental problems are based on empirical risk minimization (ERM), and DP-ERM has become one of the most well-studied problems in the DP community. See, e.g., Chaudhuri & Monteleoni (2008); Rubinstein et al. (2009); Chaudhuri et al. (2011); Kifer et al. (2012); Song et al. (2013); Bassily et al. (2014); Jain & Thakurta (2014); Talwar et al. (2015); Kasiviswanathan & Jin (2016); Fukuchi et al. (2017); Wu et al. (2017); Zhang et al. (2017); Wang et al. (2017); Iyengar et al. (2019); Bassily et al. (2020); Kulkarni et al. (2021); Asi et al. (2021); Bassily et al. (2021b); Wang et al. (2021); Bassily et al. (2021a); Gopi et al. (2022); Arora et al. (2022); Ganesh et al. (2022).
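As a concrete instance of Definition 1.1 (a minimal sketch we add for illustration; the function names are our own), the classical Laplace mechanism releases a statistic with noise calibrated to its sensitivity and satisfies (ε, 0)-DP. The density-ratio check below verifies the defining inequality numerically for a pair of neighboring datasets:

```python
import numpy as np

def laplace_mechanism(data, eps, rng=None):
    """Release the mean of data in [0, 1] with Laplace noise.

    The mean of n points in [0, 1] changes by at most 1/n when a single
    data element changes, so noise of scale (1/n)/eps yields (eps, 0)-DP.
    """
    rng = rng or np.random.default_rng()
    sensitivity = 1.0 / len(data)
    return float(np.mean(data) + rng.laplace(scale=sensitivity / eps))

def laplace_density(x, mu, scale):
    """Density of the Laplace distribution centered at mu with the given scale."""
    return np.exp(-abs(x - mu) / scale) / (2.0 * scale)

def max_density_ratio(data1, data2, eps, xs):
    """Largest output-density ratio between two neighboring datasets.

    Pure eps-DP requires this ratio to be at most exp(eps) at every output x.
    """
    scale = (1.0 / len(data1)) / eps
    mu1, mu2 = np.mean(data1), np.mean(data2)
    return max(laplace_density(x, mu1, scale) / laplace_density(x, mu2, scale)
               for x in xs)
```

Since the two output densities are Laplace distributions whose centers differ by at most the sensitivity 1/n, their pointwise ratio is bounded by exp(ε), which is exactly the δ = 0 case of the inequality in Definition 1.1.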

For pure-DP (i.e., (ε, 0)-DP), the seminal work of Bassily et al. (2014) achieved tight upper and lower bounds of Θ(d/(nε)). As for approximate-DP (i.e., (ε, δ)-DP with δ > 0), previous works Bassily et al.

