ON THE UNIVERSALITY OF LANGEVIN DIFFUSION FOR PRIVATE EUCLIDEAN (CONVEX) OPTIMIZATION

Anonymous

Abstract

In this paper, we revisit the problems of differentially private empirical risk minimization (DP-ERM) and differentially private stochastic convex optimization (DP-SCO). We show that a well-studied continuous-time algorithm from statistical physics, called Langevin diffusion (LD), simultaneously provides optimal privacy/utility trade-offs for both DP-ERM and DP-SCO, under both ε-DP and (ε, δ)-DP, for convex as well as strongly convex loss functions. We establish new time- and dimension-independent uniform stability properties of LD, from which we derive the corresponding optimal excess population risk guarantees under ε-DP. An important attribute of our DP-SCO guarantees under ε-DP is that they match the optimal non-private bounds as ε → ∞.

1. INTRODUCTION

Over the last decade, there has been significant progress in providing tight upper and lower bounds for differentially private empirical risk minimization (DP-ERM) (Chaudhuri et al., 2011; Kifer et al., 2012; Bassily et al., 2014; Song et al., 2013; McMahan et al., 2017; Smith et al., 2017; Wu et al., 2017; Iyengar et al., 2019; Song et al., 2020; Chourasia et al., 2021) and differentially private stochastic convex optimization (DP-SCO) (Bassily et al., 2019; Feldman et al., 2020; Bassily et al., 2020; Kulkarni et al., 2021; Gopi et al., 2022; Asi et al., 2021b), both in the ε-DP setting and in the (ε, δ)-DP setting¹. While tight bounds are known for both DP-ERM and DP-SCO in the (ε, δ)-DP setting (Bassily et al., 2014; 2019), the landscape is far less understood in the ε-DP setting (i.e., where δ = 0). First, to the best of our knowledge, tight DP-SCO bounds are not known under ε-DP. (Throughout this paper, when we say a bound is tight for a problem, we implicitly require the bound to recover the optimal non-private bound, including polylogarithmic factors, for the same task as ε → ∞.) Second, the algorithms for both DP-ERM and DP-SCO in the ε-DP setting are inherently different from those in the (ε, δ)-DP setting. While all the algorithms in the (ε, δ)-DP setting are based on DP variants of gradient descent (Bassily et al., 2014; 2019; Feldman et al., 2020; Bassily et al., 2020), the best algorithms for ε-DP are based on a combination of the exponential mechanism (McSherry & Talwar, 2007) and output perturbation (Chaudhuri et al., 2011). Third, we know that moving from ε-DP to (ε, δ)-DP yields, for convex problems, a polynomial improvement in the error bounds in terms of the model dimensionality p. It is unknown whether such an improvement is even possible when the loss functions are non-convex. In this work, we close these gaps in our understanding of DP-ERM and DP-SCO via the following contributions.
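For reference, a randomized algorithm M satisfies (ε, δ)-DP if, for all datasets D, D′ differing in one record and all measurable events S, Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] + δ; ε-DP is the special case δ = 0.

To make the central object concrete, the following is a minimal sketch of the standard Euler–Maruyama discretization of Langevin diffusion: noisy projected gradient descent on the empirical loss, θ ← Π_C(θ − η∇L(θ) + √(2η/β)·N(0, I)). The names and parameters here (grad_fn, eta, the inverse temperature beta, and the ℓ₂-ball radius) are illustrative assumptions, not this paper's exact instantiation; in particular, the calibration of β and the number of steps needed for a given (ε, δ) is precisely what the paper's analysis determines.

    import numpy as np

    def langevin_step_sketch(grad_fn, theta0, eta=1e-3, beta=1.0,
                             steps=1000, radius=1.0, rng=None):
        # Euler-Maruyama discretization of Langevin diffusion:
        #   theta <- Proj_C( theta - eta * grad L(theta)
        #                    + sqrt(2 * eta / beta) * N(0, I) ).
        rng = np.random.default_rng() if rng is None else rng
        theta = np.asarray(theta0, dtype=float).copy()
        for _ in range(steps):
            noise = rng.standard_normal(theta.shape)
            theta = theta - eta * grad_fn(theta) \
                    + np.sqrt(2.0 * eta / beta) * noise
            # Project back onto the l2-ball of the given radius,
            # matching the bounded-constraint-set assumption.
            norm = np.linalg.norm(theta)
            if norm > radius:
                theta *= radius / norm
        return theta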



¹We focus only on L-Lipschitz losses where the constraint set is bounded in the ℓ₂-norm; the non-Euclidean setting (Talwar et al., 2015; Asi et al., 2021a; Bassily et al., 2021) is beyond the scope of this work.

