ANALYTICAL COMPOSITION OF DIFFERENTIAL PRIVACY VIA THE EDGEWORTH ACCOUNTANT

Abstract

Many modern machine learning algorithms are compositions of simple private algorithms; an increasingly important problem is therefore to efficiently compute the overall privacy loss under composition. In this paper, we introduce the Edgeworth Accountant, an analytical approach to composing differential privacy guarantees of private algorithms. The Edgeworth Accountant starts by losslessly tracking the privacy loss under composition using the f-differential privacy framework (Dong et al., 2022), which allows us to express the privacy guarantees using privacy-loss log-likelihood ratios (PLLRs). As the name suggests, this accountant next uses the Edgeworth expansion (Hall, 2013) to upper and lower bound the probability distribution of the sum of the PLLRs. Moreover, by relying on a technique for approximating complex distributions by simple ones, we demonstrate that the Edgeworth Accountant can be applied to compositions of any noise-addition mechanism. Owing to certain appealing features of the Edgeworth expansion, the (ε, δ)-differential privacy bounds offered by this accountant are non-asymptotic, with essentially no extra computational cost, in contrast to the prior approaches of Koskela et al. (2020) and Gopi et al. (2021), whose running times increase with the number of compositions. Finally, we show that our upper and lower (ε, δ)-differential privacy bounds are tight in certain regimes of training private deep learning models and federated analytics.

1. INTRODUCTION

Differential privacy (DP) provides a mathematically rigorous framework for analyzing and developing private algorithms working on datasets containing sensitive information about individuals (Dwork et al., 2006). This framework, however, often faces challenges in analyzing the privacy loss of complex algorithms such as privacy-preserving deep learning and federated analytics (Ramage & Mazzocchi, 2020; Wang et al., 2021), which are composed of simple private building blocks. A central question in this active area is therefore to understand how the overall privacy guarantee degrades under repeated application of simple algorithms to the same dataset. Continued efforts to address this question have led to the development of relaxations of differential privacy and new privacy analysis techniques (Dwork et al., 2010; Dwork & Rothblum, 2016; Bun et al., 2018; Bun & Steinke, 2016). A recent flurry of activity in this line of research was triggered by Abadi et al. (2016), which proposed a technique called the moments accountant for upper bounding the overall privacy loss of training private deep learning models over many iterations. A shortcoming of the moments accountant is that, while computationally efficient, its privacy bounds are generally not tight: the technique relies on Rényi DP (Mironov, 2017) and its follow-up works (Balle et al., 2018; Wang et al., 2019), whose privacy-loss profiles can be lossy for many mechanisms. Alternatively, another line of work directly composes (ε, δ)-DP guarantees via numerical methods such as the fast Fourier transform (Koskela et al., 2020; Gopi et al., 2021). This approach can be computationally expensive when the number of algorithms under composition is large, which is unfortunately often the case for training deep neural networks.
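To make the cost of composition concrete, the classical (ε, δ)-DP composition theorems already exhibit the trade-off discussed above: basic composition adds privacy budgets linearly, while advanced composition (Dwork et al., 2010) improves the growth to roughly √m at the price of a small additive δ. Below is a minimal illustrative sketch of these two classical bounds (the parameter values are made up for illustration; this is not the accountant developed in this paper):

```python
# Sketch of classical composition bounds for m-fold composition of an
# (eps, delta)-DP mechanism. Illustrative only; not this paper's accountant.
import math

def basic_composition(eps, m):
    """Basic composition: the epsilons simply add up."""
    return m * eps

def advanced_composition(eps, m, delta_prime):
    """Advanced composition (Dwork et al., 2010): the overall epsilon
    grows like sqrt(m), at the cost of an extra additive delta_prime."""
    return eps * math.sqrt(2 * m * math.log(1 / delta_prime)) \
        + m * eps * (math.exp(eps) - 1)

eps, m = 0.01, 10_000
print(basic_composition(eps, m))           # linear growth: 100.0, vacuous
print(advanced_composition(eps, m, 1e-5))  # sqrt(m) growth: about 5.8
```

For ε = 0.01 over m = 10,000 steps, basic composition yields a vacuous overall ε of 100, while advanced composition stays below 6; numerical accountants of the kind discussed in this paper tighten such bounds further.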
Instead, this paper aims to develop computationally efficient lower and upper privacy bounds for the composition of private algorithms with finite-sample guarantees¹, relying on the f-DP framework (Dong et al., 2022), which enables a precise tracking of the privacy loss under composition via a certain operation on the functional privacy parameters. Moreover, Dong et al. (2022) developed an approximation tool for evaluating the overall privacy guarantee using a central limit theorem (CLT), which leads to approximate (ε, δ)-DP guarantees through the duality between (ε, δ)-DP and Gaussian Differential Privacy (GDP, a special type of f-DP) (Dong et al., 2022). While these (ε, δ)-DP guarantees are asymptotically accurate, a usable finite-sample guarantee is lacking in the f-DP framework.

In this paper, we introduce the Edgeworth Accountant as an analytically efficient approach to obtaining finite-sample (ε, δ)-DP guarantees by leveraging the f-DP framework. In short, the Edgeworth Accountant makes use of the Edgeworth approximation (Hall, 2013), a refinement of the CLT with a better convergence rate, to approximate the distribution of the sum of certain random variables that we refer to as privacy-loss log-likelihood ratios (PLLRs). By leveraging a Berry-Esseen type bound derived for the Edgeworth approximation, we obtain non-asymptotic upper and lower privacy bounds that are applicable to privacy-preserving deep learning and federated analytics. At a high level, we compare the approach of our Edgeworth Accountant to the Gaussian Differential Privacy approximation in Figure 1. Additionally, we note that while the rate of the Edgeworth approximation is well understood, the explicit finite-sample error bounds are highly non-trivial.
To the best of our knowledge, this is the first time such a bound has been established in the statistics and differential privacy communities, and it is of independent interest. We provide two versions of our Edgeworth Accountant to fulfill different practical needs: the approximate Edgeworth Accountant (AEA) and the exact Edgeworth Accountant interval (EEAI). The AEA gives an estimate with an asymptotically accurate bound for any number of compositions m; by using a higher-order Edgeworth expansion, this estimate can be made arbitrarily accurate, provided that the Edgeworth series converges, and it is therefore useful in practice for quickly estimating privacy parameters. The EEAI provides a rigorous finite-sample bound on the privacy parameters for any m, computed efficiently. Efficiency is a key advantage of our proposal as a DP accountant: for the composition of m identical mechanisms, our algorithm runs in O(1) time, and for the general case of composing m heterogeneous algorithms, the runtime is O(m), which is information-theoretically optimal. In contrast, fast Fourier transform (FFT)-based algorithms (Gopi et al., 2021) provide accurate finite-sample bounds but only achieve polynomial runtime for the general composition of private algorithms. The suboptimal time complexity of FFT-based methods demands substantial resources when m is large, which is common in practice. For example, in deep learning and federated learning, m is the number of iterations (rounds) and can be very large. To make things worse, in real-world applications the same dataset is often adaptively used or shared among different tasks; to faithfully account for the privacy loss, the DP accountant has to track the cost of each iteration across tasks, further increasing the number of compositions.
Our EEAI is the first DP accountant that simultaneously provides finite-sample guarantees, runs with optimal time complexity, and is highly accurate (when m is large), making it a valuable addition to the current toolbox.
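The refinement underlying both accountants can be illustrated on a toy example where the exact distribution of a sum is available in closed form. The sketch below (not the paper's accountant; the choice of distribution and all parameter values are illustrative) applies the first-order Edgeworth expansion, which corrects the CLT with a skewness term, to the sum of m i.i.d. Exp(1) variables, whose exact CDF is the Erlang distribution:

```python
# Illustrative sketch: first-order Edgeworth refinement of the CLT on a
# toy sum with a closed-form CDF. Not the paper's accountant.
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def erlang_cdf(s, m):
    """Exact CDF of the sum of m i.i.d. Exp(1) variables at s."""
    return 1.0 - math.exp(-s) * sum(s**k / math.factorial(k) for k in range(m))

def clt_cdf(s, m):
    """Plain CLT approximation (the sum has mean m and variance m)."""
    return Phi((s - m) / math.sqrt(m))

def edgeworth_cdf(s, m, skew=2.0):
    """First-order Edgeworth expansion; skew=2 is the third standardized
    cumulant of Exp(1)."""
    x = (s - m) / math.sqrt(m)
    return Phi(x) - phi(x) * (skew / (6.0 * math.sqrt(m))) * (x * x - 1.0)

m, s = 20, 20.0
exact = erlang_cdf(s, m)
print(abs(clt_cdf(s, m) - exact), abs(edgeworth_cdf(s, m) - exact))
```

Near the center of the distribution the plain CLT error is of order m^{-1/2}, while the skewness-corrected expansion is accurate to higher order; this faster convergence is precisely the property the Edgeworth Accountant exploits for the sum of PLLRs.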



¹ Here, "sample" refers to the number of compositions of DP algorithms. From now on, we use the term "finite-sample" to mean that the bound is non-asymptotic in the number of compositions.



Figure 1: Comparison between the GDP approximation in Dong et al. (2022) and our Edgeworth Accountant. Both methods start from the exact composition using f-DP. Upper: Dong et al. (2022) use a CLT-type approximation to obtain a GDP approximation to the f-DP guarantee, then convert it to (ε, δ)-DP via duality (Fact 1). Lower: we losslessly convert the f-DP guarantee to an exact (ε, δ(ε))-DP guarantee, with δ(ε) defined via PLLRs in (3.1), and then apply the Edgeworth approximation to numerically compute the (ε, δ)-DP guarantee.
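For concreteness, the upper path of the figure can be sketched numerically. Assuming m identical steps of the Gaussian mechanism with noise multiplier σ and unit sensitivity (so each step is (1/σ)-GDP and the composition is (√m/σ)-GDP), the duality of Dong et al. (2022) converts the GDP parameter into an (ε, δ(ε))-DP curve; the values of σ and m below are illustrative:

```python
# Minimal sketch of the upper path in Figure 1: GDP approximation of
# m-fold Gaussian-mechanism composition, then conversion to (eps, delta)-DP
# via duality (Dong et al., 2022). sigma and m are illustrative values.
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def delta_from_gdp(mu, eps):
    """Duality: a mechanism is mu-GDP iff it is (eps, delta(eps))-DP for
    every eps >= 0, with delta(eps) given by this formula."""
    return Phi(-eps / mu + mu / 2.0) - math.exp(eps) * Phi(-eps / mu - mu / 2.0)

# Each step with noise multiplier sigma is (1/sigma)-GDP; mu-GDP composes
# in quadrature, so m identical steps give mu = sqrt(m) / sigma.
sigma, m = 50.0, 1000
mu = math.sqrt(m) / sigma
print(delta_from_gdp(mu, eps=1.0))
```

Our Edgeworth Accountant follows the lower path instead, replacing this CLT-based Gaussian approximation with an Edgeworth expansion of the PLLR distribution, together with explicit finite-sample error bounds.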

