INDIVIDUAL PRIVACY ACCOUNTING WITH GAUSSIAN DIFFERENTIAL PRIVACY

Abstract

Individual privacy accounting enables bounding the differential privacy (DP) loss individually for each participant involved in the analysis. This can be informative, as the individual privacy losses are often considerably smaller than those indicated by DP bounds based on worst-case bounds at each data access. In order to account for the individual privacy losses in a principled manner, we need a privacy accountant for adaptive compositions of randomised mechanisms, where the loss incurred at a given data access is allowed to be smaller than the worst-case loss. This kind of analysis has been carried out for Rényi differential privacy by Feldman and Zrnic (12), however not yet for the so-called optimal privacy accountants. We take first steps in this direction by providing a careful analysis using Gaussian differential privacy, which gives optimal bounds for the Gaussian mechanism, one of the most versatile DP mechanisms. This approach is based on determining a certain supermartingale for the hockey-stick divergence and on extending the Rényi divergence-based fully adaptive composition results of Feldman and Zrnic (12). We also consider measuring the individual (ε, δ)-privacy losses using so-called privacy loss distributions. With the help of the Blackwell theorem, we can then make use of the results of Feldman and Zrnic (12) to construct an approximative individual (ε, δ)-accountant.

1. INTRODUCTION

Differential privacy (DP) (8) provides means to accurately bound the compound privacy loss of multiple accesses to a database. Common DP composition accounting techniques, such as Rényi differential privacy (RDP) based techniques (23; 33; 38; 24) or so-called optimal accounting techniques (19; 15; 37), require that the privacy parameters of all algorithms are fixed beforehand. Rogers et al. (28) were the first to analyse fully adaptive compositions, wherein the mechanisms are allowed to be selected adaptively. Rogers et al. (28) introduced two objects for measuring privacy in fully adaptive compositions: privacy filters, which halt the algorithms when a given budget is exceeded, and privacy odometers, which output bounds on the privacy loss incurred so far. Whitehouse et al. (34) have tightened these composition bounds using filters to match the tightness of the so-called advanced composition theorem (9). Feldman and Zrnic (12) obtain similar (ε, δ)-asymptotics via RDP analysis. In their RDP analysis, Feldman and Zrnic (12) consider individual filters, where the algorithm stops releasing information about the data elements that have exceeded a pre-defined RDP budget. This kind of individual analysis has not yet been considered for the optimal privacy accountants. We take first steps in this direction by providing a fully adaptive individual DP analysis using Gaussian differential privacy (7). Our analysis leads to tight bounds for the Gaussian mechanism, and it is based on determining a certain supermartingale for the hockey-stick divergence and on using proof techniques similar to those in the RDP-based fully adaptive composition results of Feldman and Zrnic (12). We note that the idea of individual accounting of privacy losses has previously been considered in various forms by, e.g., Ghosh and Roth (13); Ebadi et al. (10); Wang (32); Cummings and Durfee (6); Ligett et al. (22); Redberg and Wang (27). We also consider measuring the individual (ε, δ)-privacy losses using so-called privacy loss distributions (PLDs).
Using the Blackwell theorem, we can in this case rely on the results of (12) to construct an approximative (ε, δ)-accountant that often leads to smaller individual ε-values than commonly used RDP accountants. For this accountant, evaluating the individual DP parameters using the existing methods requires computing an FFT at each step of the adaptive analysis. We speed up this computation by placing the individual DP hyperparameters into well-chosen buckets and by using pre-computed Fourier transforms. Moreover, by using the Plancherel theorem, we obtain a further speed-up.
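To make the PLD approach concrete, the following sketch (our illustration, not the implementation described above) discretises the privacy loss distribution of the Gaussian mechanism, composes the mechanism with itself by FFT-based convolution of the PLD masses, and reads off δ(ε) from the standard tail formula δ(ε) = E_{s∼PLD}[(1 − e^{ε−s})_+]. The grid size and truncation range are arbitrary choices for the example.

```python
import numpy as np

def gaussian_pld(sigma, n=4001, half_width=10.0):
    """Discretised PLD of the Gaussian mechanism with sensitivity 1.

    The privacy loss log(P(t)/Q(t)) with P = N(1, sigma^2), Q = N(0, sigma^2)
    is itself Gaussian with mean 1/(2 sigma^2) and standard deviation 1/sigma.
    Returns (grid of loss values, probability masses on that grid).
    """
    mean, std = 1.0 / (2 * sigma ** 2), 1.0 / sigma
    s = np.linspace(mean - half_width * std, mean + half_width * std, n)
    h = s[1] - s[0]
    w = np.exp(-0.5 * ((s - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi)) * h
    return s, w / w.sum()

def compose_via_fft(s, w):
    """Self-compose a PLD once: FFT-based convolution of the masses.

    Loss values of the composition live on the grid starting at 2*s[0],
    with the same spacing as the input grid.
    """
    m = 2 * len(w) - 1
    w2 = np.fft.irfft(np.fft.rfft(w, m) ** 2, m)
    h = s[1] - s[0]
    s2 = 2 * s[0] + h * np.arange(m)
    return s2, w2

def delta_from_pld(s, w, eps):
    """delta(eps) = sum over losses s > eps of (1 - exp(eps - s)) * mass."""
    tail = s > eps
    return float(np.sum((1.0 - np.exp(eps - s[tail])) * w[tail]))

s, w = gaussian_pld(sigma=1.0)
s2, w2 = compose_via_fft(s, w)
print(delta_from_pld(s2, w2, eps=1.0))  # close to the tight value, about 0.2862
```

For two compositions of the Gaussian mechanism with σ = 1, the result agrees with the tight δ(ε) given by Gaussian differential privacy with μ = √2.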

1.1. OUR CONTRIBUTIONS

Our main contributions are the following:
• We show how to analyse fully adaptive compositions of DP mechanisms using Gaussian differential privacy. Our results give tight (ε, δ)-bounds for compositions of Gaussian mechanisms and are the first results with tight bounds for fully adaptive compositions.
• Using the concept of dominating pairs of distributions and by utilising the Blackwell theorem, we propose an approximative individual (ε, δ)-accountant that in several cases leads to smaller individual ε-bounds than the individual RDP analysis.
• We propose efficient numerical techniques to compute individual privacy parameters using privacy loss distributions (PLDs) and the FFT algorithm. We show that individual ε-values can be accurately approximated in O(n) time, where n is the number of discretisation points for the PLDs. Due to lack of space, this is described in Appendix D.
• We give experimental results that illustrate the benefits of replacing the RDP analysis with GDP accounting or with FFT-based numerical accounting techniques. As an observation of independent interest, we notice that individual filtering leads to a disparate loss of accuracy among subgroups when training a neural network using DP gradient descent.
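As a concrete illustration of the GDP calculus underlying the first contribution (standard facts from Dong et al. (7), not code from this work): a Gaussian mechanism with sensitivity 1 and noise scale σ is (1/σ)-GDP, μ_i-GDP mechanisms compose to (√(Σ_i μ_i²))-GDP, and a μ-GDP mechanism is tightly (ε, δ(ε))-DP with δ(ε) = Φ(−ε/μ + μ/2) − e^ε Φ(−ε/μ − μ/2).

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def compose_gdp(mus):
    """GDP parameters compose in Euclidean norm: mu_tot = sqrt(sum mu_i^2)."""
    return math.sqrt(sum(mu * mu for mu in mus))

def gdp_to_delta(mu, eps):
    """Tight delta(eps) for a mu-GDP mechanism (Dong et al., 2022)."""
    return (std_normal_cdf(-eps / mu + mu / 2)
            - math.exp(eps) * std_normal_cdf(-eps / mu - mu / 2))

# Ten steps of a Gaussian mechanism with sigma = 5, i.e. mu_i = 1/5 each:
mu_tot = compose_gdp([1.0 / 5.0] * 10)
print(mu_tot)                     # sqrt(10)/5, about 0.632
print(gdp_to_delta(mu_tot, 1.0))  # tight delta at eps = 1
```

The point of the fully adaptive analysis is that the μ_i here need not be fixed in advance; they may be chosen adaptively, and the individual ones may be much smaller than the worst case.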

2.1. DIFFERENTIAL PRIVACY

We first briefly review the definitions and results required for our analysis. For a more detailed discussion, see e.g. (7) and (37). An input dataset containing N data points is denoted as X = (x_1, . . . , x_N) ∈ X^N, where x_i ∈ X, 1 ≤ i ≤ N. We say that X and X′ are neighbours if one is obtained from the other by adding or removing one element (denoted X ∼ X′). Similarly to Feldman and Zrnic (12), we denote by X_{-i} the dataset obtained by removing element x_i from X, i.e. X_{-i} = (x_1, . . . , x_{i-1}, x_{i+1}, . . . , x_N). A mechanism M is (ε, δ)-DP if its outputs are (ε, δ)-indistinguishable for neighbouring datasets.

Definition 1. Let ε ≥ 0 and δ ∈ [0, 1]. A mechanism M : X^N → O is (ε, δ)-DP if for every pair of neighbouring datasets X, X′ and every measurable set E ⊂ O,
P(M(X) ∈ E) ≤ e^ε P(M(X′) ∈ E) + δ.
We call M tightly (ε, δ)-DP if there does not exist δ′ < δ such that M is (ε, δ′)-DP.

The (ε, δ)-DP bounds can also be characterised using the hockey-stick divergence. For α > 0, the hockey-stick divergence H_α from a distribution P to a distribution Q is defined as
H_α(P || Q) = ∫ [P(t) − α · Q(t)]_+ dt,
where [x]_+ = max{0, x} for x ∈ R. Tight (ε, δ)-values for a given mechanism can be obtained using the hockey-stick divergence: M is tightly (ε, δ)-DP for δ(ε) = max_{X ∼ X′} H_{e^ε}(M(X) || M(X′)).
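For intuition, the hockey-stick divergence can be evaluated numerically straight from its definition. The sketch below (our illustration; the grid choices are arbitrary) integrates [P(t) − e^ε Q(t)]_+ for the Gaussian mechanism with sensitivity 1, where P = N(1, σ²) and Q = N(0, σ²), and checks the result against the known closed form δ(ε) = Φ(1/(2σ) − εσ) − e^ε Φ(−1/(2σ) − εσ).

```python
import math

def normal_pdf(t, mean, sigma):
    return math.exp(-0.5 * ((t - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def hockey_stick_numeric(eps, sigma, n=200001, lo=-12.0, hi=13.0):
    """H_{e^eps}(P || Q): Riemann sum of [P(t) - e^eps * Q(t)]_+ on a grid."""
    h = (hi - lo) / (n - 1)
    alpha = math.exp(eps)
    total = 0.0
    for j in range(n):
        t = lo + j * h
        total += max(0.0, normal_pdf(t, 1.0, sigma) - alpha * normal_pdf(t, 0.0, sigma)) * h
    return total

def hockey_stick_analytic(eps, sigma):
    """Tight delta(eps) of the Gaussian mechanism with sensitivity 1."""
    return (std_normal_cdf(1 / (2 * sigma) - eps * sigma)
            - math.exp(eps) * std_normal_cdf(-1 / (2 * sigma) - eps * sigma))

print(hockey_stick_numeric(1.0, 1.0))   # about 0.1269
print(hockey_stick_analytic(1.0, 1.0))  # about 0.1269
```

The agreement of the two values illustrates why the hockey-stick characterisation gives tight, rather than merely valid, (ε, δ)-bounds for the Gaussian mechanism.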




