OPTIMAL MEMBERSHIP INFERENCE BOUNDS FOR ADAPTIVE COMPOSITION OF SAMPLED GAUSSIAN MECHANISMS Anonymous

Abstract

Given a trained model and a data sample, membership-inference (MI) attacks predict whether the sample was in the model's training set. A common countermeasure against MI attacks is to use differential privacy (DP) during model training to mask the presence of individual examples. While this use of DP is a principled approach to limiting the efficacy of MI attacks, there is a gap between the bounds provided by DP and the empirical performance of MI attacks. In this paper, we derive bounds on the advantage of an adversary mounting an MI attack, and demonstrate tightness for the widely used Gaussian mechanism. Our analysis answers an open problem in the field of differential privacy, namely why membership inference is not 100% successful even for relatively high privacy budgets (ε > 10). Finally, using our analysis, we provide MI metrics for models trained on the CIFAR-10 dataset. To the best of our knowledge, our analysis provides the state-of-the-art membership inference bounds.

1. INTRODUCTION

The recent success of machine learning models makes them the go-to approach for a variety of problems, ranging from computer vision (Krizhevsky et al., 2012) to NLP (Sutskever et al., 2014), including applications to sensitive data such as health records or chatbots. Access to a trained machine learning model, be it a white-box published model or a black-box API, can leak traces of information (Dwork et al., 2015) from the training data. Researchers have tried to measure this information leakage through metrics such as membership inference (Shokri et al., 2017). Membership inference is the task of guessing, given a trained model, whether its training set includes a given sample or not. This task is interesting in its own right, as the participation of an individual in a data collection can be sensitive information. Furthermore, it also serves as the "most significant bit" of information: if membership inference fails, attacks revealing more information, such as reconstruction attacks (Fredrikson et al., 2014; Carlini et al., 2020), will also fail. In other words, defending against membership inference attacks also defends against more advanced attacks. The standard approach to provably defeat these membership privacy attacks is differential privacy (Dwork et al., 2006). The traditional variant of differential privacy defines a class of algorithms that apply to a database D and respect a privacy budget ε and a probability of failure δ.¹ These parameters control the divergence between the distributions of outcomes of the algorithm when applied to two "neighboring" databases that differ in only one location. In particular, our mechanism of interest, the Sampled Gaussian Mechanism (SGM), makes an arbitrary function f : X* → ℝ^d differentially private by first sub-sampling a smaller database from the original database using Poisson sampling, then calculating f on the sub-sampled database, and finally adding Gaussian noise to the outcome.
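The SGM described above can be sketched in a few lines of code (a minimal illustration; the function name and signature are our own, not taken from the paper):

```python
import numpy as np

def sampled_gaussian_mechanism(data, f, q, sigma, rng):
    """One invocation of the Sampled Gaussian Mechanism (illustrative sketch).

    Each record is kept independently with probability q (Poisson sampling),
    f is evaluated on the subsample, and Gaussian noise with standard
    deviation sigma is added to the output.
    """
    mask = rng.random(len(data)) < q                      # Poisson subsampling
    subsample = [x for x, keep in zip(data, mask) if keep]
    out = np.asarray(f(subsample), dtype=float)
    return out + rng.normal(0.0, sigma, size=out.shape)   # Gaussian noise

rng = np.random.default_rng(0)
data = list(range(100))
noisy_sum = sampled_gaussian_mechanism(data, lambda s: [sum(s)], q=0.01, sigma=1.0, rng=rng)
```

Note that the sampling is Poisson (each record is included with independent probability q), not a fixed-size subsample; the distinction matters for the privacy analysis.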
Increasing the amount of injected noise provides stronger privacy guarantees, enabling a trade-off between privacy and utility of the trained model. To measure the privacy of a given algorithm, researchers have developed advanced mathematical tools and notions such as Rényi differential privacy (Mironov, 2017; Abadi et al., 2016) and advanced composition theorems (Dwork et al., 2010; Kairouz et al., 2015). These tools allow us to calculate (ε, δ) values for carefully designed algorithms. Previous work has shown that any differentially private algorithm provably bounds the accuracy of any membership inference adversary. Specifically, Yeom et al. (2018), Humphries et al. (2020), and Thudi et al. (2022) prove that any model trained with (ε, δ) differential privacy induces an upper bound on the accuracy of membership inference. These upper bounds help practitioners calibrate the privacy budget to defeat the adversary. As can be seen in Figure 1, there is a large gap between the best existing upper bound and the performance of empirical attacks. In this work, we use a new proof technique to bypass DP analysis and directly bound the advantage of MI attacks, closing the gap between theoretical guarantees and empirical performance. The most widely used DP algorithm in machine learning applications is DP-SGD: a small modification of the classical stochastic gradient descent algorithm that only requires clipping per-sample gradients, averaging them, and adding Gaussian noise. Each iteration of DP-SGD is an instance of the Sampled Gaussian Mechanism, which chooses a fraction q of the dataset and outputs a noisy sum of the desired quantity. DP-SGD has been shown to be more accurate than other differentially private training algorithms in the case of linear and convex models (van der Maaten and Hannun, 2020). Our analysis mirrors the recent shift in the field of empirical membership inference, from advantage (or accuracy) metrics (Shokri et al., 2017; Yeom et al., 2018; Sablayrolles et al., 2019) to precision/recall (true positive/false positive) measures (Watson et al., 2022; Carlini et al., 2020).
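The per-iteration DP-SGD update just described can be sketched as follows (a minimal sketch of one noisy gradient step, assuming per-sample gradients are already available; the function name and hyperparameter values are our own):

```python
import numpy as np

def dpsgd_gradient(per_sample_grads, clip_norm, sigma, rng):
    """One noisy gradient estimate as in DP-SGD (illustrative sketch).

    Clips each per-sample gradient to L2 norm clip_norm, sums the clipped
    gradients, adds Gaussian noise scaled to the clipping threshold, and
    averages over the batch.
    """
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))  # clip to clip_norm
    noise = rng.normal(0.0, sigma * clip_norm, size=per_sample_grads[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(per_sample_grads)

rng = np.random.default_rng(0)
grads = [np.array([3.0, 4.0]), np.array([0.6, 0.8])]
noisy_grad = dpsgd_gradient(grads, clip_norm=1.0, sigma=0.0, rng=rng)
```

Clipping bounds each sample's contribution (the sensitivity of the sum), which is what allows the Gaussian noise scale to be calibrated to clip_norm rather than to the raw gradient magnitudes.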
Our result shows that no single parameter (either ε or membership inference advantage) can explain the performance of adversaries in different settings. In other words, the same value of ε can lead to significantly different membership inference advantages, and vice versa. As shown in Figures 1c and 1d, this phenomenon holds for the entire precision/recall curve: the precision of the adversary at a given recall value cannot be explained by an ε or advantage parameter alone. Our contributions in this work are as follows:



¹ While intuitive, describing δ as the probability of failure is mathematically inaccurate (Meiser, 2018).



[Figure 1 panels: (a) no subsampling; (b) with subsampling (q = 0.01); (c) false positive vs. true positive rate; (d) FP vs. TP for small FP rates]

Figure 1: Comparison of advantage bounds in a Gaussian setup. We run the Sampled Gaussian Mechanism in one dimension with 50 epochs, with either no subsampling (a) or subsampling with q = 0.01 (b). Our attack compares the likelihood of the observations under both hypotheses and chooses the more likely one (see Appendix F.1 for details). We compute the empirical advantage by measuring the fraction of time p the adversary guesses membership correctly, and report 2p − 1. In the two bottom panels (c and d) we compare upper bounds on the true positive rate obtained from differential privacy analysis versus our direct analysis (details are provided in Appendix A).
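The empirical attack and advantage estimate described in the caption can be sketched as a small Monte Carlo simulation (a minimal one-dimensional setup without subsampling; the function name and parameters are our own, not the exact protocol of Appendix F.1):

```python
import numpy as np

def empirical_advantage(sigma, n_trials, rng):
    """Likelihood-ratio membership test in a 1-D Gaussian setup (sketch).

    In each trial, a fair coin b decides membership; the adversary observes
    y ~ N(b, sigma^2) and guesses b = 1 iff y is more likely under
    N(1, sigma^2) than under N(0, sigma^2), i.e. iff y > 1/2. The empirical
    advantage is 2p - 1, where p is the adversary's accuracy.
    """
    b = rng.integers(0, 2, size=n_trials)            # true membership bits
    y = b + rng.normal(0.0, sigma, size=n_trials)    # noisy observations
    guess = (y > 0.5).astype(int)                    # likelihood-ratio rule
    p = np.mean(guess == b)
    return 2 * p - 1

rng = np.random.default_rng(0)
adv_low_noise = empirical_advantage(sigma=0.05, n_trials=2000, rng=rng)
adv_high_noise = empirical_advantage(sigma=50.0, n_trials=2000, rng=rng)
```

As expected, with little noise the advantage approaches 1, while with heavy noise it collapses toward 0.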

