

Abstract

Cloud-based machine learning inference is an emerging paradigm in which users share their data with a service provider. Due to increased concerns over data privacy, recent works have proposed using Adversarial Representation Learning (ARL) to learn a privacy-preserving encoding of sensitive user data before it is shared with an untrusted service provider. Traditionally, the privacy of these encodings is evaluated empirically, as they lack formal guarantees. In this work, we develop a new framework that provides formal privacy guarantees for an arbitrarily trained neural network by linking its local Lipschitz constant with its local sensitivity. To utilize local sensitivity for guaranteeing privacy, we extend the Propose-Test-Release (PTR) framework to make it tractable for neural network based queries. We verify the efficacy of our framework experimentally on real-world datasets and elucidate the role of ARL in improving the privacy-utility tradeoff.

1. INTRODUCTION

The ethical and regulatory concerns around data privacy have become increasingly important with the adoption of machine learning (ML) across various sectors such as health, finance, and mobility. Although training ML models privately has seen tremendous progress (Abadi et al. (2016); Papernot et al. (2016); Du et al. (2021); Jordon et al. (2018)) in the last few years, protecting privacy during the inference phase remains a challenge as these models get deployed by cloud-based service providers. Cryptographic techniques (Ohrimenko et al. (2016); Knott et al. (2021); Mishra et al. (2020); Juvekar et al. (2018)) address this challenge by performing computation over encrypted data. However, to combat the high computational cost of encryption, alternative works have used ARL to suppress task-irrelevant information from data. While ARL-based techniques have shown promising empirical results, they lack formal privacy guarantees over obfuscated representations due to their use of Deep Neural Networks (DNNs) for achieving privacy. For the first time, we show how to give formal privacy guarantees for inference queries over arbitrarily trained (including ARL) DNNs. The key component of any ARL algorithm is an obfuscator that is trained to encode a user's private data such that an attacker cannot recover the original data from its encoding. Achieving formal privacy guarantees for an obfuscator has remained elusive due to the non-convexity of the training objective of DNNs. In this work, we take a posthoc approach to guaranteeing privacy, where the privacy of data is evaluated after the obfuscator is learned. Because the obfuscator is trained for non-invertibility, we hypothesize that the obfuscator network should act as a contractive mapping and hence increase the stability of the function in its local neighborhood, i.e., reduce sensitivity.
Therefore, we measure the stability of an adversarially learned obfuscator network using Lipschitz constants and link it with privacy properties. To exactly compute the local Lipschitz constant of a non-linear (ReLU) DNN, we use LipMip (Jordan & Dimakis (2020)), a mixed-integer programming based technique, and reformulate the ARL pipeline to ensure that computing the Lipschitz constant is computationally feasible. To draw a connection between the local Lipschitz constant and reconstruction privacy, we introduce a privacy definition that is a specific instantiation of the general dχ-privacy framework of Chatzikokolakis et al. (2013). Instead of evaluating the global Lipschitz constant of a DNN, we evaluate the Lipschitz constant only in the local neighborhood of the user's sensitive data. We extend the Propose-Test-Release (PTR) framework (Dwork & Lei (2009)) to formalize this neighborhood-based measurement of the Lipschitz constant. The scope of our paper is to provide privacy guarantees against reconstruction attacks for existing ARL techniques, i.e., our goal is not to develop a new ARL technique but rather to develop a formal privacy framework compatible with existing ones. A majority of ARL techniques protect either a sensitive attribute or the input against reconstruction; we only consider sensitive inputs in this work. We adopt a different threat model from that of traditional differential privacy (DP) (Dwork et al. (2014)) because, as we explain later, protecting against membership inference is at odds with private inference. Our threat model for the reconstruction attack is motivated by use cases where a user may be willing to disclose coarse-grained information about their data but wants to prevent leakage of fine-grained information. Such alternate threat models have been widely used in the privacy literature (Chatzikokolakis et al. (2013); Kifer & Machanavajjhala (2014); Andrés et al. (2013); Hannun et al. (2021)).
Furthermore, we only focus on protecting the privacy of data during the inference stage and assume that ML models can be trained privately. Typically, ARL techniques evaluate the privacy of their representations by empirically measuring information leakage using a proxy adversary (e.g., Srivastava et al. (2019; 2021)). However, most of these works analyze specific obfuscation techniques and lack formal privacy definitions. In contrast, our work is agnostic to the design of the obfuscator as long as it is differentiable, and our definition is built upon a variant of DP, a widely used formal privacy framework. Our privacy definition and mechanism are built upon dχ-privacy (Chatzikokolakis et al. (2013)) and PTR (Dwork & Lei (2009)). Existing instantiations of dχ-privacy include geo-indistinguishability (Andrés et al. (2013)) and location-dependent privacy (Koufogiannis & Pappas (2016)), which share a goal similar to ours of sharing coarse-grained information. Our work differs in its usage of neural network queries and high dimensional data modalities. We refer the reader to Appendix Sec A for a detailed literature review. In Sec 2 we begin with the preliminaries of DP and its variant for metric spaces. Then, we motivate our ML inference setup and introduce our privacy definition in Sec 3. Next, we construct our posthoc framework by extending PTR and proving its privacy guarantees in Sec 4. In Sec 5 we experimentally demonstrate the feasibility of our framework and study the dynamics of ARL algorithms. Our contributions can be summarized as follows:
• We introduce the (ϵ, δ, R)-neighborhood privacy definition to formalize reconstruction privacy for ARL-based inference.
• We extend the PTR framework to make it tractable for neural network based queries. Our extension bridges the gap between formal privacy frameworks and empirical techniques in private ML inference.
• We perform extensive experimental analysis on ARL techniques and provide insight into how ARL improves the privacy-utility tradeoff by reducing the local sensitivity of DNNs.

2. PRELIMINARIES

Differential privacy (DP) (Dwork et al. (2014)) is a widely used framework for answering a query f on a dataset x ∈ χ by applying a mechanism M(•) such that the probability distribution of the output M(x) is similar regardless of the presence or absence of any individual in the dataset x. More formally, M satisfies (ϵ, δ)-DP if ∀x, x′ ∈ χ such that d_H(x, x′) ≤ 1, and for all (measurable) outputs S over the range of M, P(M(x) ∈ S) ≤ e^ϵ P(M(x′) ∈ S) + δ, where d_H is the Hamming distance. This definition is based on a trusted central server model, where a trusted third party collects sensitive data and computes M(x) to share with untrusted parties. In local DP (Kasiviswanathan et al. (2011)), this model has been extended such that each user shares M(x), and the service provider is untrusted. Our threat model is a special case of local DP which we refer to as single-instance sharing. In this setup, the client queries every data instance independently with the service provider, and there is no aggregation or summary statistic involved. For example, a user shares a face image to receive an age prediction from the service provider. While our setup is similar to item-level local DP, the answer to the query depends exactly on a single input. We note that d_H(x, x′) ≤ 1, ∀x, x′ ∈ χ, whenever single-instance sharing is involved.
[Figure 1: We project a high dimensional data instance to a lower dimensional embedding. The goal of the embedder is to measure a semantically relevant distance between different instances. The embedding is fed to the obfuscator, which compresses similar inputs into a small volume. In traditional ARL, the obfuscated instance is shared with the untrusted service provider without any formal privacy guarantee. In this work, by analyzing the stability of the obfuscator network, we perturb the obfuscated instance to provide a formal privacy guarantee.]
Informally, this notion of neighboring databases under the DP definition would suggest that the outcomes for two individuals should be similar no matter how different their data are. This privacy definition can be too restrictive for our ML inference application, where the data instance necessarily needs a certain degree of distinguishability to obtain utility from the service provider. This observation is formalized in the impossibility result of instance encoding (Carlini et al. (2020)) for private learning. To resolve this fundamental conflict between the privacy definition and our application, we turn to dχ-privacy (Chatzikokolakis et al. (2013)), which generalizes the DP definition from the Hamming distance to a general distance metric as follows: P(M(x) ∈ S) ≤ e^{dχ(x,x′)} P(M(x′) ∈ S), where dχ(x, x′) is a function that gives the level of indistinguishability between two datasets x and x′. DP can be viewed as a special case of dχ-privacy by setting dχ(x, x′) = ϵ d_H(x, x′). Choosing a different distance metric yields a stronger or weaker privacy guarantee.
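As a concrete illustration of the definition above, a mechanism satisfying dχ-privacy with dχ(x, x′) = ϵ·d(x, x′) can be obtained by adding Laplace noise calibrated to the query's sensitivity with respect to the metric. The sketch below is ours (plain Python; the function names are not from any library) and only illustrates the calibration, not the paper's full pipeline:

```python
import math
import random

def laplace_sample(scale):
    """Draw one sample from a zero-mean Laplace distribution via inverse CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def dchi_laplace_mechanism(fx, sensitivity, eps):
    """Release f(x) with per-coordinate Laplace noise of scale sensitivity/eps.

    If d_Y(f(x), f(x')) <= sensitivity * d(x, x') for all x, x', this
    satisfies d_chi-privacy with d_chi(x, x') = eps * d(x, x').
    """
    scale = sensitivity / eps
    return [v + laplace_sample(scale) for v in fx]
```

Note that only the noise scale changes relative to standard DP: the Hamming-distance sensitivity is replaced by the query's Lipschitz-style sensitivity with respect to the chosen metric.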

3. PRIVACY DEFINITION

In order to formalize reconstruction privacy, we hypothesize that semantically similar points are close to each other on a data manifold, i.e., semantically similar samples are closer in a space where distances are defined as geodesics on the manifold. Therefore, one way to bound the reconstruction of x is by making it indistinguishable among semantically similar points. The extent of reconstruction privacy then depends upon the radius of this neighborhood. We formalize it by introducing a privacy parameter R that allows a user to control how big this indistinguishable neighborhood should be. This formulation leads to two additional requirements: i) a distance metric that models the low dimensional manifold space of data; ii) a privacy definition that incorporates the privacy parameter R as well as the distance metric. We propose to use embedding based manifold learning techniques (Brehmer & Cranmer (2020); Horvat & Pfister (2021)) for the first requirement because we do not have a closed form expression for the manifold chart of real world data. We refer to the distance metric as d^β_θ(x, x′), where the parameter θ is learned to model the data manifold and β is a standard norm such as ℓ1 or ℓ2. Intuitively, we want to compute distances in a space where semantically similar data points are closer and semantically different data points are farther apart. For high dimensional datasets that lie on a low dimensional manifold (such as images), traditional distance metrics like the ℓ1 and ℓ2 norms do not capture semantic similarity. This idea has been used in perceptual similarity for computer vision (Zhang et al. (2018)) as well as in manifold and metric learning techniques (Brehmer & Cranmer (2020); Horvat & Pfister (2021); Kha Vu (2021)). Hence, we instantiate dχ-privacy by setting dχ(x, x′) = ϵ d^β_θ(x, x′) such that P(M(x) ∈ S) ≤ e^{ϵ d^β_θ(x,x′)} P(M(x′) ∈ S) + δ.
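A minimal sketch of how such a learned metric is evaluated: given any embedder g_θ (below a placeholder callable standing in for the trained encoder; the helper name is ours), d^β_θ reduces to an ℓ_β norm computed in the embedding space rather than the ambient pixel space:

```python
import math

def semantic_distance(x, x_prime, embed, beta=2):
    """d_theta^beta(x, x') = || g_theta(x) - g_theta(x') ||_beta, where
    `embed` is the learned embedder g_theta (any callable returning a
    list of floats). beta may be 1, 2, or math.inf."""
    ex, ep = embed(x), embed(x_prime)
    diffs = [abs(a - b) for a, b in zip(ex, ep)]
    if beta == math.inf:
        return max(diffs)  # l_inf norm
    return sum(d ** beta for d in diffs) ** (1.0 / beta)
```

With the identity embedder this is just an ℓ_β norm; with a trained embedder, distances reflect semantic rather than pixel-level similarity.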
The privacy parameter ϵ describes the extent of indistinguishability and the parameter R describes the neighborhood in which we obtain this indistinguishability. We note that dχ-privacy, unlike DP, does not use the notion of a neighborhood (d^β_θ(x, x′) ≤ R) because the guarantee holds for any possible pair x, x′ ∈ χ and smoothly decays with distance. Finally, we slightly weaken the dχ-privacy instantiation by defining the neighborhood as d^β_θ(x, x′) ≤ R and keeping a fixed level of indistinguishability within it.
Definition 1. A mechanism M satisfies (ϵ, δ, R)-semantic neighborhood privacy if ∀x, x′ ∈ χ s.t. d^β_θ(x, x′) ≤ R and S ⊆ Range(M), P(M(x) ∈ S) ≤ e^ϵ P(M(x′) ∈ S) + δ.
Note that the above equation is exactly the same as (ϵ, δ)-DP except for the definition of neighboring databases. In this way, our privacy definition can be seen as a mix of standard DP and dχ-privacy. A key characteristic of Eq 2 is that points closer to a given x than R enjoy higher indistinguishability, while in Eq 3 all points in the neighborhood of x, as in DP, get the same level of indistinguishability.

3.1. COMPARISON WITH DIFFERENTIAL PRIVACY

Conceptually, the usage of the Hamming distance in DP for neighboring databases provides a level of protection such that the output does not change significantly regardless of the chosen sample. Such a privacy requirement can be at odds with the goal of prediction, which necessarily requires discrimination between samples belonging to different concept classes. Our privacy definition relaxes this dichotomy by using a distance metric in the embedding space and guaranteeing privacy only within a neighborhood. The size of the neighborhood is a privacy parameter R such that a higher value of R provides more privacy. This privacy parameter is analogous to the group size (Dwork et al. (2014)) sometimes used in the DP literature. By default, this value is kept at 1 in DP but can be kept higher if a group of individuals (family, community) has to be privatized instead of a single individual. There is an equivalence between the group privacy definition and the standard DP definition, which we state informally (Lemma 2.2 in Vadhan (2017)): any (ϵ, δ)-differentially private mechanism is (Rϵ, R e^{(R−1)ϵ} δ)-differentially private for groups of size R. This lemma also applies to our proposed definition. However, we emphasize that the privacy parameters of an (ϵ, δ)-DP mechanism cannot be compared trivially with those of an (ϵ, δ, R)-semantic neighborhood privacy mechanism, because the same values of ϵ and δ provide different levels of protection under different definitions of neighboring databases. We experimentally demonstrate this claim in Sec 5.
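The group-privacy conversion in the lemma is a one-line computation; the helper below is our own sketch of Lemma 2.2 of Vadhan (2017), useful for seeing how quickly the δ term blows up with group size R:

```python
import math

def group_privacy(eps, delta, R):
    """Lemma 2.2 (Vadhan, 2017): an (eps, delta)-DP mechanism is
    (R*eps, R*exp((R-1)*eps)*delta)-DP for groups of size R."""
    return R * eps, R * math.exp((R - 1) * eps) * delta
```

For example, a (0.5, 1e-5)-DP mechanism already degrades to roughly (1.5, 8.2e-5)-DP for groups of size 3, which is why ϵ values across different neighborhood definitions are not directly comparable.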

4. PRIVACY MECHANISM

Our goal is to design a framework that can provide a formal privacy guarantee for single data instance sharing that is informally privatized using ARL. However, ARL algorithms use non-linear neural networks trained on non-convex objectives, making it difficult to perform any worst-case analysis. Therefore, we take a posthoc approach where the formal privatization is performed after the model is trained. Specifically, we apply the propose-test-release (PTR) mechanism of Dwork & Lei (2009). Applying PTR directly to our query (ARL) is not computationally feasible because PTR requires estimating local sensitivity at multiple points, whereas evaluating the local sensitivity of a neural network query is not even feasible at a single point. Therefore, we design a tractable variant of PTR that utilizes a local Lipschitz constant estimator to compute privacy related parameters. We refer the reader to the Appendix for a detailed discussion of Lipschitz constant estimation and PTR. Conventionally, ARL algorithms have three computational blocks during the training stage: 1) an obfuscator (f(•)) that generates an (informally private) representation (z) of data, 2) a proxy adversary that reconstructs the data from the representation produced by the obfuscator, and 3) a classifier that performs the given task using the obfuscated representation. The classifier and proxy adversary are trained to minimize the task loss and reconstruction loss, respectively. The obfuscator is trained to minimize the task loss but maximize the reconstruction loss. This setup results in a min-max optimization where the trade-off between task performance and reconstruction is controlled by a hyper-parameter α; some techniques (Oh et al. (2016)) vary in the details of this setup. Our framework applies the mechanism M such that the final released data ẑ = M(x) has the privacy guarantee discussed in Eq 3. Like PTR, we start with a proposal (∆^p_LS) on the upper bound of the local sensitivity of x.
To test the validity of ∆^p_LS, we compute the size of the biggest possible neighborhood such that the local Lipschitz constant of the obfuscator network in the neighborhood is smaller than the proposed bound. Next, we privately verify the correctness of the proposed bound for the given data instance. We do not release the data (denoted by ⊥) if the proposed bound is invalid. Otherwise, we perturb the data using a Laplace distribution calibrated by the proposed bound. Next, we discuss the framework and privacy guarantees in more detail. The global sensitivity and the Lipschitz constant of a query f : χ → Y are essentially the same in the dχ-privacy framework. The global sensitivity of a query f(•) is the smallest value of ∆ (if it exists) such that ∀x, x′ ∈ χ, d_Y(f(x), f(x′)) ≤ ∆ dχ(x, x′). While global sensitivity is a measure over all possible pairs of data in the data domain χ, local sensitivity (∆_LS) is defined with respect to a given dataset x such that ∀x′ ∈ χ, d_Y(f(x), f(x′)) ≤ ∆_LS(x) dχ(x, x′). We integrate the notion of semantic similarity in a neighborhood (described in Sec 2) by defining the local sensitivity over a neighborhood N(x, R) around x of radius R, where N(x, R) = {x′ | dχ(x, x′) ≤ R, x′ ∈ χ}. Therefore, the local sensitivity of query f on x in the R-neighborhood is defined ∀x′ ∈ N(x, R) such that d_Y(f(x), f(x′)) ≤ ∆_LS(x, R) dχ(x, x′). (4) We note that if dχ is the Hamming distance and R is 1, then this formulation is exactly the same as local sensitivity in ϵ-DP (Dwork et al. (2014)). The equation above can be re-written as: ∆_LS(x, R) = sup_{x′ ∈ N(x,R)} d_Y(f(x), f(x′)) / dχ(x, x′). (5) This formulation of local sensitivity is similar to the definition of the local Lipschitz constant.
The local Lipschitz constant L of f for a given open neighborhood N ⊆ χ is defined as follows: L_{α,β}(f, N) = sup_{x′, x″ ∈ N, x′ ≠ x″} ||f(x′) − f(x″)||_α / ||x′ − x″||_β. We note that while the local sensitivity of x is described around the neighborhood of x, the Lipschitz constant is defined over every possible pair of points in a given neighborhood. Therefore, in Lemma 4.1 we show that the local Lipschitz constant in the neighborhood of x is an upper bound on the local sensitivity. Lemma 4.1. For a given f and for d_Y ← ℓ_α and dχ ← ℓ_β, ∆_LS(x) ≤ L(f, N(x, R)). Proof in Appendix B.1. Since local sensitivity is upper bounded by the Lipschitz constant, evaluating the Lipschitz constant suffices as an alternative to evaluating local sensitivity. Lower bound on testing the validity of ∆^p_LS: The PTR algorithm (Dwork & Lei (2009)) suggests a proposal for the upper bound (∆^p_LS) of local sensitivity and then finds the distance between the given dataset (x) and the closest dataset for which the proposed upper bound is not valid. Let γ(•) be this distance query and ∆_LS(x) be the local sensitivity defined as per the DP framework with respect to x such that γ(x) = min_{x′ ∈ χ} {d_H(x, x′) s.t. ∆_LS(x′) > ∆^p_LS}. (7) In our framework, the query γ(x, R) can be formulated in the semantic neighborhood as follows: γ(x, R) = min_{x′ ∈ χ} {dχ(x, x′) s.t. ∆_LS(x′, R) > ∆^p_LS}. We note that keeping dχ = d_H and R = 1 makes the γ query exactly the same as defined in Eq 7. In our setup, computing γ(•) is intractable due to the local sensitivity estimation required for every x′ ∈ χ (which depends upon a non-linear neural network). We emphasize that this step is intractable at two levels: first, we require estimating the local sensitivity of a neural network query; second, we require this local sensitivity over all samples in the data domain. Therefore, we make it tractable by computing a lower bound on γ(x, R) by designing a function ϕ(•) s.t. ϕ(x, R) ≤ γ(x, R).
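Although the framework requires the exact local Lipschitz constant (computed with LipMip in this paper), a simple Monte-Carlo search over sampled pairs yields a lower bound on L_{α,β}(f, N) that is useful as a sanity check. The function below is an illustrative sketch of ours, not part of the paper's pipeline, using the paper's ℓ∞ input / ℓ1 output norms:

```python
import random

def local_lipschitz_lower_bound(f, x, radius, n_pairs=2000, seed=0):
    """Monte-Carlo LOWER bound on the local Lipschitz constant of f over
    the l_inf ball of the given radius around x (l1 norm on outputs),
    obtained by maximizing the difference quotient over sampled pairs.
    The exact constant requires an exhaustive method such as LipMIP;
    this only bounds it from below."""
    rng = random.Random(seed)
    def sample():
        return [xi + rng.uniform(-radius, radius) for xi in x]
    best = 0.0
    for _ in range(n_pairs):
        a, b = sample(), sample()
        dist = max(abs(ai - bi) for ai, bi in zip(a, b))  # l_inf input distance
        if dist == 0.0:
            continue
        fa, fb = f(a), f(b)
        num = sum(abs(u - v) for u, v in zip(fa, fb))     # l1 output distance
        best = max(best, num / dist)
    return best
```

Because sampling only explores finitely many pairs, this bound can never certify the proposed sensitivity; it can only falsify it when the sampled quotient already exceeds ∆^p_LS.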
Intuitively, ϕ(•) finds the largest possible neighborhood around x such that the local Lipschitz constant of the neighborhood is smaller than the proposed local sensitivity. Because, in the worst case, the subset of points around x whose neighborhood does not violate ∆^p_LS covers half of the original neighborhood, we return half of the neighborhood size as the output. We describe its computation in Algorithm 1. More formally, ϕ(x, R) = (1/2) · arg max_{R′ ≥ R} {L(f, N(x, R′)) ≤ ∆^p_LS}. If there is no solution to the equation above, then we return 0. Lemma 4.2. ϕ(x, R) ≤ γ(x, R). Proof in Appendix B.2. Privately testing the lower bound: The next step in the PTR algorithm requires testing whether γ(x) ≤ ln(1/δ)/ϵ. If the condition is true, then the no-answer symbol (⊥) is released instead of the data. Since the γ query depends upon x, PTR privatizes it by applying the Laplace mechanism, i.e., γ̂(x) = γ(x) + Lap(1/ϵ). The query has a sensitivity of 1 since γ can differ by at most 1 for any two neighboring databases. In our framework, we compute ϕ(x, R) to lower bound the value of γ(x, R). Therefore, we need to privatize the ϕ query. For general distance metrics in dχ-privacy, the global sensitivity of the ϕ(x) query is 1. Lemma 4.3. The query ϕ(•) has a global sensitivity of 1, i.e., ∀x, x′ ∈ χ, d_abs(ϕ(x, R), ϕ(x′, R)) ≤ dχ(x, x′). Proof in Appendix B.3. After computing ϕ(x, R), we add noise sampled from a Laplace distribution, i.e., φ(x, R) = ϕ(x, R) + Lap(R/ϵ). Next, if φ(x, R) ≤ ln(1/δ) · R/ϵ, then we release ⊥; otherwise we release ẑ = f(g(x)) + Lap(∆^p_LS/ϵ). Next, we prove that the mechanism M1 described above satisfies semantic neighborhood privacy. Theorem 4.4. Mechanism M1 satisfies uniform (2ϵ, δ/2, R)-semantic neighborhood privacy (Eq 3), i.e., ∀x, x′ ∈ χ s.t. dχ(x, x′) ≤ R, P(M(x) ∈ S) ≤ e^{2ϵ} P(M(x′) ∈ S) + δ/2. Proof Sketch: Our proof is similar to the proof for the PTR framework (Dwork et al.
(2014)), except for the peculiarity introduced by our metric space formulation. First, we show that not releasing the answer (⊥) satisfies the privacy definition. Next, we divide the proof into two parts: when the proposed bound is incorrect (i.e., ∆_LS(x, R) > ∆^p_LS) and when it is correct. Let R̂ be the output of query φ. Then
P[φ(x, R) = R̂] / P[φ(x′, R) = R̂] = exp(−(|ϕ(x, R) − R̂|/R) · ϵ) / exp(−(|ϕ(x′, R) − R̂|/R) · ϵ) ≤ exp(|ϕ(x′, R) − ϕ(x, R)| · ϵ/R) ≤ exp(dχ(x, x′) · ϵ/R) ≤ exp(ϵ).
Therefore, using the post-processing property, P[M(x) = ⊥] ≤ e^ϵ P[M(x′) = ⊥]. Here, the first inequality is due to the triangle inequality, the second is due to Lemma 4.3, and the third follows from dχ(x, x′) ≤ R. Note that when ∆_LS(x, R) > ∆^p_LS, ϕ(x, R) = 0. Therefore, the probability for the test to release the answer in this case is P[M(x) ≠ ⊥] = P[ϕ(x, R) + Lap(R/ϵ) > log(1/δ) · R/ϵ] = P[Lap(R/ϵ) > log(1/δ) · R/ϵ]. Based on the CDF of the Laplace distribution, P[M(x) ≠ ⊥] = δ/2. Therefore, if ∆_LS(x, R) > ∆^p_LS, for any S ⊆ R^d ∪ {⊥} in the output space of M, P[M(x) ∈ S] = P[M(x) ∈ S ∩ {⊥}] + P[M(x) ∈ S ∩ R^d] ≤ e^ϵ P[M(x′) ∈ S ∩ {⊥}] + P[M(x) ≠ ⊥] ≤ e^ϵ P[M(x′) ∈ S] + δ/2. If ∆_LS(x, R) ≤ ∆^p_LS, then the mechanism is a composition of two semantic neighborhood private algorithms, where the first algorithm (ϕ(x, R)) is (ϵ, δ/2, R)-semantic neighborhood private and the second is (ϵ, 0, R)-private. Using composition, the overall algorithm is (2ϵ, δ/2, R)-semantic neighborhood private. We describe M1 step by step in Algorithm 1. To summarize, we designed a posthoc privacy framework that extends the PTR framework to make (ϵ, δ, R)-semantic neighborhood privacy tractable.
[Table 1: test-set accuracy at the extremes — MNIST: (ϵ = 0, 0.10), (ϵ = ∞, 0.93); FMNIST: (ϵ = 0, 0.10), (ϵ = ∞, 0.781); UTKFace: (ϵ = 0, 0.502), (ϵ = ∞, 0.732); per dataset, columns report the informal (no-guarantee) setting and ϵ = 1, 2, 5, 10.]
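The test-and-release step of M1 can be sketched in a few lines. The snippet below is our simplified illustration of Algorithm 1, assuming ϕ(x, R) and the obfuscated encoding z = f(g(x)) have already been computed; names such as posthoc_ptr are ours, not from the paper's codebase:

```python
import math
import random

BOTTOM = None  # stands for the no-answer symbol ⊥

def laplace(scale, rng):
    """Inverse-CDF Laplace sampler with the given scale."""
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def posthoc_ptr(z, phi_value, prop_ls, eps, delta, R, seed=0):
    """Sketch of mechanism M1: privately test the proposed sensitivity
    bound via the (pre-computed) phi statistic, then either refuse
    (return BOTTOM) or release z with Laplace noise calibrated to the
    proposed bound prop_ls."""
    rng = random.Random(seed)
    phi_noisy = phi_value + laplace(R / eps, rng)      # privatize the test statistic
    if phi_noisy <= math.log(1.0 / delta) * R / eps:   # bound not safely valid
        return BOTTOM
    scale = prop_ls / eps                              # noise calibrated to proposed bound
    return [v + laplace(scale, rng) for v in z]
```

Note that ⊥ is itself a differentially private outcome here: the decision to refuse depends on x only through the noised ϕ statistic, whose sensitivity is bounded by Lemma 4.3.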
The exact local Lipschitz constant of the neural network based obfuscator is estimated using the mixed-integer programming based optimization developed by Jordan & Dimakis (2020). Computational feasibility: Our key idea is to add extra computation on the client side to formally reason about the privacy of shared data. This extra computational cost is due to the estimation of the local Lipschitz constant of the obfuscator network. However, three key factors of our framework make it practically feasible:
1. We compute the local Lipschitz constant (i.e., in a small neighborhood around a given point): our extension of the propose-test-release framework only requires us to operate in a small local neighborhood instead of estimating the global Lipschitz constant, which would be much more computationally expensive.
2. Low number of parameters for the obfuscator: instead of estimating the Lipschitz constant of the whole prediction model, we only require estimation for the obfuscator, a neural network with significantly fewer parameters than the prediction model.
3. Lower dimension of the input embedding: since we measure distance in the embedding space, the number of dimensions over which the local Lipschitz constant is estimated is significantly lower than the ambient data dimension.
We performed an ablation study on all three aspects mentioned above in Sec 6. The fact that the local Lipschitz constant is computed over the same obfuscator leaves room for optimizing performance by caching. Since our goal is to demonstrate the feasibility of bridging formal privacy guarantees and ARL-based mechanisms, we did not explore such performance speedups.

5. EXPERIMENTS

Experimental Setup: We evaluate different aspects of our proposed framework: i) E1: comparison between different adversarial approaches, ii) E2: comparison with local differential privacy (LDP), iii) E3: computational tractability of our proposed framework, and iv) E4: investigating the role of ARL in achieving privacy. We use the MNIST (LeCun (1998)), FMNIST (Xiao et al. (2017)) and UTKFace (Zhang et al. (2017)) datasets for all experiments. All of them contain samples with extremely high ambient dimensions (MNIST: 784, FMNIST: 784 and UTKFace: 4096). We use a deep CNN based β-VAE (Higgins et al. (2016)) for the embedder. We use LipMip (Jordan & Dimakis (2020)) for computing the Lipschitz constant with the ℓ∞ norm in the input space and the ℓ1 norm in the output space. As a baseline, we use a simple encoder based approach where the data is projected to smaller dimensions using a neural network; this encoder type approach has been used in the literature as Split Learning (Gupta & Raskar (2018)). For ARL, we use the proxy-adversary based min-max optimization used by several ARL techniques (Xiao et al. (2020); Singh et al. (2021); Liu et al. (2019); Li et al. (2021)) and adversarial contrastive learning (Osia et al. (2020)), which we denote as C. We use noisy regularization (denoted by N) to improve classifier performance. We refer the reader to Sec E for the detailed experimental setup, codebase and hyper-parameters. E1: Privacy-Utility Trade-off: Since our framework enables comparison between different obfuscation techniques under the same privacy budget, we evaluate test set accuracy on three image datasets in Table 1. Our results indicate that ARL complemented with contrastive and noise regularization attains the overall best performance among all possible combinations. We note that the SoTA performance on all three datasets is higher than in our experimental setup because of the usage of the embedder, which can be further improved to yield higher accuracy.
E2: Comparison between ARL and LDP: While the ϵ-LDP definition provides a different and stronger privacy guarantee than our proposed privacy definition, for the sake of completeness we compare the performance of ARL and LDP and report results in Table 2 in the Appendix. Results indicate that for low values of ϵ, LDP techniques do not yield any practical utility. This observation corroborates the impossibility result of instance encoding (Carlini et al. (2020)) and our discussion in Sec 2 about the applicability of traditional DP in the context of ARL. E3: Computational feasibility: Our framework relies upon the exact computation of the Lipschitz constant of ReLU networks (the obfuscator in our case), which has been shown to be an NP-hard problem (Jordan & Dimakis (2020)). Our end-to-end runtime evaluation on a CPU based client results in a runtime of 2 sec/image (MNIST) and 3.5 sec/image (UTKFace). While plenty of room exists for optimizing this runtime, we believe the current numbers serve as a reasonable starting point for providing formal privacy in ARL. As discussed in Sec 4, we compare the computation time of the obfuscator across three factors relevant to our setup: i) dimensionality of the input, ii) size of the neighborhood, and iii) number of layers in the obfuscator. Figure 4 shows the performance evaluation. While the running time grows exponentially with input size, we emphasize that the obfuscator network requires only a small number of dimensions because its input resides in the embedding space. The results demonstrate that the framework is not only computationally tractable but can also be executed at real-time speed for our inference use-case. E4: What role does ARL play in achieving privacy? In this experiment, we assess the contribution of adversarial training to improving the privacy-utility trade-off. We train obfuscator models with different values of α (weighing factor) for adversarial training.
Our results in Fig 2 indicate that stronger weighting of the adversarial regularization reduces the local Lipschitz constant, hence reducing the local sensitivity of the neural network. Furthermore, for high values of α, the change in the local Lipschitz constant across different sizes (R) of the neighborhood shrinks significantly. These two observations suggest that ARL improves reconstruction privacy by reducing the sensitivity of the obfuscator. However, as we observe in Table 1, the classifier's utility can degrade if ARL is not complemented with noisy and contrastive regularization. We believe this finding could be of independent interest to the adversarial defense community, where the goal is to reduce the misclassification of neural networks.

6. DISCUSSION

How to select the privacy parameter R? One of the key differences between (ϵ, δ, R)-neighborhood privacy and (ϵ, δ)-DP is the additional parameter R. The choice of R depends upon the neighborhood in which a user wishes to get an ϵ level of indistinguishability. We perform reconstruction attacks on privatized encodings obtained from our framework by training an ML model to reconstruct the original images. We compare reconstruction results for different values of ϵ and R on four distinct metrics in Table 4, and visualize neighborhoods of different R. We observe that as the boundary of the neighborhood increases, the images become perceptually different from the original image. For extremely large radii, the images change significantly enough that their corresponding label may change too. Such visualization can be used to semantically understand different values of R. How to propose ∆^p_LS? Our framework requires a proposal for the upper bound of local sensitivity in a private manner. One possible way to obtain ∆^p_LS is by using the Lipschitz constants of the data samples used in training the obfuscator. To incorporate this notion of an average, we choose ∆^p_LS by first computing the mean (µ) and standard deviation (σ) of the local sensitivity on the training dataset (assumed to be known under our threat model); we then keep ∆^p_LS = µ + n·σ, where n allows a trade-off between the likelihood of releasing the samples under PTR and the amount of extra noise added to the data. We used n = 3 in our experiments. Since the empirical distribution of local sensitivity appears approximately Gaussian, using a confidence interval serves as a good proxy. Fig 2 shows that for higher values of α, the variability in the local Lipschitz constant decreases, indicating that the validity of the bound would hold for a large number of samples. We emphasize that privacy parameters should be chosen independently of the private data, otherwise the guarantees do not hold.
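The proposal rule above is a two-moment computation; a minimal sketch (the helper name is ours), assuming the per-sample local sensitivities of the training set have already been measured:

```python
import math

def propose_sensitivity_bound(train_sensitivities, n=3):
    """Choose the proposed local-sensitivity bound as mean + n*std of
    the local sensitivities measured on the training set (public under
    the paper's threat model); the paper uses n = 3."""
    m = len(train_sensitivities)
    mu = sum(train_sensitivities) / m
    var = sum((s - mu) ** 2 for s in train_sensitivities) / m
    return mu + n * math.sqrt(var)
```

A larger n makes the PTR test pass more often (fewer ⊥ outputs) but inflates the Laplace noise scale ∆^p_LS/ϵ added to every released encoding.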
Limitations: i) The distance metric (d β θ (x, x ′ )) is currently learned from data and could yield irrelevant privacy guarantees if semantically similar points end up far apart in the embedding space. We believe this limitation could be addressed by better understanding the convergence of these learned distance metrics. Even so, learned distance metrics may be preferable to not assessing privacy formally at all, or to using distance metrics such as the ℓ1 or ℓ2 norm in the ambient dimension of the data. ii) Since we utilize the PTR framework, outlier samples may not get released due to their high sensitivity; this is expected, since such outliers are likely to be misclassified anyway. iii) Lipschitz constant computation is limited to ReLU networks; therefore, more sophisticated obfuscator architectures are currently not compatible with our proposed framework.

7. CONCLUSION

ML based approaches to private inference have been on the rise in the past few years owing to the powerful representational capacity of neural networks, especially for complex real-world datasets. Their main drawback is the lack of formal privacy guarantees. Our work takes the first steps towards a formal privacy guarantee for a broad class of existing empirical techniques. We believe that our framework will foster more research in ARL techniques by improving privacy-utility evaluation and bring them closer to real-world adoption.

A RELATED WORK

ARL techniques aim to learn a task-oriented, privacy-preserving encoding of data. The majority of works in this area protect against either sensitive attribute leakage (Hamm (2017); Roy & Boddeti (2019); Bertran et al. (2019); Li et al. (2018)) or input reconstruction (Samragh et al. (2021); Singh et al. (2021); Mireshghallah et al. (2021); Li et al. (2021); Liu et al. (2019)). These techniques usually evaluate their privacy using empirical attacks, since the mechanism is learned via gradient-based min-max optimization, making worst-case privacy analysis infeasible. The goal of our work is to make them amenable to formal privacy analysis. While theoretical analyses of ARL objectives (Zhao et al. (2020a;b); Sadeghi & Boddeti (2021)) have identified fundamental trade-offs between utility and attribute leakage, they are difficult to formalize as a worst-case privacy guarantee, especially for deep neural networks. Privacy definitions that extend DP to address some of its limitations (Kifer & Machanavajjhala (2011)) include dχ-privacy (Chatzikokolakis et al. (2013)) and Pufferfish (Kifer & Machanavajjhala (2014)). Our privacy definition is a specific instantiation of the dχ-privacy framework (Chatzikokolakis et al. (2013)), which extends DP to general metric spaces. Our instantiation focuses on reconstruction privacy for individual samples rather than membership inference attacks (Dwork et al. (2017)). Existing works in DP for reconstruction attacks (Bhowmick et al. (2018); Stock et al. (2022)) focus on the privacy of training data. Lipschitz constant estimation for neural networks has been used to guarantee a network's stability to perturbations. Existing works either provide an upper bound (Weng et al. (2018); Latorre et al. (2020); Fazlyab et al. (2019)), compute the exact Lipschitz constant (Jordan & Dimakis (2020; 2021)), or regularize the Lipschitz constant during training (Scaman & Virmaux (2018); Huang et al. (2021)).
Some existing works have explored the relationship between adversarial robustness and DP model training (Phan et al. (2020); Pinot et al. (2019); Tursynbek et al. (2020)). We utilize similar ideas of perturbation stability, but for privacy. Shavit & Gjura (2019) use Lipschitz neural networks (Gouk et al. (2018)) to learn a private mechanism design for summary statistics such as the median; however, their mechanism design lacks a privacy guarantee. Posthoc approaches to privacy apply a privacy-preserving mechanism in a data-dependent manner. Smooth sensitivity (Nissim et al. (2007)) and PTR (Dwork & Lei (2009)) reduce the noise magnitude, since local sensitivity equals global sensitivity only in the worst case. Privacy odometers (Rogers et al. (2016)), ex-post privacy loss (Ligett et al. (2017)) and the Rényi privacy filter (Feldman & Zrnic (2021)) track privacy loss as the query is applied over data. Our work builds upon the PTR framework in order to give high privacy for less sensitive data. However, as we show in Sec 4, our framework reformulates the PTR algorithm to make it tractable under our setup.

B PROOFS

Lemma B.1. For a given f and for d Y ← ℓα and dχ ← ℓβ, ∆ LS (x, R) ≤ L(f, N (x, R)).

Proof. Local sensitivity (∆ LS ) for a sample x in a radius R for a query f is defined as:

∆ LS (x, R) = sup_{x′ ∈ N (x, R)} d Y (f (x), f (x′)) / dχ(x, x′)

The local Lipschitz constant (L) for a function f and a neighborhood N is defined as:

L α,β (f, N ) = sup_{x′, x′′ ∈ N, x′ ≠ x′′} ||f (x′) − f (x′′)||α / ||x′ − x′′||β

If L is defined over the neighborhood N (x, R), then the set over which local sensitivity is computed is a subset of the set over which the local Lipschitz constant is estimated. Intuitively, the local Lipschitz condition ranges over all pairs of samples in the neighborhood, while local sensitivity ranges only over pairs involving the given sample. Since both quantities take a supremum over their respective sets, ∆ LS (x, R) ≤ L(f, N (x, R)).

Lemma B.2. Algorithm ϕ gives a lower bound on the query γ. That is, ϕ(x, R) ≤ γ(x, R).

Proof. The γ query is defined as

γ(x, R) = min_{x′ ∈ χ} { dχ(x, x′) s.t. ∆ LS (x′, R) > ∆ p LS }.

The ϕ query is defined as

ϕ(x, R) = (1/2) · arg max_{R′ ≥ R} { L(f, N (x, R′)) ≤ ∆ p LS }

For any given sample x and privacy parameters (R, ∆ p LS) such that s = ϕ(x, R), we know that for all x′ ∈ N (x, s),

N (x′, s) ⊂ N (x, 2s) =⇒ L(f, N (x′, s)) ≤ L(f, N (x, 2s))

Based on eq 10, we know that L(f, N (x, 2s)) ≤ ∆ p LS and hence, for all x′ ∈ N (x, s), L(f, N (x′, s)) ≤ ∆ p LS. Therefore, ∆ LS (x′, R) ≤ ∆ p LS and hence s ≤ γ(x). For the cases where no feasible solution exists, ϕ returns 0, which is exactly the answer returned by the γ query. This completes the proof.

Lemma B.3. The query ϕ(•) has a global sensitivity of 1, i.e., d abs (ϕ(x, R), ϕ(x′, R)) ≤ dχ(x, x′).

Proof. We will prove the above claim through a contradiction. We will show that, for a fixed radius R and any arbitrary point x ∈ χ, the neighborhood spanned by ϕ(x, R) cannot be a proper superset of the neighborhood spanned by ϕ(x′, R) for any other point x′.
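Lemma B.1's subset argument can be checked numerically. The sketch below uses a small random ReLU network and Monte-Carlo sampling over an ℓ2 ball; this is an illustrative approximation of both quantities, not the exact computation used in the paper. Because the pairs anchored at x itself form a subset of all pairs in the neighborhood, the sampled local sensitivity can never exceed the sampled local Lipschitz constant:

```python
import numpy as np

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(16, 4)), rng.normal(size=(2, 16))

def f(x):
    # Toy ReLU network R^4 -> R^2 standing in for the obfuscator.
    return W2 @ np.maximum(W1 @ x, 0.0)

x0, R = rng.normal(size=4), 0.5
# Random points in the l2 ball N(x0, R), with x0 itself included.
pts = [x0] + [x0 + (v / np.linalg.norm(v)) * R * rng.uniform()
              for v in rng.normal(size=(200, 4))]

def ratio(a, b):
    return np.linalg.norm(f(a) - f(b)) / np.linalg.norm(a - b)

# Local sensitivity: supremum over pairs anchored at x0.
delta_ls = max(ratio(x0, p) for p in pts[1:])
# Local Lipschitz constant: supremum over all pairs (a superset).
lip = max(ratio(p, q) for i, p in enumerate(pts) for q in pts[i + 1:])
```

Since the pairwise supremum includes every pair through x0, delta_ls ≤ lip holds by construction, mirroring the proof of the lemma.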
More formally, we will prove that ∀x, x′ ∈ χ, N (x, ϕ(x, R)) ⊄ N (x′, ϕ(x′, R)) and N (x′, ϕ(x′, R)) ⊄ N (x, ϕ(x, R)). Once proven, this argument allows us to bound the distance between x and x′ in terms of ϕ(x, R) and ϕ(x′, R).

Since the function ϕ(x, R) returns the maximum possible value such that

L(f, N (x, ϕ(x, R))) ≤ ∆ p LS,

for any ζ > 0 we have

L(f, N (x, ϕ(x, R) + ζ)) > ∆ p LS (12)

For contradiction, we assume that ∃x, x′ ∈ χ s.t. N (x, ϕ(x, R)) ⊂ N (x′, ϕ(x′, R)). Then

∃ η > 0 s.t. N (x, ϕ(x, R) + η) ⊆ N (x′, ϕ(x′, R)) (13)
=⇒ L(f, N (x, ϕ(x, R) + η)) ≤ ∆ p LS (14)

This yields a contradiction between eq 12 and eq 14. Therefore, ∀x, x′ ∈ χ, ϕ(x, R) ≤ ϕ(x′, R) + dχ(x, x′). By a symmetry argument, we can show that d abs (ϕ(x, R), ϕ(x′, R)) ≤ dχ(x, x′). This completes the proof.

C PROPOSE-TEST-RELEASE

DP mechanisms typically add noise based on the global sensitivity of a query. However, for many queries over various data distributions, the average local sensitivity can be much lower than the global sensitivity. Local sensitivity, however, is data dependent, so the amount of noise introduced by a mechanism based on local sensitivity can itself reveal private information. Therefore, to add noise based on local sensitivity in a privacy-preserving manner, Dwork & Lei (2009) introduced PTR. Conceptually, the idea behind PTR is to propose an arbitrary upper bound on the true value of the local sensitivity. This upper bound should be obtained privately, otherwise the choice of upper bound itself can reveal private information. To check whether the proposed bound is correct, the mechanism performs a privacy-preserving test of the upper bound. The test itself is a randomized algorithm due to privacy requirements; therefore, it can have false positives and false negatives. If the test fails, the mechanism returns ⊥ (no answer). Otherwise, a standard DP mechanism (e.g., Laplace) is applied to the query based on the proposed sensitivity (and not the true local sensitivity). More formally, the algorithm proceeds in the following steps:

1. A proposal for the upper bound of a query q over data x is given as input; call it ∆ p LS.
2. The algorithm finds the closest point x′ to x such that ∆ LS (x′) > ∆ p LS. Here ∆ LS refers to the local sensitivity of the query q.
3. Let γ = d H (x, x′) and γ̃ = γ + Lap(1/ϵ).
4. If γ̃ ≤ ln(1/δ)/ϵ, return ⊥.
5. Else, release x + Lap(∆ p LS /ϵ).

Computational cost: Depending on the query, this algorithm can incur a significant computational cost. In particular, in Step 2, finding the closest x′ can be impractical if the data space is high dimensional. Moreover, for queries such as neural networks, evaluating the local sensitivity itself is impractical, since it requires an exact solution to a non-convex optimization problem.
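The five steps above can be sketched as follows. This is an illustrative implementation in which a gamma argument stands in for Step 2 (the caller supplies the distance from x to the nearest point whose local sensitivity exceeds the proposal), since that search is the expensive part; all names are ours:

```python
import numpy as np

def ptr_release(x, gamma, delta_p_ls, eps, delta, rng):
    """Propose-Test-Release sketch (Dwork & Lei, 2009).

    gamma: distance d_H(x, x') to the nearest x' whose local
    sensitivity exceeds the proposed bound delta_p_ls (Step 2,
    assumed computed by the caller)."""
    # Steps 3-4: noisy distance must clear the ln(1/delta)/eps threshold.
    gamma_noisy = gamma + rng.laplace(scale=1.0 / eps)
    if gamma_noisy <= np.log(1.0 / delta) / eps:
        return None  # Step 4: return ⊥ (no answer)
    # Step 5: Laplace noise scaled to the *proposed* sensitivity.
    return x + rng.laplace(scale=delta_p_ls / eps, size=np.shape(x))

rng = np.random.default_rng(0)
# x is far from any high-sensitivity point, so the test passes with
# overwhelming probability and a noisy answer is released.
out = ptr_release(np.zeros(3), gamma=100.0, delta_p_ls=0.5,
                  eps=1.0, delta=1e-5, rng=rng)
```

Note that the released value is perturbed with scale ∆ p LS /ϵ regardless of the true local sensitivity at x; only the test depends on the data.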
Therefore, our framework relies on computing the local Lipschitz constant over a small neighborhood instead of the local sensitivity over the complete data space. The key idea of LipMip (Jordan & Dimakis (2020)) is to estimate the supremal norm of the Jacobian of the neural network. Since ReLU networks are not everywhere differentiable, LipMip uses the Clarke Jacobian to circumvent this issue and encodes the optimization objective of obtaining the local Lipschitz constant over a pre-defined neighborhood as a mixed-integer programming problem. The neighborhood is specified as a hypercube with the same dimension as the points in the neighborhood. Their algorithm searches for feasible regions and minimizes the gap between the lower and upper bounds on the Lipschitz constant.
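For intuition about the quantity LipMip computes, a cheap sampling-based lower bound can be obtained by taking the supremum of Jacobian norms over random points in the neighborhood; LipMip instead solves this supremum exactly via mixed-integer programming. The sketch below is an illustrative toy for a one-hidden-layer ReLU network, not LipMip's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 4)), rng.normal(size=(1, 16))

def jacobian_norm(x):
    # For f(x) = W2 relu(W1 x), the Jacobian at a differentiable
    # point is W2 diag(1[W1 x > 0]) W1; we take its spectral norm.
    mask = (W1 @ x > 0).astype(float)
    return np.linalg.norm(W2 @ (W1 * mask[:, None]), 2)

def lipschitz_lower_bound(x0, R, n_samples=1000):
    """Monte-Carlo lower bound on L(f, N(x0, R)) over an l2 ball."""
    best = jacobian_norm(x0)
    for _ in range(n_samples):
        v = rng.normal(size=x0.shape)
        x = x0 + (v / np.linalg.norm(v)) * R * rng.uniform()
        best = max(best, jacobian_norm(x))
    return best

x0 = rng.normal(size=4)
lb = lipschitz_lower_bound(x0, R=0.5)
```

Such a sampled value only lower-bounds the true local Lipschitz constant; using it in place of an exact or upper-bounding computation would not yield a valid privacy guarantee, which is why the framework relies on LipMip's exact answer.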

E EXPERIMENTAL DETAILS

Our experimental setup operates in three stages: i) embedder training, ii) obfuscator training, and iii) private inference. Our codebase is available here for reproducibility: https://drive.google.com/drive/folders/1DpHhS9u-Mpp3TVmTYiue7BKKUshyKw2w?usp=sharing. We will release the code and all trained models publicly after the reviews. For all our experiments we use PyTorch (Paszke et al. (2019)) with an Nvidia GeForce GTX TITAN GPU. We use a β-VAE with β = 5 for the design of the embedder.

1. Embedder Training:

We use an embedding dimension of 8 for the MNIST and FMNIST datasets. For the UTKFace dataset, we use an embedding dimension of 10. We use the Adam optimizer (Kingma & Ba (2014)) with a constant learning rate of 0.001. The VAE architecture for MNIST and



); Guo et al. (2021); Singh et al. (2021)) show that a proxy adversary's performance as a measure of protection could be unreliable. Some of the existing ARL techniques have used theoretical tools (Hamm (2017); Zhao et al. (2020b); Basciftci et al. (2016); Zhao et al. (2020a); Wang et al. (2017); Bertran et al. (2019); Mireshghallah et al. (

Figure 1: Posthoc privacy framework: We project a high-dimensional data instance to a lower-dimensional embedding. The goal of the embedder is to measure a semantically relevant distance between different instances. The embedding is fed to the obfuscator, which compresses similar inputs into a small volume. In traditional ARL, the obfuscated instance is shared with the untrusted service provider without any formal privacy guarantee. In this work, by analyzing the stability of the obfuscator network, we perturb the obfuscated instance to provide a formal privacy guarantee.

); Osia et al. (2020); Vepakomma et al. (2021)) do not require a proxy adversary but still learn an obfuscator model using other regularizers. We propose to use an embedder (g(θ, •)) to learn semantic similarity using a VAE (Kingma & Welling (2013)). The key idea of the embedder is to map the original sample (x) to a lower-dimensional space (z = g(x)) such that the distance metric in the z space captures semantic similarity, as shown in Fig 1 and motivated in Sec 3. Since z can be (almost) losslessly inverted to x, it is fed to the obfuscator to obtain the obfuscated encoding f (z).

To assess the level of indistinguishability, we look at Fig 3, where we project the original images into the embedding space and sample points from the boundary of

Figure 2: Local sensitivity comparison for different values of α: The five bars for each α represent different neighborhood radii. Increasing the value of α decreases the local Lipschitz constant (an upper bound on local sensitivity), indicating that less noise needs to be added for the same level of privacy.

Extended PTR algorithm for (ϵ, δ, R)-semantic neighborhood privacy

Data: x ∈ χ
Inputs: ϵ ∈ R+, δ ∈ R+, R ∈ R+, ∆ p LS ∈ R+
Init: ζ ∈ R+ ; /* for numerical stability, typically very small */
Init: R min = R, R max ∈ R+
while R max > R min + ζ do
    R mid = (R min + R max)/2;
    r = L(f, N (x, R mid)) ; /* compute local Lipschitz constant */
    if r < ∆ p LS then
        R min = R mid;
    else
        R max = R mid;
    end
end
r = R min / 2;
R̃ ← r + Lap(1/ϵ);
if R̃ < ln(1/δ)/ϵ then
    return ⊥;
else
    return f (z) + Lap(∆ p LS /ϵ);
end

D LIPSCHITZ CONSTANT ESTIMATION

We use the mixed-integer programming based algorithm LipMip by Jordan & Dimakis (2020) for computing the local Lipschitz constant. Their technique exactly computes the local Lipschitz constant of a neural network with ReLU non-linearities.
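The binary-search reformulation above can be sketched in Python. Here the lip callable stands in for the local Lipschitz oracle L(f, N(x, r)) (e.g. LipMip), and all names are illustrative, not the paper's implementation:

```python
import numpy as np

def extended_ptr(z_obf, lip, x, eps, delta, R, delta_p_ls,
                 R_max=100.0, zeta=1e-3, rng=None):
    """Extended PTR for (eps, delta, R)-neighborhood privacy (sketch).

    lip(x, r): local Lipschitz constant L(f, N(x, r)), assumed given.
    z_obf: the obfuscated encoding f(z) to be released."""
    rng = rng or np.random.default_rng()
    r_min, r_max = R, R_max
    # Binary search for the largest radius where the proposed bound holds.
    while r_max > r_min + zeta:
        r_mid = (r_min + r_max) / 2.0
        if lip(x, r_mid) < delta_p_ls:
            r_min = r_mid  # bound still holds; try a larger ball
        else:
            r_max = r_mid
    r_noisy = r_min / 2.0 + rng.laplace(scale=1.0 / eps)
    if r_noisy < np.log(1.0 / delta) / eps:
        return None  # ⊥: the test failed
    return z_obf + rng.laplace(scale=delta_p_ls / eps, size=np.shape(z_obf))

# Toy oracle: Lipschitz constant grows linearly with the radius, so the
# search settles near r = delta_p_ls / 0.01 = 50 and the test passes.
rng = np.random.default_rng(0)
out = extended_ptr(np.zeros(2), lip=lambda x, r: 0.01 * r, x=None,
                   eps=1.0, delta=1e-5, R=0.1, delta_p_ls=0.5, rng=rng)
```

Each iteration of the loop needs one Lipschitz-constant evaluation, so the cost is logarithmic in (R_max − R)/ζ oracle calls rather than a search over the whole data space as in standard PTR.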

Table 1: Performance comparison for different baselines: Our posthoc framework enables comparison between different obfuscation techniques by fixing the privacy budget (ϵ). The first four rows are different approaches to protect against data reconstruction, and the remaining rows are combinations of these approaches. The top row refers to the accuracy corresponding to different datasets under two extremes of epsilon. ARL refers to the widely used adversarial representation learning approach that regularizes the representation based on a proxy attacker (Li et al. (2021); Liu et al. (2019); Xiao et al. (2020); Singh et al. (2021)). Contrastive refers to the contrastive learning based informal privatization mechanism introduced in Osia et al.

