

Abstract

Cloud-based machine learning inference is an emerging paradigm in which users share their data with a service provider. Due to increased concerns over data privacy, recent works have proposed using Adversarial Representation Learning (ARL) to learn a privacy-preserving encoding of sensitive user data before it is shared with an untrusted service provider. Traditionally, the privacy of these encodings is evaluated empirically, as they lack formal guarantees. In this work, we develop a new framework that provides formal privacy guarantees for an arbitrarily trained neural network by linking its local Lipschitz constant with its local sensitivity. To utilize local sensitivity for guaranteeing privacy, we extend the Propose-Test-Release (PTR) framework to make it tractable for neural network based queries. We verify the efficacy of our framework experimentally on real-world datasets and elucidate the role of ARL in improving the privacy-utility tradeoff.

1. INTRODUCTION

The ethical and regulatory concerns around data privacy have become increasingly important with the adoption of machine learning (ML) across various sectors such as health, finance, and mobility. Although training ML models privately has seen tremendous progress (Abadi et al. (2016); Papernot et al. (2016); Du et al. (2021); Jordon et al. (2018)) in the last few years, protecting privacy during the inference phase remains a challenge as these models get deployed by cloud-based service providers. Cryptographic techniques (Ohrimenko et al. (2016); Knott et al. (2021); Mishra et al. (2020); Juvekar et al. (2018)) address this challenge by performing computation over encrypted data. However, to combat the high computational cost of encryption techniques, alternative works have used ARL to suppress task-irrelevant information from data. While ARL-based techniques have shown promising empirical results, they lack formal privacy guarantees over obfuscated representations due to their use of Deep Neural Networks (DNNs) for achieving privacy. For the first time, we show how to give formal privacy guarantees for inference queries over arbitrarily trained (including ARL) DNNs. The key component of any ARL algorithm is an obfuscator, which is trained to encode a user's private data such that an attacker cannot recover the original data from its encoding. Achieving formal privacy guarantees for an obfuscator has remained elusive due to the non-convexity of the DNN training objective. In this work, we take a posthoc approach to guaranteeing privacy, where the privacy of data is evaluated after the obfuscator is learned. Because the obfuscator is trained for non-invertibility, we hypothesize that it should act as a contractive mapping and hence increase the stability of the function in its local neighborhood, i.e., reduce its sensitivity. We therefore measure the stability of an adversarially learned obfuscator network using Lipschitz constants and link it with privacy properties. To exactly compute the local Lipschitz constant of a non-linear (ReLU) DNN, we use LipMIP (Jordan & Dimakis (2020)), a mixed-integer programming based technique, and reformulate the ARL pipeline to keep this computation feasible.

To draw a connection between the local Lipschitz constant and reconstruction privacy, we introduce a privacy definition that is a specific instantiation of the general dχ-privacy framework of Chatzikokolakis et al. (2013). Instead of evaluating the global Lipschitz constant of a DNN, we evaluate the Lipschitz constant only in the local neighborhood of the user's sensitive data. We extend the Propose-Test-Release (PTR) framework (Dwork & Lei (2009)) to formalize this local-neighborhood measurement of the Lipschitz constant. The scope of our paper is to provide privacy guarantees against reconstruction attacks for existing ARL techniques; i.e., our goal is not to develop a new ARL technique but rather a formal privacy framework compatible with existing ARL techniques. A majority of ARL techniques protect either a sensitive attribute or against reconstruction of the input; we only consider sensitive inputs in this work. We adopt a different threat model from that of traditional differential privacy (DP) (Dwork et al. (2014)) because, as we explain later, protecting against membership inference is at odds with private inference. Our threat model for the reconstruction attack is motivated by use cases where a user may be willing to disclose coarse-grained information about their data but wants to prevent leakage of fine-grained information. Such alternate threat models have been widely used in the privacy literature (Chatzikokolakis et al. (2013); Kifer & Machanavajjhala (2014); Andrés et al. (2013); Hannun et al. (2021)). Furthermore, we only focus on protecting the privacy of data during the inference stage and assume that ML models can be trained privately. Typically, ARL techniques evaluate the privacy of their representations by empirically measuring information leakage using a proxy adversary. Existing works (Srivastava et al. (2019); Guo et al. (2021); Singh et al. (2021)) show that a proxy adversary's performance as a measure of protection can be unreliable. Some existing ARL techniques have used theoretical tools (Hamm (2017); Zhao et al. (2020b); Basciftci et al. (2016); Zhao et al. (2020a); Wang et al. (2017); Bertran et al. (2019); Mireshghallah et al. (2021)) for measuring information leakage. However, most of these works analyze specific obfuscation techniques and lack formal privacy definitions.

In contrast, our work is agnostic to the design of the obfuscator as long as it is differentiable, and our definition is built upon a variant of DP, a widely used formal privacy framework. Our privacy definition and mechanism build upon dχ-privacy (Chatzikokolakis et al. (2013)) and PTR (Dwork & Lei (2009)). Existing instantiations of dχ-privacy include geo-indistinguishability (Andrés et al. (2013)) and location-dependent privacy (Koufogiannis & Pappas (2016)), which share our goal of disclosing only coarse-grained information. Our work differs in its use of neural network queries and high-dimensional data modalities. We refer the reader to Appendix Sec A for a detailed literature review. In Sec 2 we begin with the preliminaries of DP and its variant for metric spaces. Then, we motivate our ML inference setup and introduce our privacy definition in Sec 3. Next, we construct our posthoc framework by extending PTR and proving its privacy guarantees in Sec 4. In Sec 5 we experimentally demonstrate the feasibility of our framework and study the dynamics of ARL algorithms. Our contributions can be summarized as follows:

• We introduce the (ϵ, δ, R)-neighborhood privacy definition to formalize reconstruction privacy for ARL-based inference.

• We extend the PTR framework to make it tractable for neural network based queries. Our extension bridges the gap between formal privacy frameworks and empirical techniques in private ML inference.

• We perform extensive experimental analysis of ARL techniques and provide insight into how ARL improves the privacy-utility tradeoff by reducing the local sensitivity of DNNs.

2. PRELIMINARIES

Differential Privacy (DP) (Dwork et al. (2014)) is a widely used framework for answering a query f on a dataset x ∈ χ by applying a mechanism M(·) such that the probability distribution of the output M(x) is similar regardless of the presence or absence of any individual in the dataset x. More formally, M satisfies (ϵ, δ)-DP if for all x, x′ ∈ χ such that d_H(x, x′) ≤ 1, and for all (measurable) outputs S over the range of M,

P(M(x) ∈ S) ≤ e^ϵ P(M(x′) ∈ S) + δ,

where d_H is the Hamming distance. This definition is based on a trusted central-server model, where a trusted third party collects sensitive data and computes M(x) to share with untrusted parties. In local DP (Kasiviswanathan et al. (2011)), this model has been extended such that each user shares M(x), and the service provider is untrusted. Our threat model is a special case of local DP which we refer to as single-instance sharing: the client queries every data instance independently with the service provider, and there is no aggregation or summary statistic involved. For example, a user shares a face image to receive an age prediction from the service provider. While our setup

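As a concrete illustration of the (ϵ, δ)-DP guarantee discussed above (with δ = 0), the sketch below applies the standard Laplace mechanism to a toy counting query. The function name, dataset, and the unit-sensitivity counting query are illustrative assumptions, not part of the paper; the only point is that noise is calibrated to sensitivity/ϵ, the same calibration principle PTR applies to a locally measured sensitivity.

```python
import numpy as np

def laplace_mechanism(query_value, sensitivity, epsilon, rng):
    """Release query_value with (epsilon, 0)-DP by adding Laplace noise
    whose scale is the query's sensitivity divided by epsilon."""
    scale = sensitivity / epsilon
    return query_value + rng.laplace(loc=0.0, scale=scale)

# Illustrative counting query: neighboring datasets (Hamming distance 1)
# change the count by at most 1, so the global sensitivity is 1.
rng = np.random.default_rng(0)
x = np.array([1, 0, 1, 1, 0])   # toy dataset of private bits
true_count = int(x.sum())       # f(x) = 3
noisy_count = laplace_mechanism(true_count, sensitivity=1.0,
                                epsilon=1.0, rng=rng)
```

A smaller ϵ inflates the noise scale, trading utility for privacy; this is the same privacy-utility tradeoff the paper measures for ARL representations.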

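The introduction ties privacy to the local Lipschitz constant of the obfuscator, computed exactly with LipMIP. As a rough intuition for what "local" means, the sketch below estimates an empirical lower bound on the local (ℓ2) Lipschitz constant of a toy two-layer ReLU network by taking the largest Jacobian spectral norm over sampled points near an input. The network, radius, and sampling scheme are illustrative assumptions; unlike LipMIP, sampling only ever under-estimates the true constant and certifies nothing.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy obfuscator: a 2-layer ReLU network f(x) = W2 @ relu(W1 @ x).
W1 = rng.normal(size=(8, 4)) * 0.5
W2 = rng.normal(size=(3, 8)) * 0.5

def jacobian(x):
    """Exact Jacobian of the piecewise-linear network at x:
    W2 @ diag(relu'(W1 @ x)) @ W1."""
    mask = (W1 @ x > 0).astype(float)   # ReLU activation pattern at x
    return W2 @ (mask[:, None] * W1)

def local_lipschitz_lower_bound(x0, radius, n_samples=500):
    """Largest Jacobian spectral norm over points sampled on the sphere
    of the given radius around x0: a lower bound on the local Lipschitz
    constant of f over that neighborhood."""
    best = np.linalg.norm(jacobian(x0), ord=2)
    for _ in range(n_samples):
        d = rng.normal(size=x0.shape)
        x = x0 + radius * d / np.linalg.norm(d)
        best = max(best, np.linalg.norm(jacobian(x), ord=2))
    return best

x0 = rng.normal(size=4)
lb = local_lipschitz_lower_bound(x0, radius=0.1)
```

The estimate can only grow with the radius, since a larger neighborhood exposes more ReLU activation patterns; this monotonicity is why a neighborhood-restricted (rather than global) Lipschitz measurement can yield a much tighter sensitivity bound for PTR.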