ADVERSARIAL REPRESENTATION LEARNING FOR SYNTHETIC REPLACEMENT OF PRIVATE ATTRIBUTES

Abstract

Data privacy is an increasingly important aspect of many real-world big data analytics tasks. Data sources that contain sensitive information may have immense potential that could be unlocked using privacy-enhancing transformations, but current methods often fail to produce convincing output. Furthermore, finding the right balance between privacy and utility is often a delicate trade-off. In this work, we propose a novel approach to data privatization that involves two steps: the first step removes the sensitive information, and the second step replaces it with an independent random sample. Our method builds on adversarial representation learning, which ensures strong privacy by training the model to fool an increasingly strong adversary. While previous methods only aim at obfuscating the sensitive information, we find that adding new random information in its place strengthens the provided privacy and yields better utility at any given level of privacy. The result is an approach that provides stronger privatization of image data while preserving both the domain and the utility of the inputs, entirely independently of the downstream task.

1. INTRODUCTION

The increasing capacity and performance of modern machine learning models lead to increasing amounts of data being required to train them (Goodfellow et al., 2016). However, collecting and using large datasets which may contain sensitive information about individuals is often impeded by increasingly strong privacy laws protecting individual rights, and by the infeasibility of obtaining individual consent. Giving privacy guarantees on a dataset may let us share data while protecting the rights of individuals, thus unlocking the large benefits that big datasets can provide for individuals and for society. In this work, we propose a technique for selective obfuscation of image datasets. The aim is to provide the original data in as much detail as possible while making it hard for an adversary to detect specific sensitive attributes. The proposed solution is agnostic to the downstream task, with the objective of making the data as private as possible given a distortion constraint. This issue has previously been addressed using adversarial representation learning with some success: a filter model is trained to obfuscate sensitive information while an adversary model is trained to recover it (Edwards & Storkey, 2016). In the current work, we demonstrate that it is easier to hide sensitive information if it is replaced with something else: a sample that is independent of the input data. Aside from the adversary module, our proposed solution includes two main components: a filter model that is trained to remove the sensitive attribute, and a generator model that inserts a synthetically generated new value for the sensitive attribute. The generated sensitive attribute is entirely independent of the sensitive attribute in the original input image.
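The remove-then-replace pipeline described above can be made concrete with a schematic numpy sketch. This is not the paper's learned model (which uses trained networks on images); here each record is simply a feature vector whose last entry stands in for the sensitive attribute, and the function names (`filter_step`, `generator_step`, `privatize`) are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

def filter_step(x):
    """Step 1: remove the sensitive attribute (here: zero out the last entry)."""
    z = x.copy()
    z[..., -1] = 0.0
    return z

def generator_step(z, rng):
    """Step 2: insert a fresh sensitive value drawn independently of the input."""
    out = z.copy()
    out[..., -1] = rng.integers(0, 2, size=z.shape[:-1]).astype(float)
    return out

def privatize(x, rng):
    return generator_step(filter_step(x), rng)

# Demo: the released sensitive value is independent of the original one,
# while the non-sensitive features pass through unchanged.
x = rng.normal(size=(10000, 5))
x[:, -1] = rng.integers(0, 2, size=10000)   # original sensitive bits
y = privatize(x, rng)

corr = np.corrcoef(x[:, -1], y[:, -1])[0, 1]
print(f"correlation(original, released) = {corr:.3f}")   # near 0
```

Because the generator draws its sample without looking at the input's sensitive value, an adversary querying the pipeline repeatedly sees fresh, uncorrelated samples, which is the independence property the paper relies on.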
Following a body of work in privacy-related adversarial learning, we evaluate the proposed model on faces from the CelebA dataset (Liu et al., 2015), and consider, for example, the smile or gender of a person to be the sensitive attribute. The smile is an attribute that involves interesting transformations of a human face. The obvious changes reside close to the mouth when a person smiles, but other, subtler changes also occur: the eyelids tighten, dimples show, and the skin wrinkles. The current work includes a thorough analysis of the dataset, including correlations between such features. These correlations make the task interesting and challenging, reflecting the real difficulty that may occur when anonymizing data. What is the right trade-off between preserving utility, defined as allowing information about other attributes to remain, and removing the sensitive information? In our setup, the adversary can make an arbitrary number of queries to the model. For each query, another sample will be produced from the distribution of the sensitive data, while keeping as much as possible of the non-sensitive information about the requested data point.

2. RELATED WORK

Privacy-preserving machine learning has been studied from a number of different angles. Some work assumes access to a privacy-preserving mechanism, such as bounding boxes for faces, and studies how to hide people's identity by blurring (Oh et al., 2016a), removing (Orekondy et al., 2018), or generating the faces of other people (Hukkelås et al., 2019) in their place. Other work assumes access to a utility-preserving mechanism and proposes to obfuscate everything except what should be retained (Alharbi et al., 2019). This raises the question: how do we find the pixels in an image that need to be modified to preserve privacy with respect to some attribute? Furthermore, Oh et al. (2016b) showed that blurring or removing the head of a person has a limited effect on privacy. This finding is crucial: we cannot rely on modifications of an image such as blurring or overpainting to achieve privacy. An adversarial set-up instead captures the signals that the adversary uses, and can attain stronger privacy. Adversarial learning is the process of training a model to fool an adversary (Goodfellow et al., 2014). Both models are trained simultaneously, and become increasingly good at their respective tasks during training. This approach has been successfully used to learn image-to-image transformations (Isola et al., 2017; Choi et al., 2018) and to synthesize properties such as facial expressions (Song et al., 2017; Tang et al., 2019). Privacy-preserving adversarial representation learning utilizes this paradigm to learn representations of data that hide sensitive information (Edwards & Storkey, 2016; Zhang et al., 2018; Xie et al., 2017; Beutel et al., 2017; Raval et al., 2017). Bertran et al. (2019) minimize the mutual information between the utility variable and the input image data conditioned on the learned representation.
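The minimax pattern shared by these approaches can be sketched in a deliberately tiny form. The following numpy toy (not any cited paper's architecture; all constants and names are illustrative) pits a one-parameter noise "filter" against a logistic-regression adversary on a scalar feature correlated with a sensitive bit: the adversary minimizes its cross-entropy, while the filter simultaneously maximizes it subject to a distortion penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a scalar feature x correlated with a sensitive bit s.
n = 4000
s = rng.integers(0, 2, size=n)
x = 2.0 * s + rng.normal(size=n)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

sigma = 0.5        # filter: x' = x + sigma * eps (learned noise scale)
a, b = 0.1, 0.0    # adversary: p(s=1 | x') = sigmoid(a * x' + b)
lam = 0.05         # distortion weight; E[(x' - x)^2] = sigma^2
lr_adv, lr_f = 0.2, 0.05

for step in range(1500):
    eps = rng.normal(size=n)
    xp = x + sigma * eps
    # Adversary steps: minimize cross-entropy on the filtered data.
    for _ in range(5):
        p = sigmoid(a * xp + b)
        a -= lr_adv * np.mean((p - s) * xp)
        b -= lr_adv * np.mean(p - s)
    # Filter step: maximize the adversary's loss, minus a distortion penalty.
    p = sigmoid(a * xp + b)
    grad_ce_sigma = np.mean((p - s) * a * eps)   # d(cross-entropy)/d(sigma)
    sigma += lr_f * (grad_ce_sigma - 2.0 * lam * sigma)
    sigma = max(sigma, 0.0)

# Evaluate: how well does the trained adversary recover s from fresh output?
xp = x + sigma * rng.normal(size=n)
acc = np.mean((sigmoid(a * xp + b) > 0.5) == s)
print(f"noise scale sigma = {sigma:.2f}, adversary accuracy = {acc:.2f}")
```

The arms race settles where the marginal privacy gain from more noise balances the distortion penalty, which is the trade-off the distortion-constrained formulations below make explicit.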
Roy & Boddeti (2019) maximize the entropy of the discriminator output rather than minimizing the log-likelihood, which is beneficial for stronger privacy. Osia et al. (2020) approached the problem using an information bottleneck. Wu et al. (2018), Ren et al. (2018), and Wang et al. (2019) learn transformations of video that respect a privacy budget while maintaining performance on a downstream task. Tran et al. (2018) proposed an approach for pose-invariant face recognition. Similar to our work, their approach used adversarial learning to disentangle specific attributes in the data. Oh et al. (2017) trained a model to add a small amount of noise to the input to hide the identity of a person. Xiao et al. (2020) learn a representation from which it is hard to reconstruct the original input, but from which it is possible to predict a predefined task. The method provides control over which attributes are preserved, but no control over which attributes are censored. That is, it puts more emphasis on preserving utility than privacy, which is not always desired. All of these, with the exception of Edwards & Storkey (2016) (see below), depend on knowing the downstream task labels. Our work has no such dependency: the data produced by our method is designed to be usable regardless of the downstream task. Edwards & Storkey (2016) include a limited experiment which does not depend on the downstream task: they remove sensitive text which was overlaid on images, a task which is much simpler than the real-world problem considered in the current work. The overlaid text is independent of the underlying image, and therefore the solution does not require a trade-off between utility and privacy, which is the case in most real settings. Furthermore, we also replace the sensitive information with synthetic information, which we show further strengthens the privacy. As in the current work, Huang et al. (2017; 2018) use adversarial learning to minimize the mutual information between the private attribute and the censored image under a distortion constraint. Our solution extends and improves upon these ideas with a modular design consisting of a filter that is trained to obfuscate the data, and a generator that further enhances privacy by adding new independently sampled synthetic information for the sensitive attributes.
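One common way to write the distortion-constrained objective shared by these works is the following; the notation here is ours rather than any cited paper's, with $S$ the sensitive attribute, $X$ the input image, $X'$ the released image produced by a (possibly stochastic) mechanism $q$, $d$ a distortion measure, and $\epsilon$ the distortion budget:

```latex
\begin{equation}
  \min_{q(x' \mid x)} \; I(S; X')
  \quad \text{subject to} \quad
  \mathbb{E}\left[\, d(X, X') \,\right] \le \epsilon ,
\end{equation}
```

where the mutual information term $I(S; X')$ is intractable in general, so the trained adversary serves as its variational surrogate: the better the best adversary can predict $S$ from $X'$, the larger the mutual information, and fooling an increasingly strong adversary drives it down.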

