COMPUTATIONAL-UNIDENTIFIABILITY IN REPRESENTATION FOR FAIR DOWNSTREAM TASKS

Abstract

Deep representation learning methods are in the spotlight because they outperform classical algorithms in various downstream tasks, such as classification, clustering, and generative modeling. Owing to their success and real-world impact, fairness concerns have attracted noticeable attention. However, the fairness problem has mostly been studied for a specific downstream task, typically classification. We claim that fairness problems in various downstream tasks originate from the input feature space, i.e., the learned representation space. While several studies have explored fair representations for the classification task, fair representation learning for unsupervised learning has not yet been actively discussed. To fill this gap, we define a new notion of fairness, computational-unidentifiability, which characterizes the fairness of a representation as the distributional independence of the sensitive groups. We present motivating problems demonstrating that achieving computationally-unidentifiable representations is critical for fair downstream tasks. Moreover, we propose a novel fairness metric, the Fair Fréchet distance (FFD), to quantify computational-unidentifiability and to address the limitations of a well-known fairness metric for unsupervised learning, i.e., balance. The proposed metric is computationally efficient and preserves desirable theoretical properties. We empirically validate the effectiveness of computationally-unidentifiable representations in various downstream tasks.

1. INTRODUCTION

Thanks to its outstanding performance, deep learning has been widely applied to various domains, including natural language processing (NLP) (Devlin et al., 2018), computer vision (Karras et al., 2019), and generative models (Goodfellow et al., 2014). At the same time, reliability and fairness concerns (Lee & Floridi, 2020; Angwin et al., 2016; Dastin, 2018) have grown due to deep learning's impact on real-world applications. Such fairness concerns include credit limit estimation (Vigdor, 2019), job application filtering (Dastin, 2018), and crime prevention (Dressel & Farid, 2018). Accordingly, algorithmic fairness is receiving growing attention as a means to prevent biased predictions. Following the mainstream fairness literature, we focus on group fairness (Dua & Graff, 2019; Zafar et al., 2015; Hardt et al., 2016), which requires the equality of certain statistical measures (e.g., true positive rate, positive prediction rate) between subgroups with different protected attributes (e.g., gender, race, religion). Group fairness has been widely studied to mitigate fairness violations in downstream tasks. Numerous studies (Hardt et al., 2016; Choi et al., 2020; Pleiss et al., 2017; Madras et al., 2018) explore how to attain group fairness in classification tasks; the primary objective of this family of work is to make predictions independent of the protected attribute. Hardt et al. (2016) suggest equal opportunity, which requires the same true positive rate for each subgroup. Calibration among subgroups (Kleinberg et al., 2016) requires the predicted probabilities to match the actual distribution of the favorable class. Moreover, some works (Kim et al., 2020; Jang et al., 2021) study efficient multi-constraint optimization to satisfy multiple fairness notions simultaneously. However, most of these works focus on the supervised setting.
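To make the equal-opportunity notion above concrete, the following is a minimal sketch (not the paper's implementation) of how the per-subgroup true positive rates and their gap could be computed for a binary classifier with a binary protected attribute; the function name and toy data are illustrative assumptions.

```python
import numpy as np

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true positive rates between two subgroups.

    Equal opportunity (Hardt et al., 2016) asks that
    P(y_pred = 1 | y_true = 1, group = g) be equal for all g,
    so a gap of 0 means the constraint is satisfied.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs = []
    for g in (0, 1):
        positives = (group == g) & (y_true == 1)   # actual positives in subgroup g
        tprs.append(y_pred[positives].mean())      # fraction of them predicted positive
    return abs(tprs[0] - tprs[1])

# Toy data (hypothetical): subgroup 0's positives are all recovered (TPR = 1.0),
# subgroup 1's positives are recovered half the time (TPR = 0.5).
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0]
group  = [0, 0, 0, 1, 1, 1]
print(equal_opportunity_gap(y_true, y_pred, group))  # 0.5
```

The same pattern extends to other group-fairness measures mentioned above (e.g., positive prediction rate) by changing the conditioning mask.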
Even though deep learning has achieved significant success in various unsupervised learning tasks, such as clustering (Xie et al., 2016; Guo et al., 2017), generative models (Karras et al., 2019; Radford et al., 2019), and NLP (Hadifar et al., 2019), the fairness of unsupervised learning remains relatively understudied (Buet-Golfouse & Utyagulov, 2022), and how to quantify the fairness of unsupervised learning methods has not been

