UNFAIR GEOMETRIES: EXACTLY SOLVABLE DATA MODEL WITH FAIRNESS IMPLICATIONS

Abstract

Machine learning (ML) may be oblivious to human bias, but it is not immune to perpetuating it. Marginalisation and inequitable group representation are often traceable in the very data used for training, and may be reflected or even amplified by the learning models. In the present work, we aim to clarify the role played by data geometry in the emergence of ML bias. We introduce an exactly solvable high-dimensional model of data imbalance, where parametric control over the many bias-inducing factors allows for an extensive exploration of the bias-inheritance mechanism. Through the tools of statistical physics, we analytically characterise the typical properties of learning models trained in this synthetic framework and obtain exact predictions for the observables commonly employed for fairness assessment. Despite the simplicity of the data model, we retrace and unpack typical unfairness behaviour observed on real-world datasets. We also obtain a detailed analytical characterisation of a class of bias-mitigation strategies. We first consider a basic loss-reweighing scheme, which allows for an implicit minimisation of different unfairness metrics, and quantify the incompatibilities between some existing fairness criteria. We then consider a novel mitigation strategy based on a matched inference approach, which introduces coupled learning models. Our theoretical analysis of this approach shows that the coupled strategy can strike superior fairness-accuracy trade-offs.

1. INTRODUCTION

Machine Learning (ML) systems are being actively integrated into multiple aspects of our lives, from face-recognition systems on our phones, to applications in the fashion industry, to high-stakes scenarios such as healthcare. Together with the advantages of automating these processes, however, we must also face the consequences of their (often hidden) failures. Recent studies Buolamwini & Gebru (2018); Weidinger et al. (2021) have shown that these systems can exhibit significant disparities in failure rates across the multiple sub-populations targeted by an application. ML systems appear to perpetuate discriminatory behaviours that align with those present in our society Benjamin (2019); Noble (2018); Eubanks (2018); Broussard (2018). Discrimination against marginalised groups can originate at many levels of the ML pipeline, from the very problem definition, to data collection, to the training and deployment of the ML algorithm Suresh & Guttag (2021).

Data represents a critical source of bias Perez (2019). In some cases, the dataset contains a record of a history of discriminatory behaviour, causing complex dependencies that are hardly eradicated even when the explicit discriminatory attribute is removed. In other cases (or even concurrently), the root of the discrimination lies in the data collection process and is related to the structural properties of the dataset. Unequal representation of different sub-populations typically induces major bias in the ML predictions. Drug testing provides a historically significant example: substantial evidence Hughes (2007); Perez (2019) shows that the scarcity of data points corresponding to women in drug-efficacy studies resulted in a larger number of side effects in that group.

In spite of a vast empirical literature, a large gap remains in the theoretical understanding of the bias-induction mechanism. A better theoretical grasp of this issue could help raise awareness and inform the design of more theoretically grounded and effective solutions. In this work, we aim to address this gap by introducing a novel synthetic data model, offering a controlled setting where data imbalances and the emergence of bias become more transparent and can be better understood.
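To make the notion of representation imbalance concrete, the following minimal sketch generates a synthetic dataset with two sub-populations of controllable size and geometry, trains a standard classifier, and reports per-group test accuracy. This is an illustrative assumption-based toy setup, not the exactly solvable model analysed in the paper; all parameter names and values (group sizes, shifts, the teacher rule) are hypothetical choices made for the example.

```python
# Illustrative sketch (not the paper's exact model): two sub-populations with
# different sample sizes share a common labelling rule, and the per-group test
# accuracy of a standard classifier exposes the effect of the imbalance.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 50                        # input dimension
n_major, n_minor = 900, 100   # imbalanced group sizes (10% minority fraction)

# A shared "teacher" direction generates the labels for both groups.
teacher = rng.standard_normal(d) / np.sqrt(d)

def sample_group(n, shift, scale):
    """Draw n points from a group-specific Gaussian cloud and label them
    with the shared teacher rule."""
    x = shift + scale * rng.standard_normal((n, d))
    y = np.sign(x @ teacher)
    return x, y

# Groups differ in size and, optionally, in their centre or spread.
x_a, y_a = sample_group(n_major, shift=0.0, scale=1.0)
x_b, y_b = sample_group(n_minor, shift=0.5, scale=1.0)

X = np.vstack([x_a, x_b])
y = np.concatenate([y_a, y_b])

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Fresh test sets reveal the per-group generalisation gap.
xta, yta = sample_group(2000, shift=0.0, scale=1.0)
xtb, ytb = sample_group(2000, shift=0.5, scale=1.0)
print("majority test accuracy:", clf.score(xta, yta))
print("minority test accuracy:", clf.score(xtb, ytb))
```

In such a setup, varying the minority fraction and the group-specific shift or scale makes the resulting accuracy gap directly observable, which is the kind of controlled exploration the synthetic framework introduced in this work enables analytically.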

