A TECHNICAL AND NORMATIVE INVESTIGATION OF SOCIAL BIAS AMPLIFICATION

Anonymous authors
Paper under double-blind review

Abstract

The conversation around the fairness of machine learning models is growing and evolving. In this work, we focus on the issue of bias amplification: the tendency of models trained on data containing social biases to further amplify those biases. This problem is introduced by the algorithm, on top of the level of bias already present in the data. We make two main contributions regarding its measurement. First, building off of Zhao et al. (2017), we introduce and analyze a new, decoupled metric for measuring bias amplification, BiasAmp→, which possesses a number of attractive properties, including the ability to pinpoint the cause of bias amplification. Second, we thoroughly analyze and discuss the normative implications of this metric. We provide suggestions about its measurement: cautioning against predicting sensitive attributes, encouraging the use of confidence intervals because the fairness of models fluctuates across runs, and discussing what bias amplification means in domains where labels either do not exist at test time or correspond to uncertain future events. Throughout this paper, we work to provide a deeply interrogative look at the technical measurement of bias amplification, guided by our normative ideas of what we want it to encompass.

1. INTRODUCTION

The machine learning community is becoming increasingly cognizant of problems surrounding fairness and bias, and correspondingly a plethora of new algorithms and metrics are being proposed (see, e.g., Mehrabi et al. (2019) for a review). The gatekeepers checking systems before deployment often take the form of fairness evaluation metrics, and it is vital that these be deeply investigated both technically and normatively. In this paper, we endeavor to do this for bias amplification. Bias amplification occurs when a model exacerbates biases from the training data at test time. It is the result of the algorithm (Foulds et al., 2018) and, unlike other forms of bias, cannot be attributed solely to the dataset.

To this end, we propose a new way of measuring bias amplification, BiasAmp→,¹ that builds off a prior metric from Men Also Like Shopping (Zhao et al., 2017), which we call BiasAmp_MALS. Our metric's technical composition aligns with the real-world qualities we want it to encompass, addressing a number of the previous metric's shortcomings by being able to: 1) generalize beyond binary attributes, 2) take into account the base rates at which people of each attribute appear, and 3) disentangle the directions of amplification. Concretely, consider a visual dataset (Fig. 1) where each image has a label for the task, T, which is painting or not painting, and is further associated with a protected attribute, A, which is woman or man. If the gender of the person biases the prediction of the task, we consider this A → T bias amplification; if the reverse happens, then T → A.

Our normative discussion covers several topics. We consider whether predicting protected attributes is necessary in the first place; by not doing so, we can trivially remove T → A amplification.
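To make the A → T direction concrete, the sketch below measures, for a single binary attribute and binary task, how far the model's attribute-conditioned prediction rate drifts from the ground-truth rate, signed by the direction of the correlation already present in the data. This is a simplified illustration of the decoupling idea, not the exact BiasAmp→ formula; the function name and toy data are our own.

```python
import numpy as np

def biasamp_a_to_t(attr, task_true, task_pred):
    """Simplified A->T amplification sketch for one binary attribute A
    and one binary task T (all arrays are 0/1 indicators)."""
    attr, task_true, task_pred = map(np.asarray, (attr, task_true, task_pred))
    # Sign of the existing bias: is (A=1, T=1) over-represented in the
    # ground truth relative to independence?
    p_a, p_t = attr.mean(), task_true.mean()
    p_at = (attr & task_true).mean()
    y = 1.0 if p_at > p_a * p_t else -1.0
    # How much does the model's task prediction, conditioned on A=1,
    # exceed the ground-truth conditional rate?
    delta = task_pred[attr == 1].mean() - task_true[attr == 1].mean()
    # Positive => the model pushes further in the direction of the
    # existing data bias, i.e., it amplifies it.
    return y * delta

# Toy example: A=1 (woman) is correlated with T=1 (painting) in the
# data, and the model over-predicts painting for women at test time.
attr      = np.array([1, 1, 1, 1, 0, 0, 0, 0])
task_true = np.array([1, 1, 1, 0, 1, 0, 0, 0])
task_pred = np.array([1, 1, 1, 1, 1, 0, 0, 0])
print(biasamp_a_to_t(attr, task_true, task_pred))  # → 0.25
```

The T → A direction is the mirror image: condition on the task label and compare the predicted attribute rate against the ground-truth attribute rate.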
We also encourage the use of confidence intervals when reporting our metric, because BiasAmp→, like other fairness metrics, suffers from the Rashomon Effect (Breiman, 2001), or the multiplicity of good models. In deep neural networks, random seeds have relatively little impact on accuracy; the same is not true for fairness, which is far more brittle to randomness.
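As an illustration of this recommendation, one can retrain the same model under several random seeds and report a normal-approximation interval over the resulting metric values. The per-seed numbers below are invented purely for illustration.

```python
import numpy as np

# Hypothetical per-seed values of a fairness metric (e.g., BiasAmp->)
# obtained by retraining the same architecture with different seeds.
seed_scores = np.array([0.031, 0.048, 0.022, 0.055, 0.040])

mean = seed_scores.mean()
# Standard error of the mean (sample std, ddof=1) and a
# normal-approximation 95% confidence interval.
sem = seed_scores.std(ddof=1) / np.sqrt(len(seed_scores))
lo_ci, hi_ci = mean - 1.96 * sem, mean + 1.96 * sem
print(f"estimate: {mean:.3f}  95% CI: [{lo_ci:.3f}, {hi_ci:.3f}]")
```

Reporting the interval, rather than a single run's value, makes the seed-to-seed brittleness of the fairness measurement visible.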



¹The arrow in BiasAmp→ signifies the direction in which bias amplification flows; it is not intended as a claim about causality.

