THERE IS NO TRADE-OFF: ENFORCING FAIRNESS CAN IMPROVE ACCURACY

Abstract

One of the main barriers to the broader adoption of algorithmic fairness in machine learning is the trade-off between fairness and performance of ML models: many practitioners are unwilling to sacrifice the performance of their ML model for fairness. In this paper, we show that this trade-off may not be necessary. If the algorithmic biases in an ML model are due to sampling biases in the training data, then enforcing algorithmic fairness may improve the performance of the ML model on unbiased test data. We study conditions under which enforcing algorithmic fairness helps practitioners learn the Bayes decision rule for (unbiased) test data from biased training data. We also demonstrate the practical implications of our theoretical results in real-world ML tasks.

1. INTRODUCTION

Machine learning (ML) models are routinely used to make or support consequential decisions in hiring, lending, sales, etc. (Citron and Pasquale, 2014). This proliferation of ML models in decision-making and decision-support roles has led to concerns that ML models may inherit (or even exacerbate) social biases in the training data. For example, ProPublica's investigation of COMPAS, the recidivism prediction tool from Northpointe (now Equivant), revealed racial biases against African-Americans (Angwin et al., 2016). In response, the ML community has developed many rigorous definitions of algorithmic fairness, including calibration (Corbett-Davies and Goel, 2018), (statistical) parity (Feldman et al., 2014), equalized odds (Hardt et al., 2016), and individual fairness (Dwork et al., 2011). Researchers have also designed many algorithms for enforcing these definitions during training (Agarwal et al., 2018; Cotter et al., 2019; Yurochkin et al., 2020).

Despite this flurry of work, algorithmic fairness practices remain uncommon in production. We conjecture that this lack of broader adoption stems from an apparent trade-off between accuracy and fairness. Many algorithms that enforce fairness solve optimization problems that maximize how well the model fits the training data subject to fairness constraints. The trade-off arises because imposing fairness constraints usually leads to a model that fits the training data less well than a model obtained by maximizing goodness-of-fit without any extra constraints.

In practice, this trade-off may not be relevant because the training data may be biased. For example, a resume screening model may reject most female applicants for technical roles because women are historically underrepresented in STEM fields, so women are underrepresented in the training data. This is a form of sampling bias, and it causes the model to perform poorly at test time because women are better represented in STEM fields today. In this example, the trade-off is irrelevant because we are mostly concerned with the out-of-distribution (OOD) performance of the model.

There are many other examples of algorithmic bias arising from biases in the training data. As another example, systemic racism in the U.S. criminal justice system disproportionately affects African-Americans, leading to higher rates of arrest, conviction, and incarceration. It is no surprise that recidivism prediction instruments trained on such biased data are biased against African-Americans (Angwin et al., 2016). In 2014, then U.S. Attorney General Eric Holder warned that recidivism prediction instruments "may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society".
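
The apparent trade-off described above can be sketched as a constrained optimization problem. The notation here is an assumption introduced only for illustration and is not taken from the paper: $\mathcal{F}$ is a hypothetical model class, $\mathcal{F}_{\mathrm{fair}} \subseteq \mathcal{F}$ is the subset of models that satisfy a chosen fairness constraint, $\ell$ is a loss function, and $(x_i, y_i)_{i=1}^{n}$ are the (possibly biased) training samples.

```latex
% Illustrative notation only (assumed, not the paper's own).
% Unconstrained fit and fairness-constrained fit on the training data:
\[
  \hat{f} \in \arg\min_{f \in \mathcal{F}}
    \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr),
  \qquad
  \hat{f}_{\mathrm{fair}} \in \arg\min_{f \in \mathcal{F}_{\mathrm{fair}}}
    \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr).
\]
% Minimizing over a subset cannot decrease the optimal value, so
\[
  \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(\hat{f}_{\mathrm{fair}}(x_i), y_i\bigr)
  \;\ge\;
  \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(\hat{f}(x_i), y_i\bigr).
\]
```

The inequality concerns only the loss on the training samples. It says nothing about the loss under an unbiased test distribution that differs from the training distribution, which is the quantity of interest when the training data suffer from sampling bias.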

