AUDITING FAIRNESS ONLINE THROUGH INTERACTIVE REFINEMENT

Abstract

Machine learning algorithms are increasingly being deployed in high-stakes scenarios. A sizeable proportion of currently deployed models make their decisions in a black-box manner. Such decision-making procedures are susceptible to intrinsic biases, which has led to calls for accountability in deployed decision systems. In this work, we focus on user-specified accountability of the decision-making processes of black-box systems. Previous work has formulated this problem as run-time fairness monitoring over decision functions. However, formulating appropriate specifications for situation-appropriate fairness metrics is challenging. We construct AVOIR, an automated inference-based optimization system that improves bounds for, and generalizes, prior work across a wide range of fairness metrics. AVOIR offers an interactive and iterative process for exploring fairness violations aligned with governance and regulatory requirements. Our bounds improve over previous probabilistic guarantees for such fairness grammars in online settings. We also construct a novel visualization mechanism that can be used to investigate the context of reported fairness violations and guide users towards meaningful and compliant fairness specifications. We then conduct case studies with fairness metrics on three different datasets and demonstrate how the visualization and improved optimization detect fairness violations more efficiently and ameliorate the issues with faulty fairness metric design.

1. INTRODUCTION

The use of advanced analytics and artificial intelligence (AI), along with its many benefits, poses important threats to individuals and society at large. Hirsch et al. (2020) identify invasion of privacy; manipulation of vulnerabilities; bias against protected classes; increased power imbalances; error; opacity and procedural unfairness; displacement of labor; pressure to conform; and intentional and harmful use as some of the key areas of concern. A core part of the solution to mitigate such risks is the need to make organizations accountable and ensure that the data they leverage and the models they build and use are both inclusive of marginalized groups and resilient against societal bias. Deployed AI and analytic systems are complex multi-step processes that can produce several sources of risk at each step. At each of these stages, determining accountability in AI decision-making processes requires a determination of who is accountable, for what, to whom, and under what circumstances (Nissenbaum, 1996; Cooper et al., 2022). A more comprehensive overview of the mechanisms that can support accountability with respect to the different stages of design of a machine learning system can be found in the work of Cooper et al. (2022). We center our analysis on the subproblem of auditing barriers towards investigating claims surrounding mathematical guarantees of automated decision-making processes.

Governments across the world are wrestling with the implementation of auditing regulation and practices for increasing the accountability of decision processes. Recent examples include the New York City auditing requirements for AI hiring tools (Vanderford, 2022), European data regulation (GDPR, 2018), accountability bills (2019; 2021), and judicial reports (2018).
These societal forces have led to the emergence of checklists (Mitchell et al., 2019; Sokol & Flach, 2020), metrics of fairness (Verma & Rubin, 2018), and, recently, algorithms and systems that observe and audit the behavior of AI algorithms. Such ideas date back to the 1950s (Moore, 1956), but research has largely been sporadic until very recently, when the widespread use of AI-based decision making gave rise to the vision of algorithmic auditing (Galdon Clavell et al., 2020). We present a framework for Auditing and Verifying fairness Online through Interactive Refinement (AVOIR). AVOIR builds upon the ideas on distributional probabilistic fairness guarantees (Albarghouthi & Vinitsky, 2019; Bastani et al., 2019), generalizing them to real-world data. An overview of AVOIR is provided in Figure 1.

1.1 PRELIMINARIES

Machine learning testing (Zhang et al., 2020) is an avenue that can be used to expose undesired behavior and improve the trustworthiness of machine learning systems. Fairness criteria quantify the relationship between the outcome metric across multiple subgroups or similar individuals among the population. Formal definitions of fairness focus on observational criteria, i.e., those that can be written down as a probability statement involving the joint distribution of the features, sensitive attributes, decision-making function, and actual outcome. Our framework, AVOIR, supports implementing a large range of group fairness criteria, including demographic parity (Calders et al., 2009), equal opportunity (Hardt et al., 2016), disparate mistreatment (Zafar et al., 2017), and various combinations of these criteria. As an example, suppose r ∈ {0, 1} denotes the return value of a binary decision function (say, candidate selection for a job), and s is an indicator denoting whether a candidate belongs to a minority population.
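As a concrete illustration of such an observational criterion, the following sketch estimates demographic parity from a stream of decisions. The helper name and the toy data are hypothetical, chosen for illustration; this is not AVOIR's implementation:

```python
from collections import defaultdict

def demographic_parity_gap(stream):
    """Empirical demographic parity gap: |Pr[r=1 | s=1] - Pr[r=1 | s=0]|.

    `stream` yields (s, r) pairs, where s indicates membership in the
    protected group and r is the binary decision.
    """
    counts = defaultdict(int)     # observations per group
    positives = defaultdict(int)  # positive decisions per group
    for s, r in stream:
        counts[s] += 1
        positives[s] += r
    rates = {g: positives[g] / counts[g] for g in counts}
    return abs(rates.get(1, 0.0) - rates.get(0, 0.0))

# Toy stream: group 1 selected 1/2 of the time, group 0 selected 3/4.
stream = [(1, 1), (1, 0), (0, 1), (0, 1), (0, 1), (0, 0)]
gap = demographic_parity_gap(stream)  # 0.75 - 0.5 = 0.25
```

A criterion such as demographic parity would then bound this gap by a constant, in the same way the specifications below bound ratios of group-wise expectations.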
The 80%-rule for disparate impact (EEOC, 1979; Feldman et al., 2015) is a fairness criterion which states that

Pr[r = 1 | s] / Pr[r = 1 | ¬s] ≥ 0.8

When implemented in the AVOIR DSL grammar, the above 80%-rule corresponds to the specification E[r | S == s] / E[r | S != s] >= 0.8. Here, the term E[r | S == s] / E[r | S != s] is a subexpression of the specification. The smallest units involving an expectation (e.g., E[r | S != s]) are denoted as elementary subexpressions. Our algorithm works by using adaptive concentration sets (Zhao et al., 2016; Howard et al., 2021) to build estimates for elementary subexpressions, and then deriving the estimates for expressions that combine them.

We aim to derive statistical guarantees about fairness criteria based on estimates from observed outputs. For example, let X be an observed Bernoulli r.v.; then an assertion ϕ_X = (Ê[X], ε, δ) over X corresponds to an estimate satisfying

ϕ_X ≡ Pr[|Ê[X] - E[X]| ≥ ε] ≤ δ    (1)

where Ê[X] denotes an empirical estimate of E[X]. We then use assertions ϕ_X, ϕ_Y to assert claims for expressions involving X and Y; for example, for the 80%-rule, assertions over X/Y. A specification involves either a comparison of expressions with constants (e.g., X/Y > 0.8) or a combination of multiple such comparisons. Such a specification may be True (T) or False (F) with some probability. For a given specification ψ, we denote the claim that Pr[ψ = F] ≥ 1 - δ as ψ : (F, δ), where δ denotes the failure probability of the guarantee. Given a stream of observations and the corresponding outcomes from the decision function, and a specified threshold probability δ, we continue to refine the estimate for a given specification until we reach the threshold. We focus on fairness criteria that can be expressed using Bernoulli r.v.s, as this allows probabilities to be rewritten as expectations, e.g., Pr[r = 1] = E[r].
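The mechanics of propagating an assertion of the form (1) from elementary subexpressions to a ratio can be sketched as follows. This is a simplified illustration using a fixed-sample-size Hoeffding bound and a union bound over the two groups; AVOIR's actual adaptive concentration sets admit data-dependent stopping times and differ in form. All function names and data here are hypothetical:

```python
import math

def hoeffding_assertion(samples, delta):
    """Return (est, eps) so that Pr[|est - E[X]| >= eps] <= delta
    for i.i.d. samples bounded in [0, 1] (Hoeffding's inequality)."""
    n = len(samples)
    est = sum(samples) / n
    eps = math.sqrt(math.log(2 / delta) / (2 * n))
    return est, eps

def ratio_lower_bound(assert_x, assert_y):
    """Worst-case lower bound on E[X]/E[Y] from two assertions; it
    holds with failure probability delta_x + delta_y (union bound)."""
    (ex, eps_x), (ey, eps_y) = assert_x, assert_y
    return max(ex - eps_x, 0.0) / (ey + eps_y)

# Check the 80%-rule specification E[r|S==s] / E[r|S!=s] >= 0.8
# on synthetic decision streams (selection rate 0.75 in both groups).
minority = [1, 1, 0, 1] * 500   # decisions observed when S == s
majority = [1, 1, 1, 0] * 500   # decisions observed when S != s
ax = hoeffding_assertion(minority, delta=0.025)
ay = hoeffding_assertion(majority, delta=0.025)
satisfied = ratio_lower_bound(ax, ay) >= 0.8  # fails w.p. <= 0.05
```

In an online setting, the same check would be re-evaluated as observations arrive: the widths ε shrink with n, so a specification that is initially indeterminate eventually resolves once the combined failure probability drops below the user-specified δ.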
Specifications involving variables that take more than two values can be implemented using transformations and Boolean operators (examples in Appendix H).
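For instance, a multi-valued sensitive attribute can be reduced to Bernoulli indicators, one per category, over which the usual group-fairness comparisons apply. The sketch below (hypothetical names and toy data, not the AVOIR DSL itself) illustrates the idea with a three-valued attribute and a pairwise four-fifths check:

```python
def indicator(value, category):
    """Transform a multi-valued attribute into a Bernoulli indicator."""
    return 1 if value == category else 0

# Toy (attribute, decision) records for a three-valued attribute.
records = [("asian", 1), ("black", 0), ("white", 1),
           ("black", 1), ("white", 1), ("asian", 0)]

# Per-group selection rates via the indicator transformation, so each
# Pr[r = 1 | group] is the expectation of a Bernoulli r.v.
groups = {g for g, _ in records}
rates = {}
for g in groups:
    selected = [r for attr, r in records if indicator(attr, g)]
    rates[g] = sum(selected) / len(selected)

# A specification over all pairwise ratios, e.g. min/max rate >= 0.8,
# combines the indicator-based comparisons with Boolean operators.
four_fifths = min(rates.values()) / max(rates.values()) >= 0.8
```

Each indicator yields an elementary subexpression of the form E[r | G == g], and conjunctions or disjunctions over the pairwise comparisons form the full specification.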



Figure 1: Shaded nodes describe our contributions in the AVOIR framework.

