FAIR FEDERATED LEARNING VIA BOUNDED GROUP LOSS

Abstract

Fair prediction across protected groups is an important constraint for many federated learning applications. However, prior work studying group fair federated learning lacks formal convergence or fairness guarantees. In this work we propose a general framework for provably fair federated learning. In particular, we explore and extend the notion of Bounded Group Loss as a theoretically-grounded approach for group fairness. Using this setup, we propose a scalable federated optimization method that optimizes the empirical risk under a number of group fairness constraints. We provide convergence guarantees for the method as well as fairness guarantees for the resulting solution. Empirically, we evaluate our method across common benchmarks from fair ML and federated learning, showing that it can provide both fairer and more accurate predictions than baseline approaches.

1. INTRODUCTION

Group fairness aims to mitigate unfair biases against certain protected demographic groups (e.g., race, gender, age) in the use of machine learning. Many methods have been proposed to incorporate group fairness constraints in centralized settings (e.g., Agarwal et al., 2018; Feldman et al., 2015; Hardt et al., 2016; Zafar et al., 2017a). However, there is a lack of work studying these approaches in the context of federated learning (FL), a training paradigm where a model is fit to data generated by a set of disparate data silos, such as a network of remote devices or a collection of organizations (Kairouz et al., 2019; Li et al., 2020; McMahan et al., 2017).

Mirroring concerns around fairness in non-federated settings, many FL applications similarly require fair prediction across protected groups. Unfortunately, as we show in Figure 1, naively applying existing approaches to each client in a federated network in isolation may be inaccurate due to heterogeneity across clients, failing to produce a fair model across the entire population (Zeng et al., 2021). Several recent works have considered addressing this issue by exploring specific forms of group fairness in FL (e.g., Chu et al., 2021; Cui et al., 2021; Du et al., 2021; Papadaki et al., 2022; Rodríguez-Gálvez et al., 2021; Zeng et al., 2021). Despite promising empirical performance, these prior works lack formal guarantees surrounding the resulting fairness of the solutions (Section 2), which is problematic as it is unclear how the methods may perform in real-world FL deployments.

In this work we provide a formulation and method for group fair FL that can provably satisfy global fairness constraints. Common group fairness notions that aim to achieve equal prediction quality between any two protected groups (e.g., Demographic Parity, Equal Opportunity (Hardt et al., 2016)) are difficult to provably satisfy while simultaneously finding a model with high utility.
Instead, we consider a different fairness notion known as Bounded Group Loss (BGL) (Agarwal et al., 2019), which bounds the loss of the worst-performing group, as a way to capture these common group fairness criteria. As we show, a benefit of this approach is that in addition to having practical advantages in terms of fairness-utility trade-offs (Section 5), it maintains smoothness and convexity properties that can equip our solver with favorable theoretical guarantees. Based on our group fairness formulation, we then provide a scalable method (PFFL) to solve the proposed objectives via federated saddle point optimization. Theoretically, we provide convergence guarantees for the method as well as fairness and generalization guarantees for the resulting solutions. Empirically, we demonstrate the effectiveness of our approach on common benchmarks from fair machine learning and federated learning. We summarize our main contributions below:

• We propose a novel fair federated learning framework for a range of group fairness notions. Our framework models the fair FL problem as a saddle point optimization problem and leverages variations of Bounded Group Loss (Agarwal et al., 2019) to capture common forms of group fairness. We also extend BGL to consider a new fairness notion called Conditional Bounded Group Loss (CBGL), which may be of independent interest and utility in non-federated settings.

• We propose a scalable federated optimization method for our group fair FL framework. We provide a regret bound analysis for our method under convex ML objectives to demonstrate formal convergence guarantees. Further, we provide fairness and generalization guarantees on the model for a variety of fairness notions.

• Finally, we evaluate our method on common benchmarks used in fair machine learning and federated learning. In all settings, we find that our method can significantly improve model fairness compared to baselines without sacrificing model accuracy. Additionally, even though we do not directly optimize classical group fairness constraints (e.g., Demographic Parity, Equal Opportunity), we find that our method can still provide comparable or better fairness-utility trade-offs relative to existing approaches when evaluated on these metrics.
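To make the BGL criterion concrete: a model satisfies BGL at level ζ if every protected group's expected loss is at most ζ. Below is a minimal sketch of an empirical BGL check; the function name and the NumPy-based interface are our own illustration, not part of the PFFL implementation.

```python
import numpy as np

def bounded_group_loss_violations(losses, groups, zeta):
    """Check the Bounded Group Loss (BGL) criterion empirically: the
    average loss of every protected group must stay at or below zeta.

    losses: per-example losses, shape (n,)
    groups: per-example protected-group labels, shape (n,)
    zeta:   scalar bound on each group's average loss
    Returns a dict mapping group -> amount by which that group's average
    loss exceeds zeta (0.0 if the constraint is satisfied for that group).
    """
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    violations = {}
    for g in np.unique(groups):
        group_loss = losses[groups == g].mean()
        violations[g] = max(0.0, group_loss - zeta)
    return violations
```

BGL is satisfied exactly when all returned violations are zero; a training procedure like the one proposed here drives these quantities toward zero during optimization rather than checking them post hoc.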

2. BACKGROUND AND RELATED WORK

Fair Machine Learning. Algorithmic fairness in machine learning aims to identify and correct bias in the learning process. Common approaches for obtaining fairness include pre-processing methods that rectify the features or raw data to enhance fairness (Calmon et al., 2017; Feldman et al., 2015; Zemel et al., 2013); post-processing methods that revise the prediction score of a trained model (Dwork et al., 2018; Hardt et al., 2016; Menon & Williamson, 2018); and in-processing methods that directly modify the training objective/solver to produce a fair predictor (Agarwal et al., 2018; 2019; Woodworth et al., 2017; Zafar et al., 2017a;b). Most existing methods in fair ML rely on a centralized dataset to train and evaluate the model. As shown in Figure 1, in the federated setting where data is privately distributed across different data silos, directly applying these methods locally only ensures fairness for each silo rather than for the entire population. Developing effective and efficient techniques for fair FL is thus an important area of study.

Fair Federated Learning. In FL, definitions of fairness may take many forms. A commonly studied notion of fairness is representation parity (Hashimoto et al., 2018), whose application in FL requires the model's performance across all clients to have small variance (Donahue & Kleinberg, 2021; Li et al., 2019a; 2021; Mohri et al., 2019; Yue et al., 2021). In this work we instead focus on notions of group fairness, in which every data point in the federated network belongs to some (possibly) protected group, and we aim to find a model that does not introduce bias towards any group. Recent works have proposed various objectives for group fairness in federated learning. Zeng et al. (2021) propose a bi-level optimization objective that minimizes the difference between each group's loss while finding an optimal global model.
Similarly, several works propose a constrained optimization problem that aims to find the best model subject to an upper bound on the group loss difference (Chu et al., 2021; Cui et al., 2021; Du et al., 2021; Rodríguez-Gálvez et al., 2021). Different from these approaches, our method focuses on a fairness constraint that upper-bounds the loss of each group with a constant rather than bounding the loss difference between any two groups. More closely related to our work, Papadaki et al. (2022) weigh the empirical loss of each group by a trainable vector λ and find the best model for the worst-case λ. Though similar to our method for ζ = 0, this approach fails to achieve both strong utility and fairness performance under non-convex loss functions (see Section 5). Zhang et al. (2021) also propose a similar objective to learn a model with unified fairness. Among these works, Zeng et al. (2021) and Cui et al. (2021) also provide simplified convergence and fairness guarantees for their methods. However, these works lack formal analyses of convergence for arbitrary convex loss functions, as well as of the behavior of the fairness constraint over the true data distribution. Ours is the first work we are aware of to provide such guarantees in the context of group fair federated learning.
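The λ-weighted saddle-point view discussed above can be sketched as a simple primal-dual update: the dual variable for each group rises when that group's loss exceeds the bound ζ, which in turn up-weights that group in the primal gradient. The following toy sketch uses a linear model with squared loss; the step sizes, the dual bound B, and the assumption that groups are labeled 0..G-1 are illustrative choices of ours, not the paper's actual PFFL algorithm.

```python
import numpy as np

def primal_dual_step(w, lam, X, y, groups, zeta, eta_w=0.1, eta_lam=0.1, B=10.0):
    """One primal-dual step for the saddle-point formulation
        min_w max_{0 <= lam <= B}  mean_loss(w) + sum_g lam_g * (group_loss_g(w) - zeta).

    Toy sketch: linear model, squared loss, full-batch gradients.
    Assumes group labels are 0..G-1 so that lam[g] indexes group g.
    """
    residuals = X @ w - y
    losses = residuals ** 2                  # per-example squared loss
    grads = 2 * residuals[:, None] * X       # per-example gradient w.r.t. w, shape (n, d)

    grad_w = grads.mean(axis=0).copy()       # gradient of the mean loss
    new_lam = lam.copy()
    for g in np.unique(groups):
        mask = groups == g
        # Dual ascent: raise lam_g when group g's loss exceeds the bound zeta,
        # then project onto [0, B].
        new_lam[g] = min(B, max(0.0, lam[g] + eta_lam * (losses[mask].mean() - zeta)))
        # The current dual weight scales group g's contribution to the primal gradient.
        grad_w += lam[g] * grads[mask].mean(axis=0)
    new_w = w - eta_w * grad_w               # primal descent
    return new_w, new_lam
```

In the federated setting, the per-group loss and gradient statistics in this sketch would be aggregated across clients by the server rather than computed on one centralized dataset, which is the role of the federated saddle-point solver developed in the paper.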



Figure 1: Naively applying fair learning method locally at each client might be problematic. Left: Due to data heterogeneity in FL, data distributions conditioned on each protected attribute (shown in different colors) may differ across clients. Fair FL aims to learn a model that provides fair prediction on the entire data distribution. Right: Empirical results (ACS dataset) verify that training with local fairness constraints alone induces higher error and worse fairness than using a global fairness constraint.

