BIAS PROPAGATION IN FEDERATED LEARNING

Abstract

We show that participating in federated learning can be detrimental to group fairness. In fact, the bias of a few parties against under-represented groups (identified by sensitive attributes such as gender or race) can propagate through the network to all parties. We analyze and explain bias propagation in federated learning on naturally partitioned real-world datasets. Our analysis reveals that biased parties unintentionally, yet stealthily, encode their bias in a small number of model parameters, and throughout training they steadily increase the dependence of the global model on the sensitive attribute. Notably, the bias experienced in federated learning is higher than what parties would otherwise encounter in centralized training on the union of all their data, which indicates that the bias is introduced by the federated learning algorithm itself. Our work calls for auditing group fairness in federated learning and for designing learning algorithms that are robust to bias propagation.

1. INTRODUCTION

Machine learning models can exhibit bias against demographic groups. Previous research has extensively studied how machine learning algorithms can reflect and amplify bias in training data, especially in centralized settings where the data is held by a single party (Hardt et al., 2016; Dwork et al., 2012; Calders et al., 2009; Hashimoto et al., 2018; Zhang et al., 2020; Blum and Stangl, 2020; Lakkaraju et al., 2017). However, in practice, data is commonly owned by multiple parties and cannot be shared due to privacy concerns. Federated learning (FL) provides a promising solution by enabling parties to collaboratively learn a global model without sharing their data. In each round of FL, parties share their local model updates, computed on their private datasets, with a global server that aggregates them to update the global model.

Despite the widespread adoption of FL in applications such as healthcare, recruitment, and loan evaluation (Rieke et al., 2020; Yang et al., 2019), it is not yet fully understood how FL algorithms can magnify bias in training datasets. Recent studies have investigated measuring and mitigating bias in federated learning with respect to a single global distribution (Chu et al., 2021; Zeng et al., 2021a; Hu et al., 2022; Du et al., 2021; Abay et al., 2020; Papadaki et al., 2021; 2022). However, in practice, parties often have heterogeneous data distributions. Evaluating the model's bias with respect to the global distribution does not accurately reflect the fairness of the FL model with respect to parties' local data distributions, which are what matter to end-users. This is the critical problem we address in this paper. Specifically, we investigate the following questions: How does participating in FL affect the bias and fairness of the resulting models compared to models trained in a standalone setting?
Does FL provide parties with the potential fairness benefits of centralized training on the union of their data? Can parties with biased datasets negatively impact the fairness experienced by other parties on their local distributions? How and why does the bias of a small number of parties affect the entire network?

To the best of our knowledge, we provide the first comprehensive analysis of how FL algorithms impact local fairness, based on an empirical study of real-world datasets. We show that FL might not sustain the benefits of collaboration in terms of fairness, in contrast to its accuracy benefits. Specifically, compared with standalone models, we find that a model trained in a centralized setting can be, on average, fairer on local data distributions, whereas FL models trained on the same data can be more biased. This suggests that the FL algorithm itself can introduce new bias into the final model. Furthermore, we demonstrate that FL impacts different parties in different ways. Specifically, we find a strong correlation between a party's fairness gap in the standalone setting and the fairness benefit it obtains from joining FL: parties with a larger bias in the standalone setting (caused by their local data) receive a fairer model from FL. In contrast, FL negatively impacts parties with a smaller bias in the standalone setting, leaving them with a more biased model. We further demonstrate that this is because FL propagates bias: the bias of a few parties can influence all parties in the network, aggravating the fairness problem globally.

Finally, we offer potential explanations for how bias propagates in FL. Specifically, we show that local updates from biased parties increase the dependence between the model's predictions and the sensitive attributes.
This increase is achieved through a norm increase in a small number of parameters (around 6% of the model parameters in some experiments), and it propagates to the global model through aggregation, subsequently impacting all other parties. In addition, we show that the fairness gap of the final model can be controlled by adjusting the values of those parameters. Surprisingly, we find that scaling these parameters can either reduce the model's bias to a small value of 0.05 (on a scale of 0 to 1) with only a 0.6% drop in accuracy, or increase the bias to a large value of 0.96 with an 11% drop in accuracy.
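The parameter-scaling intervention described above can be sketched as follows. This is a minimal NumPy illustration with hypothetical helper names (`top_changed_params`, `scale_params`) and a simple magnitude-growth heuristic; the exact selection procedure in the experiments may differ:

```python
import numpy as np

def top_changed_params(theta_before, theta_after, frac=0.06):
    """Indices of the ~frac fraction of parameters with the largest
    increase in magnitude between two model snapshots."""
    growth = np.abs(theta_after) - np.abs(theta_before)
    k = max(1, int(frac * theta_before.size))
    return np.argsort(growth)[-k:]

def scale_params(theta, idx, alpha):
    """Return a copy of theta with the selected parameters scaled by
    alpha (alpha < 1 dampens their effect, alpha > 1 amplifies it)."""
    out = theta.copy()
    out[idx] = alpha * out[idx]
    return out
```

In this sketch, dampening the selected parameters (alpha < 1) corresponds to the bias reduction reported above, while amplifying them (alpha > 1) increases the fairness gap; the exact trade-off with accuracy depends on the model and dataset.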

2. BACKGROUND

Federated Learning. In this paper, we consider the conventional federated learning (FL) setting (McMahan et al., 2017). The FL framework consists of a network of K parties, where each party k ∈ [K] holds a local dataset D_k of size n_k, sampled from a local data distribution 𝒟_k. The objective of each party is to train a model that minimizes the loss on its local data distribution 𝒟_k. To achieve this goal, FL trains a global model to minimize the average loss across parties, expressed as min_θ (1/N) Σ_{k=1}^{K} n_k L(θ, D_k), where N = Σ_{k=1}^{K} n_k is the total number of local training samples. In each communication round t, a global server sends the current global model to all parties. Each party trains the global model locally on its local dataset and sends the updated local model back to the server. The server then aggregates these local models to obtain the new global model. We consider the case where all parties participate in training in every round, which avoids potential biases arising from non-uniform sampling of participating parties.

Group Fairness. Fairness has a wide variety of meanings in the literature. Group fairness, in particular, requires that the model perform comparably across groups defined by sensitive attributes (e.g., sex). It is now common practice to evaluate discrimination in a model (or system) based on quantitative measurements of group fairness. In light of this, we focus on two widely used group fairness notions: equalized odds (Hardt et al., 2016) and demographic parity (Dwork et al., 2012). To formally define these fairness notions, we assume each data point is associated with a sensitive attribute a ∈ A, and we use X and Y to denote the input features and the true label.
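As a concrete illustration, both group-fairness gaps can be estimated from a model's predictions on a held-out dataset. The sketch below is a minimal NumPy implementation (the function names are ours, for illustration only), assuming binary predictions and a discrete sensitive attribute:

```python
import numpy as np

def demographic_parity_gap(y_pred, a):
    """Max difference in positive-prediction rates Pr(f(X) = +)
    across sensitive groups a in A."""
    rates = [y_pred[a == g].mean() for g in np.unique(a)]
    return max(rates) - min(rates)

def equalized_odds_gap(y_pred, y_true, a):
    """Max over true labels y of the largest cross-group difference
    in Pr(f(X) = + | Y = y)."""
    gap = 0.0
    for y in np.unique(y_true):
        mask = y_true == y
        rates = [y_pred[mask & (a == g)].mean() for g in np.unique(a)]
        gap = max(gap, max(rates) - min(rates))
    return gap

# Toy example: two groups (a = 0, 1), binary predictions.
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
a      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_gap(y_pred, a))      # 0.5
print(equalized_odds_gap(y_pred, y_true, a))  # 0.5
```

Note that this sketch assumes every (group, label) combination is non-empty in the evaluation data; in practice, one would guard against empty cells.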
To measure fairness, we use the fairness gap with respect to equalized odds, defined as ∆_EO(θ, 𝒟) := max_{a,a′∈A, y∈Y} |Pr_𝒟(f_θ(X) = + | A = a, Y = y) − Pr_𝒟(f_θ(X) = + | A = a′, Y = y)|, with an ideal value of zero (perfectly fair). Furthermore, in many applications there exists a favorable prediction from the model (e.g., granting the loan); we take the positive prediction (+) to be the favorable outcome. The demographic parity notion (Dwork et al., 2012) asks the model to give the favorable label to all groups at equal rates. Similarly, the fairness gap with respect to demographic parity is defined as ∆_DP(θ, 𝒟) := max_{a,a′∈A} |Pr_𝒟(f_θ(X) = + | A = a) − Pr_𝒟(f_θ(X) = + | A = a′)|, with an ideal value of zero. In the rest of the paper, we use the fairness gap ∆ (including ∆_EO and ∆_DP) to measure the bias (fairness) of a model: a larger fairness gap means the model is more biased (less fair). We use the terms bias and unfairness interchangeably.

Measuring the impact of FL. Federated learning aims to enhance model performance compared to standalone training and to achieve performance comparable to centralized training. Thus, to evaluate the impact of FL on local fairness, we use centralized training and standalone training as baselines. In standalone training, each party trains a model θ_k independently to minimize the loss on its training data D_k. Note that the standalone model's fairness gap for a party is contributed by

