FOCUS: FAIRNESS VIA AGENT-AWARENESS FOR FEDERATED LEARNING ON HETEROGENEOUS DATA

Anonymous authors
Paper under double-blind review

Abstract

Federated learning (FL) provides an effective collaborative training paradigm that allows local agents to jointly train a global model without sharing their local data, thereby protecting privacy. However, due to the heterogeneous nature of local data, it is challenging to optimize or even define fairness of the trained global model for the agents. For instance, existing work usually takes accuracy equity across agents as fairness in FL. This notion is limited, especially under the heterogeneous setting: it is intuitively "unfair" to require agents with high-quality data (e.g., hospitals with high-resolution data and fine-grained labels) to achieve accuracy similar to those contributing low-quality data (e.g., hospitals with low-resolution data and noisy labels), which may discourage high-quality agents from participating in FL. In this work, we aim to address such limitations and propose a formal fairness definition for FL, fairness via agent-awareness (FAA), which takes the different contributions of heterogeneous agents into account. Under FAA, the performance of agents with high-quality data is not sacrificed merely because a large number of agents contribute low-quality data. In addition, we propose a fair FL training algorithm based on agent clustering (FOCUS) to achieve fairness in FL as measured by FAA. Theoretically, we prove the convergence and optimality of FOCUS under mild conditions for linear and general convex loss functions with bounded smoothness. We also prove that FOCUS always achieves higher fairness in terms of FAA than standard FedAvg under both linear and general convex loss functions. Empirically, we evaluate FOCUS on four datasets, including synthetic data, images, and texts, under different settings, and we show that FOCUS achieves significantly higher fairness in terms of FAA while maintaining similar or even higher prediction accuracy compared with FedAvg and other existing fair FL algorithms.

1. INTRODUCTION

Federated learning (FL) is emerging as a promising approach to enable scalable intelligence over distributed settings such as mobile networks (Lim et al., 2020; Hard et al., 2018). Given the wide adoption of FL, including medical analysis (Sheller et al., 2020; Adnan et al., 2022), recommendation systems (Minto et al., 2021; Anelli et al., 2021), and personal Internet of Things (IoT) devices (Alawadi et al., 2021), ensuring the fairness of the trained global model in FL is of great importance before its large-scale deployment, especially when the data quality and contributions of different agents vary in the heterogeneous setting. In general, fairness is defined as the protection of a specific attribute, and fair FL usually takes the form of equity, meaning that no individual that joins collaborative learning should suffer poor performance due to its identity. Several studies have explored fairness in FL, focusing mainly on the fairness of the final trained model with respect to protected attributes without considering the different contributions of agents (Chu et al., 2021; Hu et al., 2022), or on accuracy parity across agents (Li et al., 2020b; Donahue & Kleinberg, 2022a; Mohri et al., 2019). Some works have considered properties of local agents, such as local data properties (Zhang et al., 2020; Kang et al., 2019) and data size (Donahue & Kleinberg, 2022b). However, fairness analysis in FL under heterogeneous data distributions is still lacking. Thus, in this paper, we ask: What notion of fairness in FL can take the different contributions of heterogeneous local agents into account? Can we enhance the fairness of FL with advanced training algorithms? To better understand fairness in FL under heterogeneous data, in this work we aim to define and enhance fairness by explicitly considering the different contributions of heterogeneous agents.
In particular, for FL trained with the standard FedAvg protocol (McMahan et al., 2017), if we denote the data of agent $e$ as $D_e$ with size $n_e$ and the total number of data points as $n$, the final trained global model aims to minimize the loss with respect to the global distribution $P = \sum_{e=1}^{E} \frac{n_e}{n} D_e$, where $E$ is the total number of agents. In practice, some local agents may have low-quality data (e.g., free riders), so it is intuitively "unfair" to train the final model with respect to such a global distribution over all agents, which sacrifices the performance of agents with high-quality data. For example, consider FL applications for medical analysis: some hospitals have high-resolution medical data and fine-grained labels, which cost a large amount of money to collect from advanced equipment and to crowdsource for labeling. In contrast, other hospitals may have low-resolution medical data and noisy labels. In such a setting, high-quality agents may not be willing to participate in collaborative learning with low-quality agents, because they could have achieved higher accuracy through standalone local training. Therefore, a proper fairness notion is important to encourage agents to participate in FL and to ensure fairness. In this paper, we define fairness via agent-awareness in FL (FAA) as $\mathrm{FAA}(\{\theta_e\}_{e \in [E]}) = \max_{e_1, e_2 \in [E]} \big( \mathcal{E}_{e_1}(\theta_{e_1}) - \mathcal{E}_{e_2}(\theta_{e_2}) \big)$, measured by the maximal excess-risk difference between any pair of agents $e_1, e_2 \in [E]$. The excess risk of each agent is calculated as $\mathcal{E}_e(\theta_e) = L_e(\theta_e) - \min_{\theta^*} L_e(\theta^*)$, which is the loss of agent $e$ evaluated on the FL model $\theta_e$ minus the Bayes optimal error on the local data distribution (Opper & Haussler, 1991). For each agent, a lower excess risk $\mathcal{E}_e(\theta_e)$ indicates more gain from the FL model $\theta_e$ with respect to the local distribution, because its loss $L_e(\theta_e)$ is closer to its Bayes optimal error.
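As a concrete illustration, both quantities above can be computed directly from per-agent losses. The following is a minimal sketch; the function names and toy loss values are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of the FAA metric: the maximal excess-risk gap between agents.
# All names and toy numbers below are illustrative assumptions.

def excess_risk(fl_loss, bayes_loss):
    """E_e(theta_e) = L_e(theta_e) - min_theta* L_e(theta*)."""
    return fl_loss - bayes_loss

def faa(fl_losses, bayes_losses):
    """FAA = max_{e1,e2} (E_{e1} - E_{e2}), i.e., max risk minus min risk."""
    risks = [excess_risk(l, b) for l, b in zip(fl_losses, bayes_losses)]
    return max(risks) - min(risks)

# Two agents with different raw losses but equal gains from FL: FAA is ~0,
# i.e., perfectly fair under FAA even though their accuracies differ.
print(round(faa([0.30, 0.10], [0.25, 0.05]), 6))  # -> 0.0
```

Note that a pure accuracy-parity metric would flag this pair of agents as unfair (their raw losses differ by 0.2), whereas FAA recognizes that each gained equally from FL relative to its own Bayes optimal error.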
Notably, reducing FAA enforces the equity of excess risks among agents, following the philosophy that each agent should "gain the same" from participating in FL. Therefore, lower FAA indicates stronger fairness for FL. Based on our fairness definition FAA, we then propose a fair FL algorithm based on agent clustering (FOCUS) to improve the fairness of FL. Specifically, we first cluster the local agents based on their data distributions and then train a model for each cluster. At inference time, the final prediction is the weighted aggregation of the predictions of the models trained on the corresponding clustered local data. Theoretically, we prove that the final converged stationary point of FOCUS is exponentially close to the optimal cluster assignment under mild conditions. In addition, we prove that the fairness of FOCUS in terms of FAA is strictly higher than that of standard FedAvg under both linear models and general convex losses. Empirically, we evaluate FOCUS on four datasets, including synthetic data, images, and texts, and show that FOCUS achieves higher fairness measured by FAA than FedAvg and SOTA fair FL algorithms while maintaining similar or even higher prediction accuracy.

Technical contributions. In this work, we define and improve FL fairness in heterogeneous settings by considering the different contributions of heterogeneous local agents. We make contributions on both theoretical and empirical fronts.
• We formally define fairness via agent-awareness (FAA) in FL based on agent-level excess risks to measure fairness in FL, explicitly taking the heterogeneous nature of local agents into account.
• We propose a fair FL algorithm via agent clustering (FOCUS) to improve fairness measured by FAA, especially in the heterogeneous setting. We prove the convergence rate and optimality of FOCUS under linear models and general convex losses.
• We prove that FOCUS achieves stronger fairness measured by FAA compared with FedAvg for both linear models and general convex losses.
• Empirically, we compare FOCUS with FedAvg and SOTA fair FL algorithms on four datasets, including synthetic data, images, and texts, under heterogeneous settings. We show that FOCUS indeed achieves stronger fairness measured by FAA while maintaining similar or even higher prediction accuracy on all datasets.
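To make the cluster-then-train loop concrete, here is a minimal EM-style sketch in the spirit of FOCUS on scalar least-squares models: each agent is assigned to the cluster whose model fits its data best, and each cluster model takes a FedAvg-style step on its members' data. The setup, names, and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
# Toy EM-style clustered FL round in the spirit of FOCUS, for scalar
# least-squares models y ~ theta * x. Illustrative sketch only.

def local_loss(theta, data):
    """Mean squared error of a scalar model theta on one agent's data."""
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)

def focus_round(cluster_models, agents_data, lr=0.1):
    # E-step: each agent joins the cluster whose model fits it best.
    assign = [min(range(len(cluster_models)),
                  key=lambda k: local_loss(cluster_models[k], data))
              for data in agents_data]
    # M-step: one gradient step per cluster on its members' pooled data.
    new_models = list(cluster_models)
    for k in range(len(cluster_models)):
        members = [d for d, a in zip(agents_data, assign) if a == k]
        if not members:
            continue
        n = sum(len(d) for d in members)
        grad = sum(2 * (cluster_models[k] * x - y) * x
                   for d in members for x, y in d) / n
        new_models[k] = cluster_models[k] - lr * grad
    return new_models, assign

# Two heterogeneous agents: one generated by y = x, one by y = -x.
models = [0.5, -0.5]
agents = [[(1.0, 1.0), (2.0, 2.0)], [(1.0, -1.0), (2.0, -2.0)]]
for _ in range(100):
    models, assign = focus_round(models, agents)
print(assign)                          # -> [0, 1]
print([round(m, 3) for m in models])   # -> [1.0, -1.0]
```

Each agent ends up served by a model matched to its own distribution, so both reach near-zero excess risk; a single FedAvg model on the pooled data would instead sit near theta = 0 and leave both agents with high excess risk.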



2. RELATED WORK

Federated Learning. There have been several studies exploring fairness in FL. Li et al. (2020b) first define agent-level fairness by considering accuracy equity across agents, and achieve fairness by assigning agents with worse performance higher aggregation weights during training. However, such a definition of fairness fails to capture the heterogeneous nature of local agents. Mohri et al. (2019) pursue accuracy parity by improving the performance of the worst-performing agent. Wang et al. (2021) propose to mitigate conflicting gradients from local agents to enhance fairness. Instead of pursuing fairness with a single global model, Li et al. (2021) propose to train a personalized model for each agent to achieve accuracy equity across the personalized models.

