COMMUNICATION-EFFICIENT AND DRIFT-ROBUST FEDERATED LEARNING VIA ELASTIC NET

Abstract

Federated learning (FL) is a distributed method to train a global model over a set of local clients while keeping data localized. It reduces privacy and security risks but faces important challenges, including expensive communication costs and client drift. To address these issues, we propose FedElasticNet, a communication-efficient and drift-robust FL framework leveraging the elastic net. It repurposes the two elastic net regularizers (i.e., ℓ1 and ℓ2 penalties on the local model updates): (1) the ℓ1-norm regularizer sparsifies the local updates to reduce the communication costs, and (2) the ℓ2-norm regularizer resolves the client drift problem by limiting the impact of local updates that drift due to data heterogeneity. FedElasticNet is a general framework for FL; hence, without additional costs, it can be integrated into prior FL techniques, e.g., FedAvg, FedProx, SCAFFOLD, and FedDyn. We show that our framework effectively resolves the communication cost and client drift problems simultaneously.

1. INTRODUCTION

Federated learning (FL) is a collaborative method that allows many clients to contribute individually to training a global model by sharing local models rather than private data. Each client has a local training dataset, which it does not want to share with the global server. Instead, each client computes an update to the current global model maintained by the server, and only this update is communicated. FL significantly reduces privacy and security risks (McMahan et al., 2017; Li et al., 2020a), but it faces crucial challenges that make the federated setting distinct from other classical problems (Li et al., 2020a), such as expensive communication costs and client drift caused by heterogeneous local training datasets and heterogeneous systems (McMahan et al., 2017; Li et al., 2020a; Konečnỳ et al., 2016a; b). Communicating models is a critical bottleneck in FL, in particular when the federated network comprises a massive number of devices (Bonawitz et al., 2019; Li et al., 2020a; Konečnỳ et al., 2016b). In such a scenario, communication in the federated network may be many orders of magnitude slower than local computation because of limited communication bandwidth and device power (Li et al., 2020a). To reduce this communication cost, several strategies have been proposed (Konečnỳ et al., 2016b; Li et al., 2020a). In particular, Konečnỳ et al. (2016b) proposed several methods to form structured local updates and approximate them, e.g., subsampling and quantization. Reisizadeh et al. (2020) and Xu et al. (2020) also proposed efficient quantization methods for FL to reduce the communication cost. Moreover, since the local clients' datasets are in general heterogeneous, models trained on each client's local data are inconsistent with the global model that minimizes the global empirical loss (Karimireddy et al., 2020; Malinovskiy et al., 2020; Acar et al., 2021). This issue is referred to as the client drift problem.
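As a concrete illustration of sparsifying local updates to save communication, the ℓ1 proximal operator (soft-thresholding) zeroes out small entries of an update so that only the surviving (index, value) pairs need to be uploaded. The sketch below is illustrative only, not the exact scheme of any cited work; the threshold `lam` and the toy update are arbitrary assumptions.

```python
import numpy as np

def soft_threshold(update, lam):
    """Proximal operator of the l1 norm: shrinks each entry toward zero and
    zeroes out entries with magnitude below lam, sparsifying the update."""
    return np.sign(update) * np.maximum(np.abs(update) - lam, 0.0)

# Toy local update: after thresholding, most entries are exactly zero, so the
# client can transmit only the nonzero entries instead of the dense vector.
rng = np.random.default_rng(0)
update = rng.normal(scale=0.01, size=1000)
sparse_update = soft_threshold(update, lam=0.01)
print("nonzeros:", np.count_nonzero(sparse_update), "/", update.size)
```

In practice the threshold trades off upload size against the fidelity of the aggregated update; larger `lam` yields sparser, cheaper, but noisier updates.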
In order to resolve the client drift problem, FedProx (Li et al., 2020b) added a proximal term to the local objective function to regulate local model updates. Karimireddy et al. (2020) proposed the SCAFFOLD algorithm, which transfers both model updates and control variates to resolve the client drift problem. FedDyn (Acar et al., 2021) dynamically regularizes the local objective functions to the same end. Unlike most prior works, which focus on either the communication cost problem or the client drift problem, we propose a technique that effectively resolves both problems simultaneously.
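To make the role of the two penalties concrete, the following minimal sketch runs proximal-gradient local training on a toy quadratic loss with an ℓ2 penalty on the local update (pulling the client back toward the global model, against drift) and an ℓ1 penalty (sparsifying the uploaded update). The loss, step size, and penalty weights (`lr`, `lam1`, `lam2`) are hypothetical choices for illustration, not FedElasticNet's exact algorithm or hyperparameters.

```python
import numpy as np

def local_step(w, w_global, grad_loss, lr=0.1, lam1=0.01, lam2=0.1):
    """One proximal-gradient step on
    f(w) + lam1 * ||w - w_global||_1 + (lam2 / 2) * ||w - w_global||_2^2."""
    # Gradient step on the smooth part: the loss gradient plus the l2 term,
    # which pulls w back toward the global model to limit client drift.
    g = grad_loss(w) + lam2 * (w - w_global)
    w = w - lr * g
    # Proximal (soft-thresholding) step for the l1 term on the local update
    # d = w - w_global, which sparsifies what the client must upload.
    d = w - w_global
    d = np.sign(d) * np.maximum(np.abs(d) - lr * lam1, 0.0)
    return w_global + d

# Toy quadratic local loss f(w) = 0.5 * ||w - w_star||^2 with a
# client-specific optimum w_star, mimicking a heterogeneous local dataset.
w_star = np.array([1.0, -1.0, 0.0, 0.0])
grad_loss = lambda w: w - w_star

w_global = np.zeros(4)
w = w_global.copy()
for _ in range(100):
    w = local_step(w, w_global, grad_loss)
print("local update:", w - w_global)
```

The ℓ2 term keeps the local optimum closer to `w_global` than the unregularized client optimum `w_star`, while the ℓ1 term leaves the coordinates the local loss does not care about exactly zero, so the uploaded update is sparse.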

