FEDMES: SPEEDING UP FEDERATED LEARNING WITH MULTIPLE EDGE SERVERS

Abstract

We consider federated learning with multiple wireless edge servers, each with its own local coverage area. We focus on speeding up training in this increasingly practical setup. Our key idea is to utilize the devices located in the overlapping areas between the coverages of edge servers: in the model-downloading stage, a device in an overlapping area receives multiple models from different edge servers, averages the received models, and then updates the result with its local data. Such a device then broadcasts its updated model to multiple edge servers, acting as a bridge that shares the trained models between servers. Even when some edge servers are given biased datasets within their coverages, their training processes can be assisted by adjacent servers through the devices in the overlapping regions. As a result, the proposed scheme does not require costly communication with the central cloud server (located at a higher tier than the edge servers) for model synchronization, significantly reducing the overall training time compared to conventional cloud-based federated learning systems. Extensive experimental results show remarkable performance gains of our scheme compared to existing methods.

1. INTRODUCTION

With the explosive growth in the number of smartphones, wearable devices, and Internet of Things (IoT) sensors, a large portion of the data generated nowadays is collected outside the cloud, especially at distributed end-devices at the edge. Federated learning (McMahan et al., 2017; Konecny et al., 2016a;b; Bonawitz et al., 2019; Li et al., 2019a) is a recent paradigm for this setup, which enables training of a machine learning model in a distributed network while significantly alleviating the privacy concerns of individual devices. However, training requires repeated downloading and uploading of models between the parameter server (PS) and the devices, presenting significant challenges in terms of 1) the communication bottleneck at the PS and 2) the non-IID (i.e., not independent and identically distributed) data across devices (Zhao et al., 2018; Sattler et al., 2019; Li et al., 2019b; Reisizadeh et al., 2019; Jeong et al., 2018).

In federated learning, the PS can be located at the cloud or at the edge (e.g., at small base stations). Most current studies on federated learning consider the former, with the assumption that millions of devices are within the coverage of the PS at the cloud; at every global round, the devices in the system communicate with the cloud-based PS for downloading and uploading the models. However, an inherent limitation of this cloud-based system is the long distance between the devices and the cloud server, which causes significant propagation delay during the model downloading/uploading stages of federated learning (Mao et al., 2017; Nguyen et al., 2019). Specifically, it is reported in (Mao et al., 2017) that the supportable latency (for inference) of cloud-based systems is larger than 100 milliseconds, while edge-based systems have a supportable latency of less than tens of milliseconds. This large delay between the cloud and the devices directly affects the training time of cloud-based federated learning systems.
In order to support latency-sensitive applications (e.g., smart cars) or emergency events (e.g., disaster response by drones) with federated learning, an edge-based system is essential. An issue, however, is that although an edge-based federated learning system can considerably reduce the latency between the PS and the devices, the coverage of an edge server is generally limited in practical systems (e.g., wireless cellular networks); the number of devices within the coverage of a single edge server may be insufficient for training a global model with enough accuracy. Moreover, the limited coverage of a single edge server could contain a biased dataset and thus lead to a biased model after training. In practice, therefore, performing federated learning with the devices of a single edge server would result in significant performance degradation.

Main contributions. To overcome the above practical challenges, we propose FedMes, a novel federated learning algorithm tailored to environments with multiple edge servers (ESs). Our idea is to utilize the devices located in the overlapping areas between the coverages of ESs, which are typical in 5G and beyond systems with dense deployments of ESs. In the model-downloading stage, each ES sends the current model to the devices in its coverage area; in this process, the devices in an overlapping region receive multiple models from different ESs. These devices average the received models and then update the result based on their local data. Each device then sends its updated model to the corresponding ES or ESs, where the received models are aggregated. A high-level description of FedMes is given in Fig. 1.
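The download-average-update-upload flow of one FedMes round can be sketched as follows. This is a minimal toy simulation, not the paper's actual implementation: the topology, the 4-dimensional "model," the device names, and the random stand-in for local SGD are all our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # toy model: a flat parameter vector

# Hypothetical topology: device "k" is covered only by ES 0,
# device "l" sits in the overlap of ES 0 and ES 1, and
# device "m" is covered only by ES 1.
coverage = {"k": [0], "l": [0, 1], "m": [1]}

es_models = [rng.normal(size=dim) for _ in range(2)]  # one model per ES

def local_update(model, lr=0.1):
    """Stand-in for local SGD on the device's private data
    (a random gradient here, purely for illustration)."""
    return model - lr * rng.normal(size=dim)

# --- model-downloading stage ---------------------------------------
# Each device averages the models received from all ESs covering it;
# a non-overlap device simply keeps the single model it received.
received = {d: np.mean([es_models[i] for i in ids], axis=0)
            for d, ids in coverage.items()}

# --- local training ------------------------------------------------
updated = {d: local_update(m) for d, m in received.items()}

# --- model-uploading stage -----------------------------------------
# Each ES aggregates the uploads it hears; device "l" broadcasts to
# both ESs and thereby bridges their models without any cloud server.
new_es_models = []
for i in range(2):
    uploads = [updated[d] for d, ids in coverage.items() if i in ids]
    new_es_models.append(np.mean(uploads, axis=0))
```

Note how device "l" contributes to the aggregation at both ESs, so information from the coverage of ES 0 reaches ES 1 (and vice versa) in a single round.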


For example, suppose that device k is located in the non-overlapped region of ES i while device l is in the overlapped region between ES i and ES j. In conventional federated learning systems, device l participates in the training process of only one of ES i or ES j; in FedMes, by contrast, device l can act as a bridge for sharing the trained models between both ESs. To be specific, the updated model of device k is averaged only at its associated ES i. In the next step, this averaged model is sent to the devices in the coverage area of ES i, including device l. After the local model updates at the devices, device l sends its updated model to both ES i and ES j. From this point of view, even when some training samples are only in the coverage of a specific ES, these data can still assist the training processes of other servers. Hence, the proposed scheme does not require costly communication with the central cloud server (located at a higher tier than the ESs) for model synchronization, significantly reducing the overall training time compared to cloud-based federated learning systems. Compared with a scheme that ignores the overlapping areas, FedMes provides a significant performance gain, especially when the data distributions across the coverages of different servers are non-IID, e.g., when a specific server has a biased dataset within its coverage area. In this non-IID setup in particular, giving more weight to the devices located in the overlapping areas (in each aggregation step at the ESs) can further speed up training. From the service provider's point of view, FedMes does not require any backhaul traffic between the ESs and the cloud server, significantly reducing the communication resources required for federated learning.
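One simple way to realize this weighting is a weighted average at each ES in which uploads from overlap devices receive a larger coefficient. The sketch below is our own illustration under that assumption; the function name `aggregate` and the hyperparameter `gamma` are hypothetical and not notation from the paper.

```python
import numpy as np

def aggregate(models, in_overlap, gamma=2.0):
    """Weighted average at an ES: uploads from devices in overlapping
    regions get weight gamma (> 1), all others get weight 1.
    `gamma` is an illustrative hyperparameter, not a value from the paper."""
    weights = np.array([gamma if o else 1.0 for o in in_overlap])
    weights /= weights.sum()  # normalize so the weights sum to 1
    return sum(w * m for w, m in zip(weights, models))

# Usage: two uploads, the second from an overlap device.
m_k = np.zeros(3)  # upload from a non-overlap device
m_l = np.ones(3)   # upload from an overlap (bridge) device
merged = aggregate([m_k, m_l], in_overlap=[False, True])
```

With `gamma=2.0`, the bridge device's model gets weight 2/3 instead of 1/2, pulling the aggregate toward the information shared across ES coverages; `gamma=1.0` recovers plain averaging.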



Figure 1: FedMes: the proposed federated learning algorithm leveraging multiple edge servers (ESs). The devices in the overlapping areas can act as bridges for sharing the trained models between ESs. When a specific ES is given a biased dataset within its coverage, giving more weight to the devices located in the overlapping regions (in each aggregation step at the ESs) can further speed up training. Our design targets latency-sensitive applications where edge-based federated learning is essential, e.g., when a number of cars/drones should quickly adapt to the current situation through cooperation (federated learning) and make the right decision.

