FEW-ROUND LEARNING FOR FEDERATED LEARNING

Anonymous

Abstract

Federated learning (FL) presents an appealing opportunity for individuals who are willing to make their private data available for building a communal model without revealing their data contents to anyone else. A central issue that may limit the widespread adoption of FL is the significant communication resources required to exchange updated model parameters between the server and the individual clients over many communication rounds. In this work, we focus on preparing an initial model that limits the number of model-exchange rounds in FL to some small fixed number R. We assume that the tasks of the clients participating in FL are not known during this preparation stage. Following the spirit of meta-learning for few-shot learning, we adopt a meta-learning strategy to prepare the initial model so that, once this meta-training phase is over, only R rounds of FL are needed to produce a model that satisfies the needs of all participating clients. Compared to meta-training approaches that optimize personalized local models at distributed devices, our method better handles the potential lack of data variability at individual nodes. Extensive experimental results indicate that meta-training geared to few-round learning provides large performance improvements over various baselines.

1. INTRODUCTION

Major machine learning applications, including computer vision and natural language processing, are currently supported by central data centers equipped with massive computing resources and ample training data. At the same time, growing amounts of valuable data are being collected at distributed edge nodes such as mobile phones, wearable devices, and smart vehicles/drones. Directly sending these local data to a central server for model training raises significant privacy concerns. To address this issue, an emerging paradigm known as federated learning (McMahan et al., 2017; Konecny et al., 2016; Bonawitz et al., 2019; Li et al., 2019; Zhao et al., 2018; Sattler et al., 2019; Reisizadeh et al., 2019), in which uploading local data to the server is not necessary, has been actively researched. Unfortunately, federated learning (FL) generally requires numerous communication rounds between the server and the distributed nodes (or clients) for model exchange in order to achieve a desired level of prediction performance. This makes the deployment of FL a significant challenge in bandwidth-limited or time-sensitive applications. Especially in real-time applications (e.g., connected vehicles or drones), where the model must quickly adapt to dynamically evolving environments, the requirement of many communication rounds becomes a major bottleneck. Moreover, the considerable time and computational resources required for training place a high burden on individual clients wishing to participate in FL. Excessive communication rounds in FL are a particular concern in light of the increased communication burden for guaranteeing full privacy via secure aggregation (Bonawitz et al., 2017). To combat this limitation, we focus on preparing an initial model that can yield a high-accuracy global model within only a few communication rounds between the server and the clients.
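To make the per-round communication cost concrete, a single FL round in the style of federated averaging (McMahan et al., 2017) can be sketched as: the server broadcasts the current model, each client trains locally on its own data, and the server averages the returned models weighted by client dataset size. The toy linear model, function names, and hyperparameters below are illustrative assumptions for this sketch, not the implementation used in this paper:

```python
import numpy as np

def local_update(weights, data, lr=0.1, epochs=1):
    """One client's local gradient steps on a toy linear regression model."""
    w = weights.copy()
    X, y = data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean-squared error
        w -= lr * grad
    return w

def fedavg_round(global_w, client_data):
    """One communication round: broadcast, local training, size-weighted averaging."""
    updates, sizes = [], []
    for data in client_data:
        updates.append(local_update(global_w, data))  # client never uploads raw data
        sizes.append(len(data[1]))
    sizes = np.array(sizes, dtype=float)
    # Aggregate the local models, weighting each by its client's dataset size
    return np.sum([s * w for s, w in zip(sizes / sizes.sum(), updates)], axis=0)
```

Each call to `fedavg_round` costs one full model exchange with every participating client, which is why reducing the number of such rounds to a small fixed R is the focus of this work.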
Following the spirit of meta-learning for few-shot learning, we meta-train the model via episodic training to mimic, and tee up for, few-round FL. Meta-training enables reliable prediction even when the data sample at hand does not share the same characteristics as the dataset the given model was trained on. In contrast to existing meta-training attempts that initialize a model for further personalized optimization at local devices, our approach takes advantage of FL's ability to exploit varying data distributions across clients. A high-level description of our idea is depicted in Fig. 1. Given a small target value R, our goal is to create an initial model that can quickly adapt, within R rounds of FL, to a set of clients with tasks not seen during meta-training. As long as the tasks are different

