COMMUNICATION-COMPUTATION EFFICIENT SECURE AGGREGATION FOR FEDERATED LEARNING

Anonymous

Abstract

Federated learning has been spotlighted as a way to train neural network models using data distributed over multiple clients without the need to share private data. Unfortunately, it has been shown that data privacy cannot be fully guaranteed, since adversaries may be able to extract certain information about the local data from the model parameters transmitted during federated learning. A recent solution based on the secure aggregation primitive enables privacy-preserving federated learning, but at the expense of significant extra communication and computational resources. In this paper, we propose communication-computation efficient secure aggregation, which reduces the amount of communication and computational resources by a factor of at least n/log n relative to the existing secure solution without sacrificing data privacy, where n is the number of clients. The key idea behind the suggested scheme is to design the topology of the secret-sharing nodes (denoted by the assignment graph G) as a sparse random graph instead of the complete graph used in the existing solution. We first obtain a sufficient condition on G to guarantee reliable and private federated learning. We then suggest using an Erdős-Rényi graph as G and provide theoretical guarantees on the reliability and privacy of the proposed scheme. Through extensive real-world experiments, we demonstrate that our scheme, using only 50% of the resources required by the conventional scheme, maintains virtually the same levels of reliability and data privacy in practical federated learning systems.
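To give intuition for the resource saving claimed above, the following sketch contrasts the number of pairwise secret-sharing channels in the complete graph (used by conventional secure aggregation) with those in a sparse Erdős-Rényi graph G(n, p). The choice p = 2 log n / n is a hypothetical illustration of logarithmic-degree scaling, not the paper's exact parameter setting.

```python
import math
import random

def erdos_renyi_edges(n: int, p: float, seed: int = 0) -> list[tuple[int, int]]:
    """Sample the edge set of G(n, p): each pair of clients is
    connected (i.e., shares secrets) independently with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]

n = 100
p = 2 * math.log(n) / n               # illustrative O(log n / n) edge probability
sparse = erdos_renyi_edges(n, p)      # sparse assignment graph G
complete = n * (n - 1) // 2           # edges in the complete graph of plain SA
print(len(sparse), complete)          # the sparse graph has far fewer channels
```

With n = 100 the complete graph needs 4,950 pairwise channels, while the sparse graph needs on the order of n log n, which is where the n/log n reduction factor comes from.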

1. INTRODUCTION

Federated learning (McMahan et al., 2017) has been considered a promising framework for training models in a decentralized manner without explicitly sharing local private data. This framework is especially useful for predictive models that learn from private distributed data, e.g., healthcare services based on medical data distributed over multiple organizations (Brisimi et al., 2018; Xu & Wang, 2019) and text prediction based on the messages of distributed clients (Yang et al., 2018; Ramaswamy et al., 2019). In the federated learning (FL) setup, each device contributes to the global model update by transmitting only its local model; the private data is not shared across the network, which makes FL highly attractive (Kairouz et al., 2019; Yang et al., 2019). Unfortunately, FL remains vulnerable to adversarial attacks that aim at data leakage. Specifically, the local model transmitted by a device contains extensive information on its training data, and an eavesdropper can estimate the data owned by the target device (Fredrikson et al., 2015; Shokri et al., 2017; Melis et al., 2019). Motivated by this issue, the authors of (Bonawitz et al., 2017) suggested secure aggregation (SA), which integrates cryptographic primitives into the FL framework to protect data privacy. However, SA requires significant additional communication and computational resources to guarantee privacy. In particular, the communication and computation burden of SA grows as a quadratic function of the number of clients, which limits the scalability of SA.

Contributions. We propose communication-computation efficient secure aggregation (CCESA), which maintains the reliability and data privacy of federated learning with reduced communication and computational resources compared to conventional SA. Our basic idea is illustrated in Fig. 1.
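The secure aggregation primitive of Bonawitz et al. (2017) referenced above rests on pairwise masks that cancel when the server sums the clients' updates. The toy sketch below illustrates only this cancellation property with scalar integer updates; a real implementation would derive the masks from key agreement and a cryptographic PRG, and would add secret sharing to tolerate dropouts. All names here (`mask_update`, the seed dictionary) are illustrative, not the paper's API.

```python
import random

def mask_update(client: int, update: int,
                pair_seeds: dict[tuple[int, int], int]) -> int:
    """Mask a client's update with pairwise one-time pads: for each
    pair (u, v), client u adds the shared seed and client v subtracts
    it, so the pads cancel in the server-side aggregate."""
    masked = update
    for (u, v), s in pair_seeds.items():
        if client == u:
            masked += s
        elif client == v:
            masked -= s
    return masked

n = 4
updates = [10, 20, 30, 40]                    # toy scalar "model updates"
rng = random.Random(1)
seeds = {(u, v): rng.randrange(1, 1_000_000)  # one shared seed per client pair
         for u in range(n) for v in range(u + 1, n)}
masked = [mask_update(c, updates[c], seeds) for c in range(n)]
print(sum(masked) == sum(updates))            # True: masks cancel in the sum
```

In conventional SA every one of the n(n-1)/2 client pairs maintains such a shared seed; the sparse assignment graph proposed in this paper keeps the same cancellation structure while retaining only a small fraction of these pairs.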

