FEDREP: A BYZANTINE-ROBUST, COMMUNICATION-EFFICIENT AND PRIVACY-PRESERVING FRAMEWORK FOR FEDERATED LEARNING

Anonymous authors
Paper under double-blind review

Abstract

Federated learning (FL) has recently become a hot research topic, in which Byzantine robustness, communication efficiency and privacy preservation are three important aspects. However, the tension among these three aspects makes it hard to take all of them into account simultaneously. In view of this challenge, we theoretically analyze the conditions that a communication compression method should satisfy to be compatible with existing Byzantine-robust methods and privacy-preserving methods. Motivated by the analysis results, we propose a novel communication compression method called consensus sparsification (ConSpar). To the best of our knowledge, ConSpar is the first communication compression method designed to be compatible with both Byzantine-robust methods and privacy-preserving methods. Based on ConSpar, we further propose a novel FL framework called FedREP, which is Byzantine-robust, communication-efficient and privacy-preserving. We theoretically prove the Byzantine robustness and the convergence of FedREP. Empirical results show that FedREP can significantly outperform communication-efficient privacy-preserving baselines. Furthermore, compared with Byzantine-robust communication-efficient baselines, FedREP can achieve comparable accuracy with the extra advantage of privacy preservation.

1. INTRODUCTION

Federated learning (FL), in which participants (also called clients) collaborate to train a learning model while keeping their data privately owned, has recently become a hot research topic (Konečný et al., 2016; McMahan & Ramage, 2017). Compared to traditional data-center based distributed learning (Haddadpour et al., 2019; Jaggi et al., 2014; Lee et al., 2017; Lian et al., 2017; Shamir et al., 2014; Sun et al., 2018; Yu et al., 2019a; Zhang & Kwok, 2014; Zhao et al., 2017; 2018; Zhou et al., 2018; Zinkevich et al., 2010), service providers have less control over clients in FL applications, and the network is usually less stable and has smaller bandwidth. Furthermore, participants also take the risk of privacy leakage in FL if privacy-preserving methods are not used. Consequently, Byzantine robustness, communication efficiency and privacy preservation have become three important aspects of FL methods (Kairouz et al., 2021) and have attracted much attention in recent years.

Byzantine robustness. In FL applications, failures in clients or network transmission may not be discovered and resolved in time (Kairouz et al., 2021). Moreover, some clients may be attacked by an adversarial party and purposely send incorrect or even harmful information. Clients in failure or under attack are also called Byzantine clients. There are mainly three ways to obtain robustness against Byzantine clients, known as redundant computation, server validation and robust aggregation, respectively. Redundant computation methods (Chen et al., 2018; Konstantinidis & Ramamoorthy, 2021; Rajput et al., 2019) require different clients to compute gradients for the same training instances. These methods are mostly designed for traditional data-center based distributed learning and are unavailable in FL due to the privacy principle. In server validation methods (Xie et al., 2019b; 2020b), the server validates clients' updates based on a public dataset.
However, the performance of server validation methods depends on the quantity and quality of the public training instances, and in many scenarios it is hard to obtain a large-scale high-quality public dataset. The third way is to replace mean aggregation on the server with robust aggregation (Alistarh et al., 2018; Bernstein et al., 2019; Blanchard et al., 2017; Chen et al., 2017; Ghosh et al., 2020; Karimireddy et al., 2021; Li et al.,
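To make the robust-aggregation idea concrete, the following is a minimal sketch of one classical robust aggregator, the coordinate-wise median (in the spirit of Chen et al., 2017), contrasted with plain mean aggregation. This is an illustrative example only, not the aggregator used by FedREP; all names and the toy update values are our own.

```python
from statistics import median

def coordinate_wise_median(updates):
    """Robust aggregation sketch: aggregate client updates by taking the
    median of each coordinate across clients. Unlike the mean, the median
    is insensitive to a minority of arbitrarily corrupted (Byzantine)
    updates. `updates` is a list of equal-length vectors, one per client."""
    dim = len(updates[0])
    return [median(u[i] for u in updates) for i in range(dim)]

def mean_aggregation(updates):
    """Plain mean aggregation, which a single Byzantine client can skew."""
    dim = len(updates[0])
    return [sum(u[i] for u in updates) / len(updates) for i in range(dim)]

# Three honest clients send updates near the true gradient [1, 1];
# one Byzantine client sends an extreme value.
honest = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1]]
byzantine = [[1e6, -1e6]]
all_updates = honest + byzantine

agg_median = coordinate_wise_median(all_updates)  # stays near [1, 1]
agg_mean = mean_aggregation(all_updates)          # dragged far from [1, 1]
```

Here the median aggregate remains close to the honest gradient while the mean is pulled arbitrarily far by a single attacker, which is the basic failure mode that robust aggregation rules address.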

