ON THE EFFICACY OF SERVER-AIDED FEDERATED LEARNING AGAINST PARTIAL CLIENT PARTICIPATION

Anonymous

Abstract

Although federated learning (FL) has become a prevailing distributed learning framework in recent years due to its benefits in scalability and privacy, many significant challenges remain in FL system design. Notably, most existing works in the current FL literature assume either full client participation or uniformly distributed client participation. Unfortunately, this idealistic assumption rarely holds in practice. It has been frequently observed that some clients may never participate in FL training (a.k.a. partial/incomplete participation) due to a variety of system heterogeneity factors. To mitigate the impact of partial client participation, an increasingly popular approach in practical FL systems is the server-aided federated learning (SA-FL) framework, where the server is equipped with an auxiliary dataset. However, although SA-FL has been empirically shown to be effective in addressing the partial client participation problem, a theoretical understanding of SA-FL is still lacking. Worse yet, even the ramifications of partial client participation in conventional FL are not yet clearly understood. These theoretical gaps motivate us to rigorously investigate SA-FL. To this end, we first reveal that conventional FL is not PAC-learnable under partial client participation in the worst case, which advances our understanding of conventional FL. We then show that the PAC-learnability of FL with partial client participation can indeed be revived by SA-FL, which theoretically justifies the use of SA-FL for the first time. Lastly, to make SA-FL communication-efficient, we propose the SAFARI (server-aided federated averaging) algorithm, which enjoys convergence guarantees and the same level of communication efficiency and privacy as state-of-the-art FL.

1. INTRODUCTION

Since the seminal work of McMahan et al. (2017), federated learning (FL) has emerged as a powerful distributed learning paradigm that enables a large number of clients (e.g., edge devices) to collaboratively train a model under a central server's coordination. However, as FL has gained popularity, it has also become apparent that FL faces a key challenge unseen in traditional distributed learning in datacenter settings: system heterogeneity. Generally speaking, system heterogeneity in FL arises from the vastly different computation and communication capabilities of clients (computational power, communication capacity, drop-out rate, etc.). Studies have shown that system heterogeneity can affect client participation in highly non-trivial ways and severely degrade the learning performance of FL algorithms (Bonawitz et al., 2019; Yang et al., 2021a). For example, Yang et al. (2021a) showed that more than 30% of clients never participate in FL, while only 30% of the clients contribute 81% of the total computation, even when the server samples clients uniformly. Exacerbating the problem is the fact that a client's status can be unstable and time-varying due to the aforementioned computation and communication constraints. To mitigate the impact of partial client participation, an approach called server-aided federated learning (SA-FL) has been increasingly adopted in practical FL systems in recent years (see, e.g., (Zhao et al., 2018; Wang et al., 2021b)). The basic idea of SA-FL is to equip the server with a small auxiliary dataset that approximately mimics the population distribution, so that the distribution deviation induced by partial client participation can be corrected. To date, although SA-FL has been empirically shown to be quite effective in addressing the partial client participation problem in practice, a theoretical understanding of SA-FL is still lacking.
This gap motivates us to rigorously investigate the efficacy of SA-FL against partial client participation in this paper.
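To make the SA-FL idea concrete, the following is a minimal sketch of one training round: participating clients run local SGD and the server averages their models (as in FedAvg), then takes a corrective gradient step on its small auxiliary dataset. This is an illustrative toy on a noise-free least-squares problem, not the paper's SAFARI algorithm; the function names (`sa_fl_round`, `local_sgd`) and the specific form of the server-side correction step are our own assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.05, steps=5):
    # Client-side local training: a few gradient steps on the
    # least-squares loss (1/2n)||Xw - y||^2.
    w = w.copy()
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

def sa_fl_round(w, client_data, server_data, lr=0.05, server_steps=1):
    # One hypothetical SA-FL round: average the models returned by the
    # participating clients (FedAvg aggregation), then correct the
    # aggregate with gradient steps on the server's auxiliary dataset.
    client_models = [local_sgd(w, X, y, lr) for X, y in client_data]
    w = np.mean(client_models, axis=0)
    Xs, ys = server_data
    for _ in range(server_steps):
        w -= lr * Xs.T @ (Xs @ w - ys) / len(ys)  # server-side correction
    return w

# Toy setup: true model w* = [1, -2]. Only a skewed subset of clients
# participates (shifted feature distribution), while the server's small
# auxiliary dataset mimics the (unshifted) population distribution.
w_star = np.array([1.0, -2.0])

def make_data(n, shift=0.0):
    X = rng.normal(shift, 1.0, size=(n, 2))
    return X, X @ w_star  # noise-free labels

clients = [make_data(50, shift=2.0) for _ in range(3)]
server = make_data(50)

w = np.zeros(2)
for _ in range(100):
    w = sa_fl_round(w, clients, server)
print(np.round(w, 2))  # converges close to w* = [1, -2]
```

The server-side step runs on data that approximately matches the population distribution, which is what lets SA-FL counteract the deviation introduced when only a biased subset of clients participates.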

