FEDGC: AN ACCURATE AND EFFICIENT FEDERATED LEARNING ALGORITHM UNDER GRADIENT CONSTRAINT FOR HETEROGENEOUS DATA

Abstract

Federated Learning (FL) is an important paradigm in large-scale distributed machine learning, which enables multiple clients to jointly learn a unified global model without transmitting their local data to a central server. FL has attracted growing attention in many real-world applications, such as multi-center cardiovascular disease diagnosis and autonomous driving. In practice, the data across clients are always heterogeneous, i.e., not independently and identically distributed (Non-IID), causing the local models to suffer from catastrophic forgetting of the initial (or global) model. To mitigate this forgetting issue, existing FL methods may require additional regularization terms or generate pseudo data, resulting in 1) limited accuracy; 2) long training time and slow convergence, which is problematic for real-time applications; and 3) high communication cost. In this work, an accurate and efficient Federated Learning algorithm under Gradient Constraints (FedGC) is proposed, which provides three advantages: i) High accuracy is achieved by the proposed Client-Gradient-Constraint based projection method (CGC), which alleviates the forgetting issue that occurs in clients, and the proposed Server-Gradient-Constraint based projection method (SGC), which effectively aggregates the gradients of clients; ii) Short training time and a fast convergence rate are enabled by the proposed fast Pseudo-gradient-based mini-batch Gradient Descent (PGD) method together with SGC; iii) Low communication cost follows from the fast convergence rate and from the fact that only gradients need to be transmitted between the server and clients. In the experiments, four real-world image datasets with three Non-IID types are evaluated, and five popular FL methods are used for comparison. The experimental results demonstrate that our FedGC not only significantly improves the accuracy and convergence rate on Non-IID data, but also drastically decreases the training time.
Compared to the state-of-the-art FedReg, our FedGC improves accuracy by up to 14.28%, speeds up local training by 15.5 times, and reduces communication cost by 23%.
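The core mechanism named in the abstract is a gradient-constraint based projection. The exact CGC/SGC formulations are not given in this excerpt, so the following is only a minimal, hypothetical sketch of the generic idea behind such constraints: when a client's local gradient conflicts with a reference (e.g., global) gradient direction, the conflicting component is projected away so the update does not undo previously aggregated knowledge. The function name and the choice of reference gradient are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def project_conflicting(g, g_ref):
    """Illustrative gradient-constraint projection (not the paper's CGC/SGC).

    If the local gradient g points against the reference gradient g_ref
    (negative inner product), remove the conflicting component by
    projecting g onto the half-space where g . g_ref >= 0.
    """
    dot = g @ g_ref
    if dot < 0:
        # Subtract the component of g along g_ref that causes the conflict.
        g = g - (dot / (g_ref @ g_ref)) * g_ref
    return g

# A conflicting gradient gets its negative component along g_ref removed;
# a non-conflicting gradient passes through unchanged.
g_out = project_conflicting(np.array([-1.0, 2.0]), np.array([1.0, 0.0]))
```

After projection, the constraint `g_out @ g_ref >= 0` holds, so a small step along the projected gradient does not increase the reference objective to first order.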



Federated Learning (FL) enables multiple participants (clients) to collaboratively train a global model while keeping the training data local, motivated by concerns such as data privacy and real-time processing. FL has attracted growing attention in many real-world applications, such as multi-center cardiovascular disease diagnosis (Linardos et al., 2022), Homomorphic Encryption-based healthcare systems (Zhang et al., 2022), FL-based real-time autonomous driving (Zhang et al., 2021a; Nguyen et al., 2022), FL-based privacy-preserving vehicular navigation (Kong et al., 2021), and FL-based automatic trajectory prediction (Majcherczyk et al., 2021; Wang et al., ...).

However, in practice, the data across clients are always heterogeneous, i.e., not independently and identically distributed (Non-IID) (Sattler et al., 2020; Zhang et al., 2021b), which hinders the optimization convergence and generalization performance of FL in real-world applications. At each communication round, a client first receives the aggregated knowledge of all clients from the server and then locally trains its model on its own data. If the data are Non-IID across clients, the local optimum of each client can be far from the others after local training, and the initial model parameters received from the server will be overridden. Hence, the clients will forget the knowledge initially received from the server.
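The client-drift behavior behind this forgetting issue can be shown with a toy one-parameter example (purely illustrative; not the paper's experimental setup, and all names here are hypothetical): two clients with different local optima each pull the shared starting parameter toward their own optimum during local training, overriding the aggregated initialization.

```python
def local_train(w, opt, steps=50, lr=0.1):
    """Run plain local gradient descent on the quadratic loss 0.5*(w - opt)**2.

    Toy stand-in for a client's local training; `opt` models the client's
    Non-IID local optimum.
    """
    for _ in range(steps):
        w = w - lr * (w - opt)  # gradient of 0.5*(w - opt)**2 is (w - opt)
    return w

# Both clients start from the same aggregated (global) parameter...
w_global = 0.0
# ...but their Non-IID data place their local optima on opposite sides.
w1 = local_train(w_global, opt=+1.0)
w2 = local_train(w_global, opt=-1.0)
# Each local model ends near its own optimum, far from the other and from
# the shared starting point: the global knowledge has been overridden.
```

Averaging `w1` and `w2` lands back near `w_global`, which is why naive aggregation converges slowly on Non-IID data and why constraining local gradients is attractive.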

