BETTER GENERATIVE REPLAY FOR CONTINUAL FEDERATED LEARNING

Abstract

Federated Learning (FL) aims to develop a centralized server that learns from distributed clients via communication without accessing the clients' local data. However, existing works mainly focus on federated learning in a single-task scenario with static data. In this paper, we introduce the continual federated learning (CFL) problem, where clients incrementally learn new tasks and historical data cannot be stored for reasons such as limited storage or data retention policies 1 . Generative replay (GR) based methods are effective for continual learning without storing historical data. However, naively adapting GR models to this setting fails. By analyzing the behavior of clients during training, we find that the unstable training process caused by distributed training on non-IID data leads to notable performance degradation. To address this problem, we propose our FedCIL model with two simple but effective components: (1) model consolidation and (2) consistency enforcement. Experimental results on multiple benchmark datasets demonstrate that our method significantly outperforms baselines.

1. INTRODUCTION

Federated learning (McMahan et al., 2017) is an emerging topic in machine learning, where a powerful global model is maintained via communication with distributed clients without access to their local data. A typical challenge in federated learning is the non-IID data distribution (Zhao et al., 2018; Zhu et al., 2021a), where the data distributions learnt by different clients differ (known as heterogeneous federated learning). Recent methods (Li et al., 2020; Chen & Chao, 2020; Zhu et al., 2021b) gain improvements in the typical federated learning setting, where the global model learns a single task and each client is trained locally on fixed data. However, in real-world applications, it is more practical that each client continuously learns new tasks. Traditional federated learning models fail to solve this problem. In practice, historical data are sometimes inaccessible due to privacy constraints (e.g., data protection under GDPR) or limited storage space (e.g., mobile devices with very limited space), and the unavailability of previous data often leads to catastrophic forgetting (McCloskey & Cohen, 1989) in many machine learning models. Continual learning (Thrun, 1995; Kumar & Daume III, 2012; Ruvolo & Eaton, 2013; Chu & Li, 2023) aims to develop an intelligent system that can continuously learn from new tasks without forgetting learnt knowledge in the absence of previous data. Continual learning settings can be roughly divided into two scenarios (Van de Ven & Tolias, 2019): task-incremental learning (TIL) and class-incremental learning (CIL) (Rebuffi et al., 2017). In both scenarios, the intelligent system is required to solve all tasks seen so far. In TIL, the task-IDs of different tasks are accessible, while in CIL they are unavailable, which requires the system to infer task-IDs. The unavailability of task-IDs makes the problem significantly harder.
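The TIL/CIL distinction above can be made concrete with a small sketch. Here each task's classifier is modeled as a callable returning per-class scores; all names and the score representation are illustrative assumptions, not from any specific system.

```python
def predict_til(task_classifiers, task_id, x):
    """TIL: the task-ID is given, so prediction is restricted
    to the classes of that single task."""
    scores = task_classifiers[task_id](x)
    return max(scores, key=scores.get)

def predict_cil(task_classifiers, x):
    """CIL: no task-ID is given, so class scores from all tasks
    seen so far compete jointly (the harder setting)."""
    merged = {}
    for classifier in task_classifiers:
        merged.update(classifier(x))
    return max(merged, key=merged.get)

# Toy classifiers: task 0 covers classes {0, 1}, task 1 covers {2, 3}.
task_classifiers = [
    lambda x: {0: x, 1: 1.0 - x},
    lambda x: {2: 2.0 * x, 3: 0.1},
]
```

With x = 0.9, TIL on task 0 predicts class 0, while CIL predicts class 2, because scores from the later task also compete: misranking across tasks is exactly the failure mode TIL never faces.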
In this paper, we propose a challenging and realistic problem, continual federated learning. More specifically, we aim to deal with the class-incremental federated learning (CI-FL) problem. In this setting, each client continuously learns new classes from a sequence of tasks, and the centralized server learns from the clients via communication. This is more difficult than FL or CIL alone, because both the non-IID and catastrophic forgetting issues need to be addressed. Compared with traditional continual learning settings that involve only one model, our problem is more complex because there are multiple models, including one server and many clients. Owing to the success of generative replay in continual learning, we propose to adopt the Auxiliary Classifier GAN (ACGAN) (Odena et al., 2017), a generative adversarial network (GAN) with an auxiliary classifier, as the base model for the server and clients. The generator of the model can then be used for generative replay to avoid forgetting. Interestingly, experiments show that a simple combination of ACGAN and FL algorithms fails to produce promising results. It is already known that an unstable training process (Roth et al., 2017; Kang et al., 2021; Wu et al., 2020) and imbalanced data can worsen the performance of generative models. We find that this phenomenon becomes more severe in the context of federated learning, which leads to the failure of the intuitively combined model. To overcome these challenges, we propose a federated class-incremental learning (FedCIL) framework. In FedCIL, the generator of ACGAN helps alleviate catastrophic forgetting by generating synthetic data of previous distributions for replay; meanwhile, it benefits federated learning by transferring better global and local data distributions during the communications. Our model also respects the privacy constraints of federated learning: during the communications, only model parameters are transmitted.
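The generative-replay idea described above can be sketched as a batch-mixing step: when training on a new task, real samples are mixed with synthetic samples of previously learnt classes drawn from the generator. All names here are hypothetical; `generator` stands in for an ACGAN-style generator mapping a class label to a synthetic sample.

```python
import random

def make_replay_batch(current_batch, generator, old_classes, replay_ratio=0.5):
    """Mix real samples from the current task with synthetic samples of
    previously learnt classes, so the classifier keeps seeing old classes
    after their real data has become unavailable."""
    n_replay = int(len(current_batch) * replay_ratio)
    labels = random.choices(old_classes, k=n_replay)
    replay = [(generator(c), c) for c in labels]
    return list(current_batch) + replay
```

The quality of these synthetic samples is exactly why generator stability matters: if distributed training on non-IID data degrades the generator, the replayed batches degrade with it.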
With the proposed global model consolidation and local consistency enforcement, our model significantly outperforms baselines on benchmark datasets. The main contributions of this paper are as follows:

• We introduce a challenging and practical problem of continual federated learning, i.e., class-incremental federated learning (CI-FL), where a global model continuously learns from multiple clients that are incrementally learning new classes without memory buffers. We then propose generative replay (GR) based methods for this challenge.

• We empirically find that the unstable learning process caused by distributed training on highly non-IID data with popular federated learning algorithms can lead to notable performance degradation of GR-based models. Motivated by this observation, we further propose to solve the problem with model consolidation and consistency enforcement.

• We design new experimental settings and conduct comprehensive evaluations on benchmark datasets. The results demonstrate the effectiveness of our model.

2. RELATED WORK

Class-Incremental Learning. Class-incremental learning (CIL) (Rebuffi et al., 2017) is a hard continual learning problem due to the unavailability of the task-IDs. Existing approaches to the CIL problem can be divided into three categories (Ebrahimi et al., 2020): replay-based methods (Rolnick et al., 2019; Chaudhry et al., 2019), structure-based methods (Yoon et al., 2017), and regularization-based methods (Kirkpatrick et al., 2017; Aljundi et al., 2018). Generative replay based methods (Shin et al., 2017; Wu et al., 2018) belong to the replay-based methods, but they do not need a replay buffer for storing previous data. Instead, an additional generator is trained to capture the distribution of the data learnt so far, and it generates synthetic data for replay when the real data becomes unavailable. Generative replay based methods have obtained promising results on various CIL benchmarks where storing data is not allowed (data-free).

Federated Learning. Federated learning has been extensively studied in recent years. We mainly discuss the works that involve generative models in this section. For instance, some recent federated learning methods train a GAN across distributed resources (Zhang et al., 2021; Rasouli et al., 2020) in a federated learning paradigm. They are subject to privacy constraints in that the parameters of the GAN are shared instead of the real data. In our work, we also strictly follow the privacy constraints by only transmitting GAN parameters.
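Parameter-only sharing, as used in the federated GAN works above, typically builds on weighted parameter averaging in the style of FedAvg (McMahan et al., 2017). A minimal sketch follows; representing each model state as a dict of flat float lists is a simplification for illustration.

```python
def fedavg(client_states, weights=None):
    """Aggregate client models by weighted parameter averaging.
    Each state maps a parameter name to a flat list of floats; only
    these parameters are transmitted, never the clients' raw data."""
    n = len(client_states)
    weights = weights if weights is not None else [1.0 / n] * n
    return {
        name: [
            sum(w * state[name][i] for state, w in zip(client_states, weights))
            for i in range(len(client_states[0][name]))
        ]
        for name in client_states[0]
    }
```

In practice the weights are usually proportional to each client's local dataset size; uniform weights are used here only as a default.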



1 https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016R0679



Continual Learning and Federated Learning. So far, only a few works lie in the intersection of federated learning and continual learning. Casado et al. (2020) discussed federated learning with changing data distributions, but their work only involves the single-task scenario of federated learning. Yoon et al. (2021) introduced a federated continual learning setting from the perspective of continual learning. Our work is significantly different from it. First, in (Yoon et al., 2021), the federated

Code availability: https://github.com/daiqing98/FedCIL.

