FEDERATED CONTINUAL LEARNING WITH WEIGHTED INTER-CLIENT TRANSFER

Abstract

There has been a surge of interest in continual learning and federated learning, both of which are important for deploying deep neural networks in real-world scenarios. Yet little research has addressed the scenario where each client learns on a sequence of tasks from a private local data stream. This problem of federated continual learning poses new challenges to continual learning, such as utilizing knowledge from other clients while preventing interference from irrelevant knowledge. To resolve these issues, we propose a novel federated continual learning framework, Federated Weighted Inter-client Transfer (FedWeIT), which decomposes the network weights into global federated parameters and sparse task-specific parameters, so that each client receives selective knowledge from other clients by taking a weighted combination of their task-specific parameters. FedWeIT minimizes interference between incompatible tasks while allowing positive knowledge transfer across clients during learning. We validate FedWeIT against existing federated learning and continual learning methods under varying degrees of task similarity across clients, and our model significantly outperforms them with a large reduction in communication cost.
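The parameter decomposition described above can be illustrated with a minimal sketch. The NumPy snippet below only shows the idea of combining masked global parameters, local task-specific parameters, and an attention-weighted sum of other clients' task-specific parameters; the function name, variable names, shapes, and values are illustrative placeholders, not the paper's notation or implementation.

```python
# Minimal sketch (NumPy) of the weighted inter-client transfer idea.
# Names (base, mask, adaptive, alpha) are illustrative, not the paper's notation.
import numpy as np

def compose_weights(base, mask, local_adaptive, foreign_adaptives, alphas):
    """Combine global federated parameters with sparse task-specific ones.

    base              : global federated parameters shared across clients
    mask              : sparse per-task mask selecting relevant global weights
    local_adaptive    : this client's sparse task-specific parameters
    foreign_adaptives : task-specific parameters received from other clients
    alphas            : learned attention weights, one per foreign parameter
    """
    theta = base * mask + local_adaptive
    for alpha, adaptive in zip(alphas, foreign_adaptives):
        theta += alpha * adaptive          # weighted inter-client transfer
    return theta

# Toy usage: one dense layer of shape (16, 8) and two helper clients.
rng = np.random.default_rng(0)
base = rng.normal(size=(16, 8))
mask = (rng.random((16, 8)) < 0.3).astype(float)        # sparse selection
local = rng.normal(size=(16, 8)) * mask
foreign = [rng.normal(size=(16, 8)) * 0.01 for _ in range(2)]
theta = compose_weights(base, mask, local, foreign, alphas=[0.7, 0.1])
```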

1. INTRODUCTION

Continual learning (Thrun, 1995; Kumar & Daume III, 2012; Ruvolo & Eaton, 2013; Kirkpatrick et al., 2017; Schwarz et al., 2018) describes a learning scenario where a model continuously trains on a sequence of tasks. It is inspired by the human learning process, as a person learns to perform numerous, highly diverse tasks over their lifespan, making use of past knowledge to learn new tasks without forgetting previously learned ones. Continual learning is a long-studied topic, since such an ability is a step toward building general artificial intelligence. However, there are crucial challenges in implementing it with conventional models such as deep neural networks (DNNs), most notably catastrophic forgetting, in which parameters or semantic representations learned for past tasks drift toward the new tasks during training. This problem has been tackled by various prior work (Kirkpatrick et al., 2017; Lee et al., 2017; Shin et al., 2017; Riemer et al., 2019), and more recent works address other issues such as scalability or order-robustness (Schwarz et al., 2018; Hung et al., 2019; Yoon et al., 2020).

However, all of these models are fundamentally limited in that they can only learn from direct experience: they learn only from the sequence of tasks they have been trained on. In contrast, humans can learn from the indirect experience of others through various means (e.g., verbal communication, books, or other media). Wouldn't it then be beneficial to equip a continual learning framework with such an ability, so that multiple models learning on different machines can benefit from the knowledge of tasks already experienced by other clients?

One problem that arises here is that, due to data privacy on individual clients and exorbitant communication cost, it may not be possible to communicate data directly between the clients or between the server and clients. Federated learning (McMahan et al., 2016; Li et al., 2018; Yurochkin et al., 2019) is a learning paradigm that tackles this issue by communicating parameters instead of the raw data. A central server receives the parameters locally trained on multiple clients, aggregates them into a single set of model parameters, and sends it back to the clients.

Motivated by our intuition on learning from indirect experience, we tackle the problem of Federated Continual Learning (FCL), in which multiple clients perform continual learning on private task sequences and communicate their task-specific parameters via a global server. Figure 1 (a) depicts an example
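To make the server-side aggregation step described above concrete, the following NumPy sketch shows a generic FedAvg-style aggregation, in which locally trained parameters are averaged (weighted by local data size) and then broadcast back to the clients. The function and variable names are assumptions for illustration; this is the standard federated averaging idea, not the FedWeIT aggregation rule itself.

```python
# Minimal sketch of a FedAvg-style server aggregation step.
# Function and variable names are illustrative, not from the paper.
import numpy as np

def aggregate(client_params, client_sizes):
    """Average locally trained parameters, weighted by local data size."""
    total = float(sum(client_sizes))
    agg = {}
    for name in client_params[0]:
        agg[name] = sum(p[name] * (n / total)
                        for p, n in zip(client_params, client_sizes))
    return agg

# Toy usage: three clients each report one weight matrix to the server.
rng = np.random.default_rng(0)
client_params = [{"w": rng.normal(size=(4, 2))} for _ in range(3)]
global_params = aggregate(client_params, client_sizes=[100, 50, 150])
# The server would then broadcast `global_params` back to all clients.
```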

