A COGNITIVE-INSPIRED MULTI-MODULE ARCHITECTURE FOR CONTINUAL LEARNING

Abstract

Artificial neural networks (ANNs) exhibit a narrow scope of expertise on stationary, independent data. However, data in the real world is continuous and dynamic, and ANNs must adapt to novel scenarios while also retaining learned knowledge to become lifelong learners. The ability of humans to excel at these tasks can be attributed to multiple factors, ranging from cognitive computational structures and cognitive biases to the multi-memory systems in the brain. We incorporate key concepts from each of these to design a cognitive-inspired continual learning method. The Cognitive Continual Learner (CCL) includes multiple modules, an implicit and explicit knowledge representation dichotomy, inductive bias, and a multi-memory system. CCL shows improvement across different settings and also exhibits a reduced task recency bias. To test the versatility of continual learning methods under a challenging distribution shift, we introduce a novel domain-incremental dataset, DN4IL. In addition to improved performance on existing benchmarks, CCL also demonstrates superior performance on this dataset.¹

1. INTRODUCTION

Deep learning has seen rapid progress in recent years, and supervised learning agents have achieved superior performance in perception tasks. However, unlike a supervised setting, where data is static and independent and identically distributed, real-world data changes dynamically. Continual learning (CL) aims at learning multiple tasks when data is streamed sequentially (Parisi et al., 2019). This is crucial in real-world deployment settings, as the model needs to adapt quickly to novel data (plasticity) while also retaining previously learned knowledge (stability). Artificial neural networks (ANNs), however, are still not effective continual learners, as they often fail to generalize to small changes in distribution and also suffer from forgetting old information when presented with new data (catastrophic forgetting) (McCloskey & Cohen, 1989). Humans, on the other hand, show a better ability to acquire new skills while also retaining previously learned skills to a greater extent. This intelligence can be attributed to different factors in human cognition. Multiple theories have been proposed to formulate an overall cognitive architecture: a broad, domain-generic cognitive computational model that captures the essential structure and processes of the mind. Some of these theories hypothesize that, instead of a single standalone module, multiple modules in the brain share information to excel at a particular task. CLARION (Connectionist Learning with Adaptive Rule Induction ON-line) (Sun & Franklin, 2007) is one such theory that postulates an integrative cognitive architecture consisting of a number of distinct subsystems. It posits a dual representational structure (Chaiken & Trope, 1999), in which the top level encodes conscious explicit knowledge, while the other encodes indirect implicit information. The two systems interact, share knowledge, and cooperate in solving tasks.
Delving into these underlying architectures and formulating a new design can help in the quest of building intelligent agents. Multiple modules can be instituted instead of a single feedforward network: an explicit module that learns from the standard visual input, and an implicit module that shares indirect contextual knowledge. The implicit module can be further divided into sub-modules, each providing different information; inductive biases and semantic memories can act as different kinds of implicit knowledge. Inductive biases are pre-stored templates or knowledge that provide some meaningful disposition toward adapting to the continuously evolving world (Chollet, 2019). Furthermore,

¹Code and the DN4IL dataset will be made accessible upon acceptance.
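As a minimal sketch, the dual-module design described above could be wired together as follows. This is an illustration under assumptions, not the paper's implementation: the class names (`CognitiveLearner`, `ExplicitModule`, etc.) are hypothetical, a frozen random projection stands in for the pre-stored inductive-bias templates, and an exponential moving average of the explicit weights stands in for the slow semantic memory.

```python
import numpy as np

rng = np.random.default_rng(0)

class ExplicitModule:
    """Learns directly from the visual input (linear stand-in for a trainable network)."""
    def __init__(self, d_in, d_out):
        self.W = rng.standard_normal((d_in, d_out)) * 0.01
    def forward(self, x):
        return x @ self.W

class SemanticMemory:
    """Slow exponential-moving-average copy of the explicit weights --
    one possible reading of a multi-memory system (an assumption here)."""
    def __init__(self, explicit, decay=0.99):
        self.W = explicit.W.copy()
        self.decay = decay
    def update(self, explicit):
        # Consolidate explicit knowledge slowly into the memory module.
        self.W = self.decay * self.W + (1.0 - self.decay) * explicit.W
    def forward(self, x):
        return x @ self.W

class InductiveBias:
    """Fixed pre-stored template; here a frozen random projection as a placeholder."""
    def __init__(self, d_in, d_out):
        self.W = rng.standard_normal((d_in, d_out)) * 0.01  # never updated
    def forward(self, x):
        return x @ self.W

class CognitiveLearner:
    """Blends the explicit output with implicit context (bias + semantic memory)."""
    def __init__(self, d_in, d_out):
        self.explicit = ExplicitModule(d_in, d_out)
        self.memory = SemanticMemory(self.explicit)
        self.bias = InductiveBias(d_in, d_out)
    def forward(self, x, alpha=0.5):
        implicit = 0.5 * (self.memory.forward(x) + self.bias.forward(x))
        return alpha * self.explicit.forward(x) + (1.0 - alpha) * implicit

model = CognitiveLearner(d_in=8, d_out=3)
y = model.forward(rng.standard_normal((4, 8)))
print(y.shape)  # (4, 3)
# After each gradient step on the explicit module, one would call:
# model.memory.update(model.explicit)
```

In this reading, only the explicit module would receive gradient updates; the implicit pathway supplies stable context that changes slowly (semantic memory) or not at all (inductive bias), which is one way a multi-memory design can trade plasticity against stability.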

