LIFELONG LEARNING OF COMPOSITIONAL STRUCTURES

Abstract

A hallmark of human intelligence is the ability to construct self-contained chunks of knowledge and adequately reuse them in novel combinations for solving different yet structurally related problems. Learning such compositional structures has been a significant challenge for artificial systems, due to the combinatorial nature of the underlying search problem. To date, research into compositional learning has largely proceeded separately from work on lifelong or continual learning. We integrate these two lines of work to present a general-purpose framework for lifelong learning of compositional structures that can be used for solving a stream of related tasks. Our framework separates the learning process into two broad stages: learning how to best combine existing components in order to assimilate a novel problem, and learning how to adapt the set of existing components to accommodate the new problem. This separation explicitly handles the trade-off between the stability required to remember how to solve earlier tasks and the flexibility required to solve new tasks, as we show empirically in an extensive evaluation.

1. INTRODUCTION

A major goal of artificial intelligence is to create an agent capable of acquiring a general understanding of the world. Such an agent would require the ability to continually accumulate and build upon its knowledge as it encounters new experiences. Lifelong machine learning addresses this setting, whereby an agent faces a continual stream of diverse problems and must strive to capture the knowledge necessary for solving each new task it encounters. If the agent is capable of accumulating knowledge in some form of compositional representation (e.g., neural net modules), it could then selectively reuse and combine relevant pieces of knowledge to construct novel solutions. Various compositional representations for multiple tasks have been proposed recently (Zaremba et al., 2016; Hu et al., 2017; Kirsch et al., 2018; Meyerson & Miikkulainen, 2018) . We address the novel question of how to learn these compositional structures in a lifelong learning setting. We design a general-purpose framework that is agnostic to the specific algorithms used for learning and the form of the structures being learned. Evoking Piaget's (1976) assimilation and accommodation stages of intellectual development, this framework embodies the benefits of dividing the lifelong learning process into two distinct stages. In the first stage, the learner strives to solve a new task by combining existing components it has already acquired. The second stage uses discoveries from the new task to improve existing components and to construct fresh components if necessary. Our proposed framework, which we depict visually in Appendix A, is capable of incorporating various forms of compositional structures, as well as different mechanisms for avoiding catastrophic forgetting (McCloskey & Cohen, 1989) . As examples of the flexibility of our framework, it can incorporate naïve fine-tuning, experience replay, and elastic weight consolidation (Kirkpatrick et al., 2017) as knowledge retention mechanisms, and linear combinations of linear models (Kumar & Daumé III, 2012; Ruvolo & Eaton, 2013) , soft layer ordering (Meyerson & Miikkulainen, 2018) , and a soft version of gating networks (Kirsch et al., 2018; Rosenbaum et al., 2018) as the compositional structures. We instantiate our framework with the nine combinations of these examples, and evaluate it on eight different data sets, consistently showing that separating the lifelong learning process into two stages increases the capabilities of the learning system, reducing catastrophic forgetting and achieving higher overall performance. Qualitatively, we show that the components learned by an algorithm that adheres to our framework correspond to self-contained, reusable functions.

