TASK-AGNOSTIC ONLINE META-LEARNING IN NON-STATIONARY ENVIRONMENTS

Abstract

Online meta-learning has recently emerged as a marriage between batch meta-learning and online learning, aiming to achieve quick adaptation to new tasks in a lifelong manner. However, most existing approaches focus on the restrictive setting where the distribution of the online tasks remains fixed and the task boundaries are known. In this work we relax these assumptions and propose a novel algorithm for task-agnostic online meta-learning in non-stationary environments. More specifically, we first propose two simple but effective detection mechanisms, for task switches and for distribution shift, based on empirical observations; these serve as key building blocks for more principled online model updates in our algorithm: the task-switch detection mechanism allows reuse of the best available model for the current task at hand, and the distribution-shift detection mechanism differentiates the meta-model update so as to preserve the knowledge of in-distribution tasks while quickly acquiring new knowledge for out-of-distribution tasks. Motivated by recent advances in online learning, our online meta-model updates are based only on the current data, which eliminates the need to store previous data as required by most existing methods. This crucial choice is also well supported by our theoretical analysis of dynamic regret in online meta-learning, which shows that sublinear regret can be achieved by updating the meta model at each round using only the current data. Empirical studies on three different benchmarks clearly demonstrate the significant advantage of our algorithm over related baseline approaches.

1. INTRODUCTION

Two key aspects of human intelligence are the ability to quickly learn complex tasks and the ability to continually update one's knowledge base so that future tasks can be learned faster. Meta-learning (Koch et al., 2015; Ravi & Larochelle, 2016; Finn et al., 2017) and online learning (Hannan, 1957; Shalev-Shwartz & Singer, 2007; Cesa-Bianchi & Lugosi, 2006) are two main research directions that try to equip learning agents with these abilities. In particular, meta-learning aims to facilitate quick learning of new, unseen tasks by building a prior over model parameters from the knowledge of related tasks, whereas online learning deals with the problem where task data is revealed sequentially to a learning agent. To achieve the capability of fast adaptation to new tasks in a lifelong manner, online meta-learning (Finn et al., 2017; Harrison et al., 2020; Yao et al., 2020) has attracted much attention recently. In the setup where online tasks arrive one at a time, the objective of online meta-learning is to continuously update the meta prior so that each new task can be learned more quickly as the agent encounters more tasks. In online meta-learning, the agent typically maintains two separate models: the meta model, which captures the underlying common knowledge across tasks, and the online task model, which solves the current task at hand. Most existing studies (Finn et al., 2017; Acar et al., 2021) in online meta-learning follow a "resetting" strategy: quickly adapt the online task model from the meta model using the current data, update the meta model, and reset the online task model back to the updated meta model at the beginning of the next task. This strategy generally works well when the task boundaries are known and the task distribution remains stationary.
However, in many real-world data streams the task boundaries are not directly visible to the agent (Rajasegaran et al., 2022; Caccia et al., 2020; Harrison et al., 2020), and the task distributions can change dynamically during the online learning stage. In this work, we therefore seek to solve the online meta-learning problem in these more realistic settings.
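To make the "resetting" strategy concrete, the following is a minimal sketch of online meta-learning with known task boundaries on synthetic linear-regression tasks. It is not the paper's algorithm: it uses a first-order MAML-style meta update as a stand-in, and all hyperparameters, dimensions, and the task generator are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5  # model dimension (illustrative)

# Tasks are drawn around a shared hidden center, so a useful meta prior exists.
w_center = rng.normal(size=d)

def sample_task(n=20):
    """One linear-regression task, split into support and query sets."""
    w_true = w_center + 0.1 * rng.normal(size=d)
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.01 * rng.normal(size=n)
    return X[:10], y[:10], X[10:], y[10:]

def loss_grad(w, X, y):
    # Gradient of mean squared error for the linear model X @ w.
    return 2.0 * X.T @ (X @ w - y) / len(y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

inner_lr, meta_lr, inner_steps = 0.1, 0.05, 5
meta_w = np.zeros(d)  # meta model

for t in range(200):  # stream of online tasks, boundaries assumed known
    Xs, ys, Xq, yq = sample_task()
    # Reset: the online task model starts from the current meta model.
    task_w = meta_w.copy()
    # Quick adaptation on the current task's support data.
    for _ in range(inner_steps):
        task_w -= inner_lr * loss_grad(task_w, Xs, ys)
    # First-order meta update (a simplification of the full MAML gradient):
    # move the meta model toward initializations that adapt well, using the
    # query-set gradient evaluated at the adapted task weights.
    meta_w -= meta_lr * loss_grad(task_w, Xq, yq)
```

Because every task model is reset to the meta model, this sketch implicitly assumes a stationary task distribution with visible boundaries, exactly the assumptions that break down in the setting considered here.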

