

Abstract

While deep networks can learn complex functions such as classifiers, detectors, and trackers, many applications require models that continually adapt to changing input distributions, changing tasks, and changing environmental conditions. Indeed, this ability to continuously accrue knowledge and use past experience to learn new tasks quickly in continual settings is one of the key properties of an intelligent system. For complex and high-dimensional problems, simply updating the model continually with standard learning algorithms such as gradient descent may result in slow adaptation. Meta-learning can provide a powerful tool to accelerate adaptation, yet it is conventionally studied in batch settings. In this paper, we study how meta-learning can be applied to tackle online problems of this nature, simultaneously adapting to changing tasks and input distributions and meta-training the model in order to adapt more quickly in the future. Extending meta-learning into the online setting presents its own challenges, and although several prior methods have studied related problems, they generally require a discrete notion of tasks, with known ground-truth task boundaries. Such methods typically adapt to each task in sequence, resetting the model between tasks, rather than adapting continuously across tasks. In many real-world settings, such discrete boundaries are unavailable, and may not even exist. To address these settings, we propose a Fully Online Meta-Learning (FOML) algorithm, which does not require any ground-truth knowledge of task boundaries and stays fully online without resetting to pre-trained weights. Our experiments show that FOML learns new tasks faster than state-of-the-art online meta-learning methods on a variety of datasets.

1. INTRODUCTION

Flexibility and rapid adaptation are a hallmark of intelligence: humans can not only solve complex problems, but can also figure out how to solve them very rapidly, as compared to our current machine learning algorithms. Such rapid adaptation is crucial for both humans and computers: for humans, it is crucial for survival in changing natural environments, and it is also crucial for agents that classify photographs on the Internet, interpret text, control autonomous vehicles, and generally make accurate predictions from rapidly changing real-world data. While deep neural networks are remarkably effective for learning and representing accurate models He et al. (2015); Krizhevsky et al. (2012); Simonyan & Zisserman (2014); Szegedy et al. (2015), they are comparatively unimpressive when it comes to adaptability, due to their computational and data requirements. Meta-learning in principle mitigates this problem, by leveraging the generalization power of neural networks to accelerate adaptation to new tasks Finn et al. (2019); Li et al. (2017); Nichol et al. (2018); Nichol & Schulman (2018); Park & Oliva (2019); Antoniou et al. (2018). However, standard meta-learning algorithms operate in batch mode, making them poorly suited for continuously evolving environments. More recently, online meta-learning methods have been proposed with the goal of enabling continual adaptation Finn et al. (2019); Jerfel et al. (2018); Yao et al. (2020); Nagabandi et al. (2018); Li & Hospedales (2020), where a constant stream of data from distinct tasks is used for both adaptation and meta-training. In this scheme, meta-training is used to accelerate how quickly the network can adapt to each new task it sees, while the data from each new task is simultaneously used for meta-training, which further accelerates how quickly each subsequent task can be acquired. However, current online meta-learning methods fall short of the goal of creating an effective adaptation system for online data in several ways: (1) they typically require task boundaries in the data stream to be known, making them ill-suited to settings where task boundaries are ill-defined and tasks change or evolve gradually, a common trend in the real world; (2) as a result, they typically re-adapt from the meta-trained model on each task, resulting in a very "discrete" mode of operation, where the model adapts to a task, then resets, then adapts to a new one. These limitations restrict the applicability of current online meta-learning methods to real-world settings.
We argue that the task boundary assumption, under which the stream of incoming data is cleanly partitioned into discrete and well-separated tasks presented in sequence, is somewhat artificial in online settings. In this paper, we instead develop a fully online meta-learning approach, which does not assume knowledge of task boundaries and does not re-adapt from the meta-parameters for every new task. Standard meta-learning methods consist of a meta-training phase, typically done with standard SGD, and an "inner loop" adaptation phase, which computes task-specific parameters φ_i for the task T_i from a support set to make accurate predictions on a query set. For example, in model-agnostic meta-learning (MAML), adaptation consists of taking a few gradient steps on the support set, starting from the meta-trained parameter vector θ, leading to a set of post-adaptation parameters, and meta-training optimizes the meta-trained parameters θ so that these gradient steps lead to good results. Previous extensions of such approaches into the online setting typically observe one task at a time, adapt to that task (i.e., compute post-adaptation parameters on it), and then reset φ_i back to the meta-trained parameters θ at the beginning of the next task. Thus, the algorithm repeatedly adapts, resets back to the pretrained meta-parameters at the task boundary, adapts again, and repeats. This is illustrated in Figure 1 (left). However, in many realistic settings, the task boundaries are not known, and the tasks instead shift gradually over time.
The discrete "resetting" procedure is a poor fit in such cases, and we would like to simply continue adapting the weights over time without ever resetting back to the meta-trained parameters, while still benefiting from a concurrent meta-training process. For example, a meta-trained image-tagging model on the Internet (e.g., tagging friends in photographs) might gradually adapt to the changing patterns and preferences of its users over time, where it would be unnatural to assume discrete shifts in what users want to tag. Similarly, a traffic prediction system might adapt to changing traffic patterns, including periodic changes due to seasons, and unexpected changes due to shifting economic conditions, weather, and accidents. In this spirit, our method does not require any knowledge of task boundaries and stays fully online throughout learning. The main contribution of our paper is FOML (fully online meta-learning), an online meta-learning algorithm that continually updates its online parameters with each new datapoint or batch of datapoints, while simultaneously performing meta-gradient updates on a separate set of meta-parameters using a buffer of previously seen data. FOML does not require ground-truth knowledge of task boundaries, and does not reset the online parameters back to the meta-parameters between tasks, instead updating the online parameters continually in a fully online fashion. We compare FOML empirically to strong baselines and a state-of-the-art prior online meta-learning method, showing that FOML learns to adapt more quickly, and achieves lower error rates, both on a simple sequential image classification task from prior work and on a more complex benchmark that we propose based on the CIFAR100 dataset, with a sequence of 1200 tasks.
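As a concrete illustration of the inner-loop adaptation discussed above, the sketch below shows a MAML-style adaptation step on a toy least-squares problem. The model, data, step size, and number of steps are illustrative assumptions of ours, not the paper's actual setup:

```python
import numpy as np

def inner_adapt(theta, X_support, y_support, lr=0.1, steps=5):
    """MAML-style inner loop: a few gradient steps on the support set,
    starting from the meta-trained parameter vector theta."""
    phi = theta.copy()  # post-adaptation parameters start at theta
    for _ in range(steps):
        # gradient of the mean squared error (1/2n)*||X phi - y||^2
        grad = X_support.T @ (X_support @ phi - y_support) / len(y_support)
        phi = phi - lr * grad
    return phi  # task-specific parameters phi_i

# Toy task: recover w_star from a small support set.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
w_star = np.array([2.0, -1.0])
y = X @ w_star
theta = np.zeros(2)            # stand-in for meta-trained parameters
phi = inner_adapt(theta, X, y)
```

In full MAML, meta-training would additionally backpropagate the query-set loss at `phi` through these inner gradient steps to update `theta`; the sketch shows only the adaptation phase that online variants repeat (and reset) at each task.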



2. RELATED WORK

Online meta-learning brings together ideas from online learning, meta-learning, and continual learning, with the aim of adapting quickly to each new task while simultaneously learning how to adapt even more quickly in the future. We discuss these three sets of approaches next.




Figure 1: Comparison of standard online meta-learning and FOML: In standard online meta-learning (e.g., FTML Finn et al. (2019)), shown on the left, adaptation is performed on one task at a time, and the algorithm "resets" the adaptation process at task boundaries. For example, a MAML-based method would reset the current parameters back to the meta-trained parameters. In our approach (right), knowledge of task boundaries is not required, and the algorithm continually keeps track of online parameters and meta-parameters θ. The online parameters are simply updated on the latest data, and the meta-parameters are updated to "pull" the online parameters toward fast-adapting solutions via a MAML-style meta-update.
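The two-parameter-set structure in the figure (right) can be sketched in simplified form below. This is only a minimal illustration of the structure: online parameters `w` updated on each incoming batch and never reset, meta-parameters `theta` updated from a replay buffer and pulling `w` toward them. The regularized-pull update, buffer sampling, and step sizes are our own illustrative stand-ins, not FOML's exact meta-gradient objective:

```python
import numpy as np

def online_meta_loop(stream, dim, lr_online=0.05, lr_meta=0.01, reg=0.1, rng=None):
    """Simplified fully online loop: w is never reset at task boundaries."""
    rng = rng if rng is not None else np.random.default_rng(0)
    w = np.zeros(dim)      # online parameters, updated on the latest data
    theta = np.zeros(dim)  # meta-parameters, updated from a replay buffer
    buffer = []
    for X, y in stream:
        # Online update: task loss gradient plus a pull toward theta.
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr_online * (grad + reg * (w - theta))
        # Meta update: fit theta to a batch replayed from the buffer,
        # a crude stand-in for a MAML-style meta-gradient step.
        buffer.append((X, y))
        Xb, yb = buffer[rng.integers(len(buffer))]
        meta_grad = Xb.T @ (Xb @ theta - yb) / len(yb)
        theta = theta - lr_meta * meta_grad
    return w, theta

# Toy stream of batches from a fixed target (no task boundaries needed).
rng = np.random.default_rng(1)
w_star = np.array([1.5, -0.5])
stream = []
for _ in range(200):
    X = rng.normal(size=(10, 2))
    stream.append((X, X @ w_star))
w, theta = online_meta_loop(stream, dim=2)
```

The key design point the sketch preserves is that `w` carries over across the entire stream, while `theta` is trained concurrently from replayed data rather than serving as a reset point.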

