VARIABLE-SHOT ADAPTATION FOR ONLINE META-LEARNING

Abstract

Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples by meta-learning across static data from a set of previous tasks. However, in many real-world settings, it is more natural to view the problem as one of minimizing the total amount of supervision: both the number of examples needed to learn a new task and the amount of data needed for meta-learning. Such a formulation can be studied in a sequential learning setting, where tasks are presented in sequence. When studying meta-learning in this online setting, a critical question arises: can meta-learning improve over the sample complexity and regret of standard empirical risk minimization methods when meta-training and adaptation are considered together? The answer is particularly non-obvious for meta-learning algorithms with complex bi-level optimizations that may demand large amounts of meta-training data. To answer this question, we extend previous meta-learning algorithms to handle the variable-shot settings that naturally arise in sequential learning: from many-shot learning at the start to zero-shot learning towards the end. On sequential learning problems, we find that meta-learning solves the full task set with fewer overall labels and achieves greater cumulative performance than standard supervised methods. These results suggest that meta-learning is an important ingredient for building learning systems that continuously learn and improve over a sequence of problems.

1. INTRODUCTION

Standard machine learning methods typically assume a static training set, with distinct training and test phases. In the real world, however, this process is almost always cyclical: machine learning systems might be improved with the acquisition of new data, repurposed for new tasks via finetuning, or simply adjusted to suit the needs of a changing, non-stationary world. Indeed, the real world is arguably so complex that, for all practical purposes, learning is never truly finished, and any real system in open-world settings will need to improve and finetune perpetually (Chen & Asch, 2017; Zhao et al., 2019). In this continual learning process, meta-learning offers the appealing prospect of using past experience to accelerate how quickly new tasks can be acquired, which in principle should make the learning system more and more efficient over the course of its lifetime. However, current meta-learning methods are typically concerned with asymptotic few-shot performance (Finn et al., 2017; Snell et al., 2017). A continual learning system of this sort instead needs a method that minimizes both the number of examples per task and the number of tasks needed to accelerate the learning process.

Few-shot meta-learning algorithms aim to learn the structure that underlies data coming from a set of related tasks, and to use this structure to learn new tasks from only a few datapoints. While these algorithms enable efficient learning of new tasks at test time, it is not clear whether these efficiency gains persist in online learning settings, where the efficiency of both meta-training and few-shot adaptation is critical. Indeed, simply training a single model on all received data, i.e., standard supervised learning with empirical risk minimization, is a strong competitor, since supervised learning methods are known to generalize well to in-distribution tasks in a zero-shot manner.
Moreover, it is not clear that meta-learning algorithms can improve over such methods by leveraging shared task structure in online learning settings. Provided that a single model can fully master all of the tasks given enough data, both meta-learning and standard empirical risk minimization should produce models of equal competence. However, the key hypothesis of this work is that meta-learned

