MODEL AGNOSTIC META-LEARNING ON TREES

Abstract

In meta-learning, knowledge learned from previous tasks is transferred to new ones, but this transfer only works if the tasks are related, and sharing information between unrelated tasks may hurt performance. A fruitful approach is to share gradients across similar tasks during training, and recent work suggests that the gradients themselves can be used as a measure of task similarity. We study the case in which the datasets associated with different tasks have a hierarchical, tree structure. While a few methods have been proposed for hierarchical meta-learning in the past, we propose the first algorithm that is model-agnostic, a simple extension of MAML. As in MAML, our algorithm adapts the model to each task with a few gradient steps, but the adaptation follows the tree structure: in each step, gradients are pooled across task clusters, and subsequent steps follow down the tree. We test the algorithm on linear and non-linear regression on synthetic data, and show that it significantly improves over MAML. Interestingly, the algorithm performs best when it does not know the tree structure of the data in advance.

1. INTRODUCTION

Deep learning models require a large amount of data to perform well when trained from scratch. When data is scarce for a given task, we can transfer the knowledge gained on a source task to quickly learn a related target task. The field of Multi-task learning studies how to learn multiple tasks simultaneously with a single model, by taking advantage of task relationships (Ruder (2017), Zhang & Yang (2018)). However, in Multi-task learning the set of tasks is fixed in advance, and the models do not generalize to new tasks. The field of Meta-learning is inspired by the human ability to learn how to quickly learn new tasks, using the knowledge of previously learned ones. Meta-learning has seen widespread use in multiple domains, especially in recent years after the advent of Deep Learning (Hospedales et al. (2020)). However, there is still a lack of methods for sharing information across tasks in meta-learning models, and the goal of our work is to fill this gap. In particular, a successful model for meta-learning, MAML (Finn et al. (2017)), does not diversify task relationships according to their similarity, and it is unclear how to modify it for that purpose. In this work, we contribute the following:

• We propose a novel modification of MAML that accounts for a hierarchy of tasks. The algorithm uses the tree structure of the data during adaptation, by pooling gradients across tasks at each adaptation step, with subsequent steps following down the tree (see Figure 1a).

• We introduce new benchmarks for testing a hierarchy of tasks in meta-learning, on a variety of synthetic non-linear (sinusoidal) and multidimensional linear regression tasks.

• We compare our algorithm to MAML and to a baseline model trained on all tasks without any meta-learning algorithm applied. We show that our algorithm outperforms both models on the sinusoidal regression task and the newly introduced synthetic task, because it exploits the hierarchical structure of the data.
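The tree-structured adaptation described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the authors' implementation): at each adaptation step, per-task gradients are averaged within the clusters of one tree level, the same pooled update is applied to every task in a cluster, and subsequent steps descend one level of the tree until each task is adapted individually. The linear-regression tasks, the `tree_adapt` name, and the explicit `levels` partitions are all assumptions made for the sake of the example.

```python
import numpy as np

def task_grad(w, X, y):
    """Gradient of the mean-squared error for a linear model y ~ X @ w."""
    return 2 * X.T @ (X @ w - y) / len(y)

def tree_adapt(w0, tasks, levels, lr=0.05):
    """
    Tree-structured inner-loop adaptation (illustrative sketch).

    tasks:  list of (X, y) pairs, one per task.
    levels: one partition of task indices per adaptation step, from the
            root (a single cluster of all tasks) down to the leaves
            (singleton clusters).
    Returns one adapted parameter vector per task.
    """
    # Every task starts from the shared meta-parameters.
    w = [w0.copy() for _ in tasks]
    for partition in levels:
        for cluster in partition:
            # Pool (average) gradients across the tasks in this cluster...
            g = np.mean([task_grad(w[t], *tasks[t]) for t in cluster], axis=0)
            # ...and apply the same pooled update to each of those tasks.
            for t in cluster:
                w[t] = w[t] - lr * g
    return w

# Toy usage: four 2-d linear-regression tasks with a 2-cluster structure.
rng = np.random.default_rng(0)
true_w = [np.array([1.0, 2.0]), np.array([1.1, 2.1]),
          np.array([-1.0, 0.5]), np.array([-0.9, 0.6])]
tasks = []
for tw in true_w:
    X = rng.normal(size=(20, 2))
    tasks.append((X, X @ tw))

# Three adaptation steps: root, then two clusters, then individual tasks.
levels = [[[0, 1, 2, 3]], [[0, 1], [2, 3]], [[0], [1], [2], [3]]]
adapted = tree_adapt(np.zeros(2), tasks, levels)
```

Note that setting every level to singleton clusters recovers the standard per-task adaptation of MAML, so the pooling structure is the only change to the inner loop.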

2. RELATED WORK

The problem of quantifying and exploiting task relationships has a long history in Multi-task learning, and is usually approached by parameter sharing, see Ruder (2017), Zhang & Yang (2018) for

