FUNCTION CONTRASTIVE LEARNING OF TRANSFERABLE REPRESENTATIONS

Abstract

Few-shot learning seeks models capable of fast adaptation to novel tasks not encountered during training. Unlike typical few-shot learning algorithms, we propose a contrastive learning method that is not trained to solve a fixed set of tasks, but instead attempts to find a good representation of the underlying data-generating processes (functions). This allows finding representations that are useful for an entire series of tasks sharing the same function. In particular, our training scheme is driven by a self-supervision signal indicating whether two sets of samples stem from the same underlying function. Our experiments on a number of synthetic and real-world datasets show that the representations we obtain can outperform strong baselines in terms of downstream performance and noise robustness, even when these baselines are trained in an end-to-end manner.

1. INTRODUCTION

The ability to learn new concepts from only a few examples is a salient characteristic of intelligent behaviour. Nevertheless, contemporary machine learning models consume copious amounts of data to learn even seemingly basic concepts. Mitigating this issue is the ambition of the few-shot learning framework, wherein a fundamental objective is to learn representations that apply to a variety of different problems (Bengio et al., 2019). In this work, we propose a self-supervised method for learning such representations by leveraging the framework of contrastive learning.

We consider a setting very similar to that of Neural Processes (NPs) (Garnelo et al., 2018a;b; Kim et al., 2019): the goal is to solve some task related to an unknown function f after observing just a few input-output examples O_f = {(x_i, y_i)}_i. For instance, the task may consist of predicting the function value y at unseen locations x, or of classifying images after observing only a few pixels (in that case, x is the pixel location and y is the pixel value). To solve such a task, the example dataset O_f must be encoded into some representation of the underlying function f. Finding a good representation of a function that facilitates solving a wide range of tasks sharing that function is the object of the present paper.

Most existing methods approach this problem by optimizing representations for reconstruction, i.e., prediction of function values y at unseen locations x (see, e.g., NPs (Garnelo et al., 2018a;b; Kim et al., 2019) and Generative Query Networks (GQNs) (Eslami et al., 2018)). A problem with this objective is that it can cause the model to waste its capacity on reconstructing unimportant features, such as static backgrounds, while ignoring visually small but important details in its learned representation (Anand et al., 2019; Kipf et al., 2019).
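The introduction does not fix a particular architecture for encoding the observation set O_f into a representation of f. As a minimal illustrative sketch (not the paper's actual model), the encoding must be permutation-invariant, since O_f is a set; a DeepSets-style embed-then-pool encoder, shown here with randomly initialized weights purely for illustration, has this property:

```python
import numpy as np

def encode_function(xs, ys, W1, W2):
    """Permutation-invariant encoder: map a set of (x, y) examples
    to a single representation r of the underlying function f.
    Each pair is embedded independently, then mean-pooled, so the
    result does not depend on the order of the observed examples."""
    pairs = np.stack([xs, ys], axis=-1)   # shape (n, 2)
    h = np.maximum(pairs @ W1, 0.0)       # per-pair embedding with ReLU
    r = h.mean(axis=0)                    # aggregate over the set
    return r @ W2                         # final function representation

# Hypothetical weights and data, only to exercise the encoder.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(2, 16)), rng.normal(size=(16, 8))
xs, ys = rng.normal(size=5), rng.normal(size=5)
r = encode_function(xs, ys, W1, W2)

# Shuffling the observed pairs leaves the representation unchanged.
perm = rng.permutation(5)
assert np.allclose(r, encode_function(xs[perm], ys[perm], W1, W2))
```

Mean pooling is the same aggregation used by the original NP family; attention-based aggregation (as in Attentive NPs) is a common alternative.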
For instance, in order to manipulate a small object in a complex scene, the model's ability to infer the object's shape matters more than inferring its color or reconstructing the static background. To address this issue, we propose an approach that contrasts functions rather than attempting to reconstruct them. The key idea is that two sets of examples of the same function should have similar latent representations, while the representations of different functions should be easily distinguishable. To this end, we propose a novel contrastive learning framework that learns by contrasting sets of input-output pairs (partial observations) of different functions. We show that this self-supervised training signal allows the model to meta-learn task-agnostic, low-dimensional representations of functions which are not only robust to noise but can also be reliably used for a variety of few-shot downstream prediction tasks defined on those functions. To evaluate the effectiveness of the proposed method, we conduct comprehensive experiments on diverse downstream problems including classification, regression, parameter identification, scene understanding and reinforcement learning.
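The exact training objective is specified later in the paper; as a hedged sketch of the key idea stated above, an InfoNCE-style loss realizes it directly: representations of two disjoint observation sets of the same function act as a positive pair, and representations of other functions in the batch act as negatives. The temperature value below is an illustrative assumption, not taken from the paper:

```python
import numpy as np

def contrastive_loss(za, zb, temperature=0.1):
    """InfoNCE-style objective over function representations.
    za[i] and zb[i] encode two disjoint observation sets of the same
    function i; representations of the other functions in the batch
    serve as negatives."""
    za = za / np.linalg.norm(za, axis=1, keepdims=True)
    zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)
    logits = (za @ zb.T) / temperature            # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # match i-th row to i-th column

# Aligned pairs (same function) yield a much lower loss than mismatched ones.
aligned = contrastive_loss(np.eye(4), np.eye(4))
mismatched = contrastive_loss(np.eye(4), np.roll(np.eye(4), 1, axis=0))
assert aligned < mismatched
```

Minimizing this loss pulls the two encodings of each function together while pushing apart encodings of different functions, which is precisely the similarity structure the paragraph above describes.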

