CONTRASTIVE META-LEARNING FOR PARTIALLY OBSERVABLE FEW-SHOT LEARNING

Abstract

Many contrastive and meta-learning approaches learn representations by identifying common features in multiple views. However, the formalism for these approaches generally assumes that features are shared across views in order to be captured coherently. We consider the problem of learning a unified representation from partial observations, where useful features may be present in only some of the views. We approach this through a probabilistic formalism enabling views to map to representations with different levels of uncertainty in different components; these views can then be integrated with one another through marginalisation over that uncertainty. Our approach, Partial Observation Experts Modelling (POEM), then enables us to meta-learn consistent representations from partial observations. We evaluate our approach on an adaptation of a comprehensive few-shot learning benchmark, Meta-Dataset, and demonstrate the benefits of POEM over other meta-learning methods at representation learning from partial observations. We further demonstrate the utility of POEM by meta-learning to represent an environment from partial views observed by an agent exploring that environment.


1. INTRODUCTION

Modern contrastive learning methods (Radford et al., 2021; Chen et al., 2020; He et al., 2020; Oord et al., 2019), and embedding-based meta-learning methods such as Prototypical Networks (Snell et al., 2017; Vinyals et al., 2016; Sung et al., 2018; Edwards & Storkey, 2017), learn representations by minimising a relative distance between representations of related items compared with unrelated items.
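The relative-distance objective above can be illustrated with a minimal Prototypical-Networks-style sketch: a query embedding is classified by a softmax over its negative squared distances to class prototypes, and the loss pulls the query towards the matching prototype relative to the others. The function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def prototypical_loss(query, prototypes, target):
    """Cross-entropy over negative squared distances to class prototypes.

    query: (D,) embedding of a query item.
    prototypes: (N, D) array of class prototypes (e.g. mean support embeddings).
    target: index of the class the query belongs to.
    """
    neg_sq_dists = -np.sum((prototypes - query) ** 2, axis=1)
    # Numerically stable log-softmax over the negative distances.
    shifted = neg_sq_dists - neg_sq_dists.max()
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    # Minimising this loss minimises the distance to the target prototype
    # relative to the distances to all other prototypes.
    return -log_probs[target]
```

Minimising this loss is what "minimising a relative distance" means in practice: only the distance to the correct prototype relative to the others matters, not its absolute value.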



1 Implementation code is available at https://github.com/AdamJelley/POEM



Figure 1: Standard contrastive (meta-) learners minimise a relative distance between representations. This encourages the learning of features that are consistent across all views; in the example above, this corresponds to the pattern on the bird's wing. To better handle partial observability, where features may be disjoint between views, we propose Partial Observation Experts Modelling (POEM). POEM instead maximises consistency between multiple views: it utilises representation uncertainty to learn which features of the entity are captured by each view, and then combines these representations by weighting features according to their uncertainty via a product of experts model (Hinton, 2002).
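The uncertainty-weighted combination described in the caption can be sketched for the diagonal-Gaussian case, where a product of experts (Hinton, 2002) has a closed form: precisions add, and the combined mean is the precision-weighted mean. This is a generic Gaussian product-of-experts sketch under that assumption, not the paper's full POEM model; the function name is illustrative.

```python
import numpy as np

def gaussian_poe(means, variances):
    """Combine per-view diagonal Gaussian representations as a product of experts.

    means, variances: (V, D) arrays for V views and D representation features.
    A view that is uncertain about a feature (large variance, low precision)
    contributes little to that component; confident views dominate it.
    """
    precisions = 1.0 / variances
    combined_var = 1.0 / precisions.sum(axis=0)             # precisions add
    combined_mean = combined_var * (precisions * means).sum(axis=0)
    return combined_mean, combined_var
```

For example, if view 0 observes only feature 0 (low variance there, high variance elsewhere) and view 1 observes only feature 1, the combined representation recovers both features, each taken from the view that actually captured it.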

