CONTEXT-AGNOSTIC LEARNING USING SYNTHETIC DATA

Abstract

We propose a novel setting for learning, where the input domain is the image of a map defined on the product of two sets, one of which completely determines the labels. Given the ability to sample from each set independently, we present an algorithm that learns a classifier over the input domain more efficiently than sampling from the input domain directly. We apply this setting to visual classification tasks, where our approach enables us to train classifiers on datasets that consist entirely of a single example of each class. On several standard benchmarks for real-world image classification, our approach achieves performance competitive with state-of-the-art results from the few-shot learning and domain transfer literature, while using significantly less data.

1. INTRODUCTION

Despite recent advances in deep learning, one central challenge is the large amount of labelled training data required to achieve state-of-the-art performance. Procuring such volumes of high-quality, reliably annotated data can be costly or even close to impossible (e.g., obtaining data to train an autonomous navigation system for a lunar probe). Additional hurdles include hidden biases in large datasets (Tommasi et al., 2017) and maliciously perturbed training data (Biggio et al., 2012).

Synthetically generated data has seen growing adoption in response to these problems, since the marginal cost of producing new training data is generally very low, and one has full control over the generation process. This is particularly true for applications with a physical component, such as autonomous navigation (Gaidon et al., 2016) or robotics (Todorov et al., 2012). However, training with purely synthetic data suffers from the so-called "reality gap", whereby good performance on synthetic data does not necessarily yield good performance in the real world (Jakobi et al., 1995). In particular, the difficulty of generating realistic training images scales not just with the objects of interest, but also with the real-world contexts in which the learned model is expected to operate.

This work begins with the simple observation that, for many classification tasks, the label of an input is determined entirely by the object it depicts, independent of its surrounding context; however, this additional structure is discarded by current synthetic data pipelines. Our goal is to leverage this decomposition to develop more efficient methods for the related problems of generating training data and learning from a synthetic domain. Our contributions are two-fold. First, we formally introduce the setting of context-agnostic learning, where the input space is decomposed into object and context spaces, and the labels are independent of contexts when conditioned on the objects.
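The decomposition above can be made precise. The following formalization is a sketch consistent with the description in the text; the symbol names ($g$, $f$, $\mathcal{O}$, $\mathcal{C}$) are our own choices, not necessarily the paper's notation:

$$
\text{Let } g : \mathcal{O} \times \mathcal{C} \to \mathcal{X} \text{ be a rendering map, with input domain } \mathcal{X} = \mathrm{Im}(g).
$$

Context-agnosticism then states that there exists a labeling function $f : \mathcal{O} \to \mathcal{Y}$ such that the label of every input $x = g(o, c)$ is $f(o)$ for all contexts $c \in \mathcal{C}$; equivalently, $Y \perp C \mid O$. Learning proceeds by sampling $o \sim \mathcal{O}$ and $c \sim \mathcal{C}$ independently rather than sampling $x \sim \mathcal{X}$ directly.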
Second, we propose an algorithm to efficiently train a classifier in the context-agnostic setting, which relies on the ability to sample from the object and context spaces independently. We apply our methods to train deep neural networks for real-world image classification using only a single synthetic example of each class, obtaining performance comparable to existing methods for domain adaptation and few-shot learning while using substantially less data. Our results show that it is possible to train classifiers in the absence of any contextual training data that nonetheless generalize to real world domains.
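To make the independent-sampling idea concrete, the following is a minimal illustrative sketch of such a data pipeline, not the paper's actual algorithm: one synthetic template per class plays the role of the object set, random background patches play the role of the context set, and a hypothetical `compose` function stands in for the rendering map. Each training input is assembled from an independently sampled (object, context) pair, and its label depends only on the object.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical object space: a single synthetic 8x8 template per class
# (mirroring the one-example-per-class regime described in the text).
NUM_CLASSES = 3
objects = {c: rng.uniform(0.5, 1.0, size=(8, 8)) for c in range(NUM_CLASSES)}


def sample_context():
    """Sample a context independently of any object (here: a random background)."""
    return rng.uniform(0.0, 0.5, size=(8, 8))


def compose(obj, ctx, alpha=0.7):
    """Stand-in for the rendering map: blend object and context into an input.

    The label of the result depends only on `obj`, never on `ctx`.
    """
    return alpha * obj + (1.0 - alpha) * ctx


def sample_batch(batch_size):
    """Draw a training batch by sampling objects and contexts independently."""
    labels = rng.integers(0, NUM_CLASSES, size=batch_size)
    images = np.stack([compose(objects[int(y)], sample_context()) for y in labels])
    return images, labels


images, labels = sample_batch(32)
```

A classifier trained on batches drawn this way never sees the same (object, context) pairing twice, even though the object set contains only one example per class; the diversity comes entirely from the context samples.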

2. RELATED WORK

Domain shift refers to the problem that occurs when the training set (source domain) and test set (target domain) are drawn from different distributions. In this setting, a classifier which performs

