LEARNING WHAT NOT TO MODEL: GAUSSIAN PROCESS REGRESSION WITH NEGATIVE CONSTRAINTS

Abstract

Gaussian Process (GP) regression fits a curve to a set of datapairs, each pair consisting of an input point 'x' and its corresponding target regression value 'y(x)' (a positive datapair). But what if, for an input point 'x', we want to constrain the GP to avoid a target regression value 'ȳ(x)' (a negative datapair)? This requirement often arises in real-world navigation tasks, where an agent planning a trajectory wants to avoid obstacles, such as furniture items in a room. In this work, we propose to incorporate such negative constraints into a GP regression framework. Our approach, 'GP-NC' or Gaussian Process with Negative Constraints, fits the positive datapairs while avoiding the negative datapairs. Specifically, our key idea is to model each negative datapair as a small blob of Gaussian distribution and maximize its KL divergence from the GP. We jointly optimize the GP-NC over both the positive and negative datapairs. We empirically demonstrate that our GP-NC framework performs better than traditional GP learning, that it does not affect the scalability of Gaussian Process regression, and that it helps the model converge faster as the size of the data increases.

1. INTRODUCTION

Gaussian processes are among the most studied model classes for data-driven learning, as they are nonparametric, flexible function classes that require little prior knowledge of the process. Traditionally, GPs have found applications in various fields of research, including navigation systems (e.g., in Wiener and Kalman filters) (Jazwinski, 2007), geostatistics and meteorology (Kriging (Handcock & Stein, 1993)), and machine learning (Rasmussen, 2006). This wide range of applications can be attributed to the ability of GPs to model target uncertainty by providing a predictive variance over the target variable. Gaussian process regression in its current construct fits only a set of positive datapairs, each pair consisting of an input point and its desired target regression value, to learn a distribution over a function space. However, in some cases, additional information is available in the form of datapairs where, at a particular input point, we want the GP curve fit to avoid a range of regression values. We designate such data as negative datapairs. An illustration of where modeling such negative datapairs would be extremely beneficial is given in Fig 1. In Fig 1(b), an agent wants to model a trajectory that covers all the positive datapairs marked by 'x'. However, it is essential to note that the agent would run into an obstacle if it modeled its trajectory based only on the positive datapairs. This problem of navigating in the presence of obstacles can be handled in two ways: one way is to collect a high density of positive datapairs near the obstacle, and the other, more straightforward, approach is to simply mark the obstacle as a negative datapair. The former approach would unnecessarily increase the number of positive datapairs for the GP to regress, and hence may run into scalability issues.
In the latter approach, however, if the point is denoted as a negative datapair with a sphere of negative influence around it, as illustrated in Fig 1(c), the new trajectory can be modeled with far fewer datapairs while still accounting for all obstacles.
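The joint objective sketched above, standard GP regression on the positive datapairs combined with a repulsive KL term that pushes the posterior away from small Gaussian blobs placed at the negative datapairs, can be illustrated in code. The following is a minimal sketch, not the authors' implementation: the RBF kernel, the blob variance `blob_var`, the trade-off weight `lam`, and the noise level are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D point sets A and B."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X, y, X_star, noise=1e-2):
    """Standard GP posterior mean and variance at query points X_star."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_s = rbf_kernel(X, X_star)
    K_ss = rbf_kernel(X_star, X_star)
    K_inv = np.linalg.inv(K)
    mu = K_s.T @ K_inv @ y
    var = np.diag(K_ss - K_s.T @ K_inv @ K_s)
    return mu, var

def kl_gauss(mu1, var1, mu2, var2):
    """KL( N(mu1, var1) || N(mu2, var2) ) for univariate Gaussians."""
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def gp_nc_loss(X, y, X_neg, y_neg, blob_var=0.01, lam=1.0, noise=1e-2):
    """Negative log marginal likelihood on the positive datapairs, minus a
    weighted KL term between the negative 'blobs' N(y_neg, blob_var) and the
    GP posterior at X_neg. Minimizing this loss maximizes the KL term, i.e.
    repels the fitted curve from the negative datapairs."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    _, logdet = np.linalg.slogdet(K)
    nll = 0.5 * (y @ np.linalg.solve(K, y) + logdet + len(X) * np.log(2 * np.pi))
    mu_neg, var_neg = gp_predict(X, y, X_neg, noise)
    kl = kl_gauss(y_neg, blob_var, mu_neg, var_neg).sum()
    return nll - lam * kl
```

In an optimization loop, the kernel hyperparameters (and, in the navigation example, the trajectory itself) would be adjusted to minimize this loss, so each obstacle contributes a single negative datapair rather than a dense cloud of positive ones.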

