PHYSICS INFORMED DEEP KERNEL LEARNING

Abstract

Deep kernel learning is a promising combination of deep neural networks and nonparametric function estimation. However, as a data-driven approach, the performance of deep kernel learning can still be restricted by scarce or insufficient data, especially in extrapolation tasks. To address these limitations, we propose Physics Informed Deep Kernel Learning (PI-DKL), which exploits physics knowledge represented by differential equations with latent sources. Specifically, we use the posterior function sample of the Gaussian process as the surrogate for the solution of the differential equation, and construct a generative component to integrate the equation in a principled Bayesian hybrid framework. For efficient and effective inference, we marginalize out the latent variables in the joint probability and derive a simple model evidence lower bound (ELBO), based on which we develop a stochastic collapsed inference algorithm. Our ELBO can be viewed as an interpretable posterior regularization objective. On synthetic datasets and real-world applications, we show the advantage of our approach in both prediction accuracy and uncertainty quantification.

1. Introduction

Deep kernel learning (Wilson et al., 2016a) uses deep neural networks to construct kernels for nonparametric function estimation (e.g., Gaussian processes (Williams and Rasmussen, 2006)) and unifies the expressive power of neural networks with the self-adaptation of nonparametric function learning. Many applications have shown that deep kernel learning substantially outperforms conventional shallow kernel learning (e.g., with RBF kernels). Compared to standard neural networks, deep kernel learning enjoys closed-form posterior distributions and hence is more convenient for uncertainty quantification and reasoning, which is important for decision making. Nonetheless, as a data-driven approach, the performance of deep kernel learning can still be restricted by scarce data, especially when the training samples are insufficient to reflect the complexity of the system (that produced the data) or the test points are far away from the training set, i.e., extrapolation. On the other hand, physics knowledge, expressed as differential equations, is used to build physical models for various science and engineering applications (Lapidus and Pinder, 2011). These models are meant to characterize the underlying mechanism (i.e., physical processes) that drives the system (e.g., how heat diffuses across the spatial and temporal domains) and are much less restricted by data availability: they can make accurate predictions even without training data, e.g., the landing of Curiosity on Mars and the flight of Voyager 1. Therefore, we consider integrating physics knowledge into deep kernel learning to further improve its performance in prediction and uncertainty quantification, especially for scarce data and extrapolation tasks. Our work is inspired by the recent Physics Informed Neural Networks (PINNs) (Raissi et al., 2019). However, there are two substantial differences. First, PINNs require the form of the differential equations to be fully specified.
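To make the deep kernel construction concrete, the following is a minimal sketch (not the paper's implementation): a base RBF kernel is applied to the outputs of a neural-network feature map, and the resulting kernel is used for standard GP regression. The two-layer feature map, its fixed random weights, and the noise variance are all illustrative assumptions; in deep kernel learning the network weights are learned jointly with the kernel hyperparameters by maximizing the marginal likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer feature map phi(x) with fixed random weights;
# in practice these weights are trained jointly with the GP hyperparameters.
W1 = rng.normal(size=(1, 16)); b1 = rng.normal(size=16)
W2 = rng.normal(size=(16, 4)); b2 = rng.normal(size=4)

def phi(x):
    h = np.tanh(x @ W1 + b1)
    return np.tanh(h @ W2 + b2)

def deep_rbf_kernel(Xa, Xb, lengthscale=1.0):
    # Deep kernel: a base RBF kernel on neural-network features,
    # k(x, x') = exp(-||phi(x) - phi(x')||^2 / (2 * lengthscale^2)).
    Fa, Fb = phi(Xa), phi(Xb)
    sq = ((Fa[:, None, :] - Fb[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale**2)

# GP regression with the deep kernel (noise variance 1e-2 assumed).
X = np.linspace(-2, 2, 20)[:, None]
y = np.sin(3 * X[:, 0])
K = deep_rbf_kernel(X, X) + 1e-2 * np.eye(len(X))
alpha = np.linalg.solve(K, y)

Xs = np.array([[0.0], [1.0]])
mean = deep_rbf_kernel(Xs, X) @ alpha  # posterior predictive mean at Xs
```

The closed-form predictive mean (and, analogously, variance) is what makes uncertainty quantification convenient here compared to a plain neural network.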
We allow the equations to include unknown latent sources (functions), which is often the case in practice. Second, we integrate the differential equations in a principled Bayesian manner to pursue better-calibrated posterior estimates. Specifically, we use the posterior sample of the Gaussian process (GP), which is a random function, as the surrogate of the solution of the differential equation. We then apply the differential operators in the equation to obtain the sample of the latent source (function), for which we assign another GP prior. To ensure the sampling procedure is valid, we use the symmetry of the Gaussian distribution to sample a set of virtual observations {0}, which is computationally equivalent to placing a GP prior with zero mean function over the latent source. The sampling procedure constitutes a generative component and ties to the original deep kernel model in the Bayesian hybrid framework (Lasserre
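The surrogate step above can be sketched as follows. This is an illustrative toy, not the paper's algorithm: we assume a 1-D equation of the hypothetical form f''(x) = g(x), draw one function sample f from a GP with an RBF kernel (standing in for the posterior sample, with made-up hyperparameters), and apply the differential operator by central finite differences to obtain the induced sample of the latent source g, on which a second (zero-mean) GP prior would then be placed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Dense collocation grid on which the function sample is represented.
x = np.linspace(0.0, 1.0, 101)
h = x[1] - x[0]

# One GP function sample f via a Cholesky factor of an RBF kernel matrix
# (lengthscale 0.2 and the 1e-8 jitter are illustrative assumptions).
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2**2) + 1e-8 * np.eye(len(x))
f = np.linalg.cholesky(K) @ rng.normal(size=len(x))

# Apply the differential operator of the assumed equation f''(x) = g(x)
# with central finite differences: this yields the induced sample of the
# latent source g at the interior grid points.
g = (f[2:] - 2 * f[1:-1] + f[:-2]) / h**2
```

Because g is obtained deterministically from the sampled f, sampling f and applying the operator is one coherent generative draw, which is what allows the equation to be folded into the Bayesian hybrid framework.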

