LEARNING DIFFERENTIABLE SOLVERS FOR SYSTEMS WITH HARD CONSTRAINTS

Abstract

We introduce a practical method to enforce partial differential equation (PDE) constraints for functions defined by neural networks (NNs), with a high degree of accuracy and up to a desired tolerance. We develop a differentiable PDE-constrained layer that can be incorporated into any NN architecture. Our method leverages differentiable optimization and the implicit function theorem to effectively enforce physical constraints. Inspired by dictionary learning, our model learns a family of functions, each of which defines a mapping from PDE parameters to PDE solutions. At inference time, the model finds an optimal linear combination of the functions in the learned family by solving a PDE-constrained optimization problem. Our method provides continuous solutions over the domain of interest that accurately satisfy the desired physical constraints. Our results show that incorporating hard constraints directly into the NN architecture achieves much lower test error than training on an unconstrained objective.

1. INTRODUCTION

Methods based on neural networks (NNs) have shown promise in recent years for physics-based problems (Raissi et al., 2019; Li et al., 2020; Lu et al., 2021a; Li et al., 2021). Consider a parameterized partial differential equation (PDE), $F_\phi(u) = 0$, where $F_\phi$ is a differential operator, and the PDE parameters $\phi$ and solution $u$ are functions over a domain $\mathcal{X}$. Let $\Phi$ be a distribution of PDE-parameter functions $\phi$. The goal is to train a NN with parameters $\theta \in \mathbb{R}^p$ to solve the following feasibility problem, i.e., find $\theta$ such that, for all functions $\phi$ sampled from $\Phi$,
$$F_\phi(u_\theta(\phi)) = 0. \tag{1}$$
Training such a model requires solving highly nonlinear feasibility problems in the NN parameter space, even when $F_\phi$ describes a linear PDE.

Current NN methods use two main training approaches to solve Equation 1. The first approach is strictly supervised learning, in which the NN is trained on PDE solution data using a regression loss (Lu et al., 2021a; Li et al., 2020). In this case, the feasibility problem only appears through the data; it does not appear explicitly in the training algorithm. The second approach (Raissi et al., 2019) aims to solve the feasibility problem in Equation 1 by considering the relaxation
$$\min_\theta \; \mathbb{E}_{\phi \sim \Phi} \left\| F_\phi(u_\theta(\phi)) \right\|_2^2. \tag{2}$$
This second approach does not require access to any PDE solution data. The two approaches have also been combined by using both a data-fitting loss and the PDE residual loss (Li et al., 2021).

However, both of these approaches come with major challenges. The first approach requires potentially large amounts of PDE solution data, which may need to be generated through expensive numerical simulations or experimental procedures. It can also be challenging to generalize outside the training data, as there is no guarantee that the NN model has learned the relevant physics.
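To make the residual-loss relaxation in Equation 2 concrete, the following is a minimal, self-contained sketch (not the paper's code; all names are illustrative). It combines the two ideas above on a toy 1D Poisson problem $u''(x) = f(x)$ with $u(0) = u(1) = 0$: the solution is represented as a linear combination of fixed basis functions, and minimizing the squared PDE residual over the combination weights reduces to a linear least-squares problem.

```python
import numpy as np

# Toy 1D Poisson problem: u''(x) = f(x) on [0, 1], u(0) = u(1) = 0.
# We represent u(x) = sum_k w_k * b_k(x) with a fixed sine basis; minimizing
# the squared PDE residual (cf. Equation 2) over the weights w is then a
# linear least-squares problem. (Illustrative sketch, not the paper's method.)
n, K = 101, 5
x = np.linspace(0.0, 1.0, n)
f = -np.pi**2 * np.sin(np.pi * x)  # source term; exact solution is sin(pi*x)

# Basis functions b_k(x) = sin(k*pi*x) satisfy the boundary conditions exactly,
# so the boundary constraints are "hard" by construction.
B = np.stack([np.sin(k * np.pi * x) for k in range(1, K + 1)], axis=1)
# Their second derivatives are available in closed form.
B_xx = np.stack([-(k * np.pi) ** 2 * np.sin(k * np.pi * x)
                 for k in range(1, K + 1)], axis=1)

# Minimize ||B_xx @ w - f||_2^2, the discretized PDE residual loss.
w, *_ = np.linalg.lstsq(B_xx, f, rcond=None)
u = B @ w

max_err = np.max(np.abs(u - np.sin(np.pi * x)))
print(max_err)  # close to machine precision for this toy problem
```

With a nonlinear NN parameterization $u_\theta$, the same residual loss becomes the nonconvex objective in Equation 2; the linear-combination structure here mirrors the dictionary-learning view used at inference time, where only the combination weights are optimized subject to the PDE constraint.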
For the second approach, recent work has highlighted that in the context of scientific modeling, the relaxed feasibility problem in Equation 2 is a difficult optimization problem (Krishnapriyan et al.,

