A STABLE AND SCALABLE METHOD FOR SOLVING INITIAL VALUE PDES WITH NEURAL NETWORKS

Abstract

Unlike conventional grid- and mesh-based methods for solving partial differential equations (PDEs), neural networks have the potential to break the curse of dimensionality, providing approximate solutions to problems where using classical solvers is difficult or impossible. While global minimization of the PDE residual over the network parameters works well for boundary value problems, catastrophic forgetting impairs its applicability to initial value problems (IVPs). In an alternative local-in-time approach, the optimization problem can be converted into an ordinary differential equation (ODE) on the network parameters and the solution propagated forward in time; however, we demonstrate that current methods based on this approach suffer from two key issues. First, following the ODE produces an uncontrolled growth in the conditioning of the problem, ultimately leading to unacceptably large numerical errors. Second, as the ODE methods scale cubically with the number of model parameters, they are restricted to small neural networks, significantly limiting their ability to represent intricate PDE initial conditions and solutions. Building on these insights, we develop Neural-IVP, an ODE-based IVP solver which prevents the network from becoming ill-conditioned and runs in time linear in the number of parameters, enabling us to evolve the dynamics of challenging PDEs with neural networks.

1. INTRODUCTION

Partial differential equations (PDEs) are needed to describe many phenomena in the natural sciences. PDEs that model complex phenomena cannot be solved analytically, and many numerical techniques are used to compute their solutions. Classical techniques such as finite differences rely on grids and provide efficient and accurate solutions when the dimensionality is low (d = 1, 2). Yet, the computational and memory costs of using grids or meshes scale exponentially with the dimension, making it extremely challenging to solve PDEs accurately in more than 3 dimensions. Neural networks have shown considerable success in modeling and reconstructing functions on high-dimensional structured data such as images or text, but also for unstructured tabular data and spatial functions. Neural networks sidestep the "curse of dimensionality" by learning representations of the data that enable them to perform efficiently. In this respect, neural networks have similar benefits and drawbacks as Monte Carlo methods. The approximation error ϵ converges at a rate ϵ ∝ 1/√n from statistical fluctuations, where n is the number of data points or Monte Carlo samples. Expressed inversely, we would need n ∝ e^{2 log(1/ϵ)} = ϵ^{-2} samples to get error ϵ: compute grows exponentially in the number of significant digits rather than exponentially in the dimension, as it does for grids. For many problems this tradeoff is favorable, and an approximate solution is much better than no solution. Thus, it is natural to consider neural networks for solving PDEs whose dimensionality makes standard approaches intractable. While first investigated in Dissanayake & Phan-Thien (1994) and Lagaris et al. (1998), recent developments by Yu et al. (2018) and Sirignano & Spiliopoulos (2018) have shown that neural networks can successfully approximate the solution by forcing them to satisfy the dynamics of the PDE on collocation points in the spatio-temporal domain.
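To make the collocation idea concrete, the following is a minimal sketch (not the paper's implementation) of penalizing the PDE residual of a small network at randomly sampled collocation points, using JAX for automatic differentiation. The 1D heat equation u_t = u_xx is an illustrative choice of PDE, and all network sizes and sampling choices here are arbitrary assumptions.

```python
import jax
import jax.numpy as jnp

def mlp(params, t, x):
    # Tiny MLP approximating the PDE solution u(t, x).
    h = jnp.array([t, x])
    for W, b in params[:-1]:
        h = jnp.tanh(W @ h + b)
    W, b = params[-1]
    return (W @ h + b)[0]

def residual(params, t, x):
    # Heat-equation residual u_t - u_xx at a single collocation point.
    u_t = jax.grad(mlp, argnums=1)(params, t, x)
    u_xx = jax.grad(jax.grad(mlp, argnums=2), argnums=2)(params, t, x)
    return u_t - u_xx

def loss(params, ts, xs):
    # Mean squared residual over a batch of collocation points.
    r = jax.vmap(residual, in_axes=(None, 0, 0))(params, ts, xs)
    return jnp.mean(r ** 2)

key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
params = [
    (0.5 * jax.random.normal(k1, (16, 2)), jnp.zeros(16)),
    (0.5 * jax.random.normal(k2, (1, 16)), jnp.zeros(1)),
]
# Collocation points sampled uniformly in the spatio-temporal domain.
ts = jax.random.uniform(k3, (64,))
xs = jax.random.uniform(k4, (64,))
grads = jax.grad(loss)(params, ts, xs)  # gradient for one optimizer step
```

In practice, boundary and initial conditions are enforced with additional penalty terms or by construction; minimizing this residual globally over space and time is the approach whose limitations for initial value problems the paper examines.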
In particular, global collocation approaches have proven effective for solving boundary value problems, where the neural network can successfully approximate the solution. However, for initial value problems

* Equal contribution, order chosen by random coin flip. {maf820, ap6604}@nyu.edu

