NEURAL NETWORK APPROXIMATIONS OF PDES BEYOND LINEARITY: REPRESENTATIONAL PERSPECTIVE

Abstract

A burgeoning line of research has developed deep neural networks capable of approximating the solutions to high-dimensional PDEs, opening related lines of theoretical inquiry focused on explaining how it is that these models appear to evade the curse of dimensionality. However, most theoretical analyses thus far have been limited to simple linear PDEs. In this work, we take a step towards studying the representational power of neural networks for approximating solutions to nonlinear PDEs. We focus on a class of PDEs known as nonlinear variational elliptic PDEs, whose solutions minimize an Euler-Lagrange energy functional $\mathcal{E}(u) = \int_{\Omega} L(\nabla u)\,dx$. We show that if composing a function with Barron norm $b$ with $L$ produces a function of Barron norm at most $B_L b^p$, the solution to the PDE can be $\epsilon$-approximated in the $L^2$ sense by a function with Barron norm $O\left((dB_L)^{p \log(1/\epsilon)}\right)$. By a classical result due to Barron (1993), this correspondingly bounds the size of a 2-layer neural network needed to approximate the solution. Treating $p$, $\epsilon$, and $B_L$ as constants, this quantity is polynomial in the dimension, thus showing that neural networks can evade the curse of dimensionality. Our proof technique involves "neurally simulating" (preconditioned) gradient descent in an appropriate Hilbert space, which converges exponentially fast to the solution of the PDE, while bounding the increase of the Barron norm at each iterate. Our results subsume and substantially generalize analogous prior results for linear elliptic PDEs.
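To unpack the proof strategy sketched above, the following display is our schematic paraphrase (the precise preconditioner $P$, step size $\eta$, and constants are suppressed), not a verbatim statement from the paper. A minimizer $u^\star$ of $\mathcal{E}$ satisfies the Euler-Lagrange equation $-\nabla\cdot\big(\nabla L(\nabla u^\star)\big) = 0$ on $\Omega$ (subject to boundary conditions), and the argument tracks the iterates of (preconditioned) gradient descent on $\mathcal{E}$:
$$u_{t+1} = u_t - \eta\, P^{-1} \nabla \mathcal{E}(u_t), \qquad \nabla \mathcal{E}(u) = -\nabla\cdot\big(\nabla L(\nabla u)\big).$$
Linear (i.e., exponentially fast) convergence means $T = O(\log(1/\epsilon))$ iterations suffice, and each iteration increases the Barron norm by at most a factor polynomial in $d$ and $B_L$ (roughly $b \mapsto dB_L\, b^p$ under the composition assumption above), which compounds over the $T$ steps to the stated bound.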

1. INTRODUCTION

Scientific applications have become one of the new frontiers for deep learning (Jumper et al., 2021; Tunyasuvunakool et al., 2021; Sønderby et al., 2020). PDEs are one of the fundamental modeling tools in the sciences, and neural network-aided solvers, particularly in high dimensions, are in widespread use across many domains (Hsieh et al., 2019; Brandstetter et al., 2022). One of the most common approaches to solving PDEs with neural networks is to parameterize the solution as a neural network and minimize a loss which characterizes the solution (Sirignano & Spiliopoulos, 2018; E & Yu, 2017); a minimal sketch of this approach is given at the end of this section. The hope in doing so is to obtain a method that avoids the "curse of dimensionality", i.e., one whose cost scales less than exponentially with the ambient dimension. To date, neither theoretical analysis nor empirical application has yielded a precise characterization of the range of PDEs for which neural network-aided methods outperform classical ones. Active research on the empirical side (Han et al., 2018; E et al., 2017; Li et al., 2020a;b) has explored several families of PDEs, e.g., Hamilton-Jacobi-Bellman and Black-Scholes, where neural networks have been demonstrated to outperform classical grid-based methods. On the theory side, a recent line of works (Marwah et al., 2021; Chen et al., 2021; 2022) has considered the following fundamental question:

For what families of PDEs can the solution be represented by a small neural network?

The motivation for this question is computational: the complexity of fitting a neural network (by minimizing some objective) grows with its size. Specifically, these works focus on understanding when the approximating neural network can be sub-exponential in size, thus evading the curse of dimensionality.
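To make the loss-minimization approach referenced above concrete, the following is a minimal, hypothetical sketch (our illustration, not code from any of the cited works) of the variational approach of E & Yu (2017) for the energy functional from the abstract, instantiated with the Dirichlet-energy choice $L(g) = \frac{1}{2}|g|^2$ and a two-layer network. Boundary conditions, and the penalty term that would enforce them, are omitted for brevity.

# A minimal Deep-Ritz-style sketch (illustrative; names and choices are ours).
# Parameterize u by a two-layer network and minimize a Monte Carlo estimate of
# E(u) = \int_Omega L(grad u) dx on Omega = [0,1]^d, with L(g) = 0.5*|g|^2.
# Boundary conditions (and the penalty enforcing them) are omitted for brevity.
import jax
import jax.numpy as jnp

d = 8  # ambient dimension (hypothetical choice)

def init_params(key, width=64):
    k1, k2 = jax.random.split(key)
    return {"W": jax.random.normal(k1, (width, d)) / jnp.sqrt(d),
            "b": jnp.zeros(width),
            "a": jax.random.normal(k2, (width,)) / jnp.sqrt(width)}

def u(params, x):
    # two-layer (one-hidden-layer) ReLU network: the architecture to which
    # the Barron (1993) size bounds mentioned in the abstract apply
    return params["a"] @ jnp.maximum(params["W"] @ x + params["b"], 0.0)

def lagrangian(g):
    return 0.5 * jnp.sum(g ** 2)  # L(grad u); other convex L can be swapped in

def energy(params, xs):
    # Monte Carlo estimate of E(u) over uniform samples xs from Omega
    grads = jax.vmap(jax.grad(u, argnums=1), in_axes=(None, 0))(params, xs)
    return jnp.mean(jax.vmap(lagrangian)(grads))

params = init_params(jax.random.PRNGKey(0))
xs = jax.random.uniform(jax.random.PRNGKey(1), (1024, d))
step = jax.jit(jax.value_and_grad(energy))
lr = 1e-2
for t in range(200):  # plain gradient descent on the network parameters
    val, g = step(params, xs)
    params = jax.tree_util.tree_map(lambda p, dp: p - lr * dp, params, g)

Note that this sketch optimizes over network parameters; the representational question studied in this paper is the complementary one of whether a network of small size exists at all whose energy is near-optimal.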

