NEURAL TIME-DEPENDENT PARTIAL DIFFERENTIAL EQUATION

Abstract

Partial differential equations (PDEs) play a crucial role in studying a vast number of problems in science and engineering. Numerically solving nonlinear and/or high-dimensional PDEs is frequently a challenging task. Inspired by the traditional finite difference and finite element methods and by emerging advancements in machine learning, we propose a sequence-to-sequence (Seq2Seq) learning framework called Neural-PDE, which automatically learns the governing rules of any time-dependent PDE system from existing data by using a bidirectional LSTM encoder, and predicts the solutions over the next n time steps. One critical feature of our proposed framework is that the Neural-PDE is able to simultaneously learn and simulate all variables of interest in a PDE system. We test the Neural-PDE on a range of examples, from one-dimensional PDEs to a multi-dimensional, nonlinear complex fluids model. The results show that the Neural-PDE is capable of learning the initial conditions, boundary conditions, and differential operators defining the initial-boundary-value problem of a PDE system without knowledge of the specific form of the PDE system. In our experiments, the Neural-PDE efficiently extracts the dynamics within 20 training epochs and produces accurate predictions. Furthermore, unlike traditional machine learning approaches for learning PDEs, such as CNNs and MLPs, which require a large number of parameters for model precision, the Neural-PDE shares parameters among all time steps, which considerably reduces computational complexity and leads to a fast learning algorithm.

1. INTRODUCTION

The research of time-dependent partial differential equations (PDEs) is regarded as one of the most important disciplines in applied mathematics. PDEs appear ubiquitously in a broad spectrum of fields including physics, biology, chemistry, and finance, to name a few. Despite their fundamental importance, most PDEs cannot be solved analytically and must instead be solved numerically. Developing efficient and accurate numerical schemes for solving PDEs has therefore been an active research area over the past few decades (Courant et al., 1967; Osher & Sethian, 1988; LeVeque; Cockburn et al., 2012; Thomas, 2013; Johnson, 2012). Still, devising stable and accurate schemes with acceptable computational cost is a difficult task, especially when nonlinear and/or high-dimensional PDEs are considered. Additionally, PDE models arising in science and engineering disciplines usually require large amounts of empirical data for model calibration and validation, and determining the multidimensional parameters in such a PDE system poses another challenge (Peng et al., 2020).

Deep learning is considered the state-of-the-art tool for classification and prediction of nonlinear inputs, such as images, text, and speech (Litjens et al., 2017; Devlin et al., 2018; LeCun et al., 1998; Krizhevsky et al., 2012; Hinton et al., 2012). Recently, considerable efforts have been made to employ deep learning tools in designing data-driven methods for solving PDEs (Han et al., 2018; Long et al., 2018; Sirignano & Spiliopoulos, 2018; Raissi et al., 2019). Most of these approaches are based on fully connected neural networks (FCNNs), convolutional neural networks (CNNs), and multilayer perceptrons (MLPs). These network architectures usually require additional layers to improve predictive accuracy (Raissi et al., 2019), which leads to more complicated models with many more parameters. Recurrent neural networks (RNNs) are another class of neural network architectures.
RNNs predict the next time step's value by using input data from the current and previous states, and they share parameters across all inputs. This idea (Sherstinsky, 2020) of using current and previous states to compute the state at the next time step is not unique to RNNs; in fact, it is ubiquitous in numerical PDEs. Almost all time-stepping methods for time-dependent PDEs, such as Euler's method, Crank-Nicolson, high-order Taylor methods, and their Runge-Kutta variants (Ascher et al., 1997), update the numerical solution by utilizing solutions from previous steps. This motivates us to ask what would happen if we replace the previous-step data in a neural network with numerical solution data of a PDE supported on grids. It is possible that the neural network behaves like a time-stepping method; for example, forward Euler's method yields the numerical solution at a new time point as the output of the current state (Chen et al., 2018). Since the numerical solution at each grid point (for finite difference) or grid cell (for finite element), computed at a set of contiguous time points, can be treated as one time sequence of neural network input data, a deep learning framework with a bidirectional structure (Huang et al., 2015; Schuster & Paliwal, 1997) can be trained to predict any time-dependent PDE from time series data supported on some grids. In other words, the supervised training process can be regarded as the deep learning framework learning the numerical solution from the input data by learning the coefficients of the neural network layers.

Long Short-Term Memory (LSTM) (Hochreiter & Schmidhuber, 1997) is a neural network architecture built upon RNNs. Unlike vanilla RNNs, which suffer from losing long-term information and from vanishing or exploding gradients, LSTM has a specifically designed memory cell with a set of new gates, such as the input gate and forget gate.
Equipped with these new gates, which control when information is preserved and passed on, LSTM is capable of learning long-term dependencies without the danger of vanishing or exploding gradients. Over the past two decades, LSTM has been widely used in natural language processing (NLP), for tasks such as machine translation, dialogue systems, and question answering (Lipton et al., 2015).

Inspired by numerical PDE schemes and the LSTM network, we propose a new deep learning framework, denoted Neural-PDE. It simulates multi-dimensional governing laws, represented by time-dependent PDEs, from time series data generated on some grids, and predicts the data at the next n time steps. The Neural-PDE intelligently processes related data from all spatial grids by using a bidirectional (Schuster & Paliwal, 1997) neural network, thereby ensuring the accuracy of the numerical solution and the feasibility of learning any time-dependent PDE. The detailed structure of the Neural-PDE and the data normalization are introduced in Section 3.

The rest of the paper is organized as follows. Section 2 briefly reviews the finite difference method for solving PDEs. Section 3 contains a detailed description of the design of the Neural-PDE. In Section 4 and Appendix A, we apply the Neural-PDE to four different PDEs, including the one-dimensional (1D) wave equation, the two-dimensional (2D) heat equation, and two systems of PDEs: the inviscid Burgers' equations and a coupled Navier-Stokes-Cahn-Hilliard system, which appears widely in multiscale modeling of complex fluid systems. We demonstrate the robustness of the Neural-PDE, which achieves convergence within 20 epochs with an admissible mean squared error, even when Gaussian noise is added to the input data.
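The gating mechanism described above can be sketched in a few lines of NumPy. This is an illustrative single-cell sketch only; the parameter names and toy dimensions are our own choices and do not correspond to the Neural-PDE implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b hold the stacked parameters of the
    forget (f), input (i), candidate (g), and output (o) gates."""
    z = W @ x + U @ h_prev + b        # pre-activations, shape (4*hidden,)
    H = h_prev.size
    f = sigmoid(z[0:H])               # forget gate: what to keep from c_prev
    i = sigmoid(z[H:2*H])             # input gate: what to write
    g = np.tanh(z[2*H:3*H])           # candidate cell update
    o = sigmoid(z[3*H:4*H])           # output gate: what to expose
    c = f * c_prev + i * g            # memory cell carries long-term state
    h = o * np.tanh(c)                # hidden state passed to the next step
    return h, c

# Toy dimensions: input size 3, hidden size 4; random parameters.
rng = np.random.default_rng(0)
D, H = 3, 4
W = 0.1 * rng.standard_normal((4 * H, D))
U = 0.1 * rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):                    # unroll over a short input sequence
    h, c = lstm_step(rng.standard_normal(D), h, c, W, U, b)
```

The multiplicative forget gate is what lets the cell state c pass through many steps largely unchanged, which is the mechanism behind the long-term-dependency claim above.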

2. PRELIMINARIES

2.1 TIME-DEPENDENT PARTIAL DIFFERENTIAL EQUATIONS

A time-dependent partial differential equation is an equation of the form:

u_t = f\left(x_1, \cdots, x_n, u, \frac{\partial u}{\partial x_1}, \cdots, \frac{\partial u}{\partial x_n}, \frac{\partial^2 u}{\partial x_1 \partial x_1}, \cdots, \frac{\partial^2 u}{\partial x_1 \partial x_n}, \cdots, \frac{\partial^n u}{\partial x_1 \cdots \partial x_n}\right), \quad (2.1.1)

where u = u(t, x_1, \ldots, x_n) is the unknown function, x_i \in \mathbb{R} are spatial variables, and f is an operator acting on u and its partial derivatives. For example, consider the parabolic heat equation u_t = \alpha^2 \Delta u, where u represents the temperature and f is the Laplacian operator \Delta scaled by \alpha^2. Eq. (2.1.1) can be solved by finite difference methods, which we briefly review below to keep the paper self-contained.

