EVOLVE SMOOTHLY, FIT CONSISTENTLY: LEARNING SMOOTH LATENT DYNAMICS FOR ADVECTION-DOMINATED SYSTEMS

Abstract

We present a data-driven, space-time-continuous framework for learning surrogate models of complex physical systems described by advection-dominated partial differential equations. Such systems have slowly decaying Kolmogorov n-widths, which hinder standard methods, including reduced-order modeling, from producing high-fidelity simulations at low cost. In this work, we construct hypernetwork-based latent dynamical models directly on the parameter space of a compact representation network. We leverage the expressive power of the network and a specially designed consistency-inducing regularization to obtain latent trajectories that are both low-dimensional and smooth. These properties make our surrogate models highly efficient at inference time. We demonstrate the efficacy of our framework on several challenging examples, learning models that generate accurate multi-step rollout predictions at much faster inference speeds than competing methods.

1. INTRODUCTION

High-fidelity numerical simulation of physical systems modeled by time-dependent partial differential equations (PDEs) has been at the center of many technological advances in the last century. However, for engineering applications such as design, control, optimization, data assimilation, and uncertainty quantification, which require repeated model evaluations over a potentially large number of parameters or initial conditions, high-fidelity simulations remain prohibitively expensive, even with state-of-the-art PDE solvers. The need to reduce the overall cost of such downstream applications has led to the development of surrogate models, which capture the core behavior of the target system at a fraction of the cost. One of the most popular frameworks for building such surrogates over the last decades (Aubry et al., 1988) has been reduced-order models (ROMs). In a nutshell, they construct lower-dimensional representations, together with corresponding reduced dynamics, that capture the system's behavior of interest. The computational gains then stem from evolving a lower-dimensional latent representation (see Benner et al. (2015) for a comprehensive review). However, classical ROM techniques often prove inadequate for advection-dominated systems, whose trajectories do not exhibit fast-decaying Kolmogorov n-widths (Pinkus, 2012); i.e., there is no well-approximating n-dimensional subspace with low n. This prevents projection-based ROM approaches from simultaneously achieving high accuracy and efficiency (Peherstorfer, 2020). Furthermore, many classical ROM methods require exact and complete knowledge of the underlying PDEs, a requirement that is unrealistic in most real-world applications. For example, in weather and climate modeling, many physical processes are unknown or unresolved and thus must be approximated; in other cases, the form of the PDE is simply unknown. Hence, the ability to learn ROMs directly from data is highly desirable.
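For reference, the Kolmogorov n-width measures how well a set of solutions can be approximated by the best n-dimensional linear subspace. Using standard notation (the symbols below follow the textbook definition, not a formula from this paper), for a solution set $\mathcal{M}$ in a normed space $\mathcal{H}$ it reads

```latex
d_n(\mathcal{M}) \;=\; \inf_{\substack{V_n \subset \mathcal{H} \\ \dim V_n = n}} \;\sup_{u \in \mathcal{M}} \;\inf_{v \in V_n} \; \| u - v \|_{\mathcal{H}} .
```

A fast-decaying $d_n$ means a small linear subspace suffices, which is exactly what advection-dominated systems lack.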
In particular, modern learning techniques offer the opportunity to use "black-box" neural networks to parameterize both the maps between the original space and the low-dimensional space and the dynamics within the low-dimensional space (Sanchez-Gonzalez et al., 2020; Chen et al., 2021; Kochkov et al., 2020). On the other hand, neural parameterizations can be so flexible that they are not necessarily grounded in physics. As a result, they do not extrapolate well to unseen conditions, or, in the case of modeling dynamical systems, they cannot evolve very far from their initial point before the trajectories diverge in a nonphysical fashion. How can we inject the right types and amounts of physical constraints into neural models? In this paper, we explore a hypernetwork-type surrogate model that leverages contemporary machine learning techniques while being constrained by physics-informed priors: (a) Time-Space Continuity: the trajectories of the model are continuous in time and space; (b) Causality: the present state of the model depends explicitly on its past states; (c) Mesh-Agnosticism: the trajectories of the model are independent of the discretization in both space and time, so they can be sampled at any space-time point; (d) Latent-Space Smoothness: if the trajectories of the model evolve smoothly, the same should be true of the latent-space trajectories. The first three properties are enforced explicitly by architectural choices, whereas the last is enforced implicitly during training via a consistency regularization. Concretely, we leverage the expressive power of neural networks (Hornik et al., 1990) to represent the state of the system through an ansatz (decoder) network that takes cues from pseudo-spectral methods (Boyd, 2001) and Neural Radiance Fields (Mildenhall et al., 2020), whose weights encode the latent space and are computed by an encoder-hypernetwork.
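The encoder-hypernetwork idea above can be illustrated with a minimal sketch: a coordinate-based decoder whose latent code *is* its own weight vector, produced from a sampled snapshot by an encoder. All names and sizes here (`H`, `N_GRID`, the fixed linear "encoder" `E`) are illustrative placeholders, not the trained architecture used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes -- not the paper's architecture.
H = 8                    # hidden width of the decoder MLP
N_THETA = 4 * H + 1      # number of decoder parameters = latent dimension
N_GRID = 32              # points at which a snapshot u(., t) is sampled

def decode(theta, xt):
    """Coordinate-based decoder: a one-hidden-layer MLP whose *weights*
    are the latent code theta. xt has shape (..., 2) holding (x, t)."""
    W1 = theta[: 2 * H].reshape(H, 2)
    b1 = theta[2 * H : 3 * H]
    w2 = theta[3 * H : 4 * H]
    b2 = theta[4 * H]
    h = np.tanh(xt @ W1.T + b1)   # (..., H) hidden features
    return h @ w2 + b2            # (...,) scalar field value

# Stand-in encoder-hypernetwork: a fixed random linear map from a sampled
# snapshot to the decoder weights; the paper trains this map end to end.
E = rng.normal(scale=0.1, size=(N_THETA, N_GRID))

def encode(u_snapshot):
    return E @ u_snapshot

# Usage: encode one snapshot, then query the decoded field anywhere --
# the reconstruction is mesh-agnostic in space and time.
x = np.linspace(0.0, 1.0, N_GRID)
u0 = np.sin(2 * np.pi * x)                   # a snapshot of the state
theta0 = encode(u0)                          # latent code = decoder weights
xt = np.stack([x, np.zeros_like(x)], -1)     # query points (x, t=0)
u_hat = decode(theta0, xt)                   # continuous field evaluation
```

Because the decoder takes raw space-time coordinates as input, it can be evaluated at resolutions never seen by the encoder, which is what makes the representation discretization-independent.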
This hypernetwork encoder-decoder combination is trained on trajectories of the system and, together with a consistency regularization, yields smooth latent-space trajectories. The smoothness of these trajectories makes learning the latent-space dynamics straightforward: we use a neural ODE (Chen et al., 2018) whose dynamics are governed by a multi-scale network, similar to a U-Net (Ronneberger et al., 2015), that directly extracts weight-, layer-, and graph-level features important for computing the system dynamics. This allows us to evolve the system entirely in latent space, decoding to ambient space only when required by the downstream application (this process is depicted in Fig. 1, left). In addition, owing to the smooth nature of the latent trajectories, we can take very large time steps when evolving the system, which provides remarkable computational gains, particularly compared to competing methods that do not regularize the smoothness of the latent space (see Fig. 1, right, for an illustration). The proposed framework is nonlinear and applicable to a wide range of systems, although in this paper we focus specifically on advection-dominated systems to demonstrate its effectiveness. The rest of the paper is organized as follows. We briefly review existing work on solving PDEs in §2. We describe our methodology in §3, focusing on how a consistency loss is used to learn smooth latent dynamics. We compare the proposed methodology to existing methods on several benchmark systems in §4 and conclude in §5.
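The two ingredients of this section, a smoothness-inducing penalty on latent trajectories and large explicit time steps through the latent ODE, can be sketched as follows. The curvature penalty and the toy rotation dynamics below are illustrative stand-ins chosen for clarity, not the paper's actual consistency loss or its learned U-Net vector field.

```python
import numpy as np

def consistency_loss(thetas, dt):
    """Penalize curvature of a latent trajectory theta(t), sampled at
    uniform spacing dt, via finite second differences. This is only a
    stand-in for the paper's consistency-inducing regularization."""
    d2 = (thetas[2:] - 2.0 * thetas[1:-1] + thetas[:-2]) / dt**2
    return np.mean(np.sum(d2**2, axis=-1))

def rk4_step(f, theta, dt):
    """One classical RK4 step of the latent ODE d theta / dt = f(theta).
    Smooth latent trajectories are what allow dt to be taken large."""
    k1 = f(theta)
    k2 = f(theta + 0.5 * dt * k1)
    k3 = f(theta + 0.5 * dt * k2)
    k4 = f(theta + dt * k3)
    return theta + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Toy latent dynamics (a planar rotation) in place of the learned
# multi-scale vector field from the paper.
A = np.array([[0.0, -1.0], [1.0, 0.0]])
f = lambda th: A @ th

# Roll out the latent state with a few *large* time steps.
dt = 0.5
theta = np.array([1.0, 0.0])
traj = [theta]
for _ in range(4):
    theta = rk4_step(f, theta, dt)
    traj.append(theta)
traj = np.stack(traj)               # (5, 2) latent trajectory
loss = consistency_loss(traj, dt)   # small for smooth trajectories
```

In training, a penalty like `consistency_loss` would be applied to the encoder outputs along each training trajectory, so that the encoder-decoder pair is steered toward latent paths the ODE solver can traverse with few, large steps.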



Figure 1: (left) Diagram of the latent-space evolution, where u represents the state variables and θ the latent-space variables, which in this case correspond to the weights of a neural network. (right) Sketch of two possible latent-space trajectories; we seek to implicitly regularize the encoder-decoder to obtain trajectories like the second one, which allows us to take very long time steps in latent space.

