MULTISCALE NEURAL OPERATOR: LEARNING FAST AND GRID-INDEPENDENT PDE SOLVERS

Abstract

Numerical simulations in climate, chemistry, or astrophysics are computationally too expensive for uncertainty quantification or parameter exploration at high resolution. Reduced-order or surrogate models are multiple orders of magnitude faster, but traditional surrogates are inflexible or inaccurate and pure machine learning (ML)-based surrogates are too data-hungry. We propose a hybrid, flexible surrogate model that exploits known physics for simulating large-scale dynamics and limits learning to the hard-to-model term, called parametrization or closure, which captures the effect of fine-scale onto large-scale dynamics. Leveraging neural operators, we are the first to learn grid-independent, non-local, and flexible parametrizations. Our multiscale neural operator is motivated by a rich literature in multiscale modeling, has quasilinear runtime complexity, is more accurate or flexible than state-of-the-art parametrizations, and is demonstrated on the chaotic multiscale Lorenz96 equation.

1. INTRODUCTION

Climate change increases the likelihood of storms, floods, wildfires, heat waves, biodiversity loss, and air pollution (IPCC, 2018). Decision-makers rely on climate models to understand and plan for changes in climate, but current climate models are computationally too expensive: as a result, they are hard to access, cannot predict local changes (< 10 km), fail to resolve local extremes (e.g., rainfall), and do not reliably quantify uncertainties (Palmer et al., 2019). For example, running a global climate model at 1 km resolution can take ten days on a supercomputer with 4888 GPU nodes, consuming the same electricity as a coal power plant generates in one hour (Fuhrer et al., 2018). Similarly, in molecular dynamics (Batzner et al., 2022), chemistry (Behler, 2011), biology (Yazdani et al., 2020), energy (Zhang et al., 2019), astrophysics, or fluids (Duraisamy et al., 2019), scientific progress is hindered by the computational cost of solving partial differential equations (PDEs) at high resolution (Karniadakis et al., 2021). We propose the first PDE surrogate that quickly computes approximate solutions by correcting known large-scale simulations with learned, grid-independent, non-local parametrizations.
Surrogate models are fast, reduced-order, and lightweight copies of numerical simulations (Quarteroni & Rozza, 2014) and of significant interest in physics-informed machine learning (Kashinath et al., 2021; Reichstein et al., 2019; Karpatne et al., 2019; Ganguly et al., 2014). Machine learning (ML)-based surrogates have simulated PDEs up to 1-3 orders of magnitude faster than traditional numerical solvers and are more flexible and accurate than traditional surrogate models (Karniadakis et al., 2021). However, pure ML-based surrogates are too data-hungry (Rasp et al., 2020); hence, hybrid ML-physics models have been created, for example, by incorporating known symmetries (Bronstein et al., 2021; Batzner et al., 2022) or equations (Willard et al., 2022). Most hybrid models represent the solution at the highest possible resolution, which becomes computationally infeasible in multiscale or very high-resolution physics, even at optimal runtime (Pavliotis & Stuart, 2008; Peng et al., 2021). As depicted in Figs. 1 and 2, we simulate multiscale physics by running easy-to-access large-scale models and focusing learning on the challenging task: how can we model the influence of fine-scale onto large-scale dynamics, i.e., what is the subgrid parametrization term? The lack of accuracy in current subgrid parametrizations, also called closure or residual terms, is one of the major sources of uncertainty in multiscale systems, such as turbulence or climate (Palmer et al., 2019; Gentine et al., 2018). Learning subgrid parametrizations can be combined with incorporating equations as soft (Raissi et al., 2019) or hard (Beucler et al., 2021a) constraints. Various works learn subgrid parametrizations, but they are either inaccurate, hard to share, or inflexible because they are local (Gentine et al., 2018), grid-dependent (Lapeyre et al., 2019), or domain-specific (Behler J, 2007), respectively, as detailed in Section 2.
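To make the subgrid-parametrization setup concrete, the hybrid idea can be sketched as a known coarse-scale tendency plus a learned closure term. The function names and the toy damping/quadratic terms below are illustrative placeholders of our own, not the paper's actual solver or neural operator:

```python
import numpy as np

def step_hybrid(x, coarse_tendency, closure, dt=0.01):
    """One forward-Euler step of a hybrid surrogate: known coarse physics
    plus a learned subgrid parametrization. Both callables are hypothetical
    stand-ins for a large-scale solver and a trained neural operator."""
    return x + dt * (coarse_tendency(x) + closure(x))

# Stand-in components: linear damping as the "known physics", and a fitted
# quadratic playing the role of the learned closure term.
coarse = lambda x: -x
learned_closure = lambda x: 0.1 * x**2  # placeholder for a neural operator

x0 = np.array([1.0, 2.0])
x1 = step_hybrid(x0, coarse, learned_closure)
```

The design point is that only `learned_closure` needs training data; the coarse tendency reuses known equations, which is what limits the data hunger of the hybrid surrogate.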
We are the first to formulate the parametrization problem as learning neural operators (Anandkumar et al., 2020) to represent non-local, flexible, and grid-independent parametrizations. We propose multiscale neural operator (MNO), a novel learning-based PDE surrogate for multiscale physics, with the key contributions:

• A learning-based multiscale PDE surrogate that has quasilinear runtime complexity, leverages known large-scale physics, is grid-independent and flexible, and does not require autodifferentiable solvers.
• The first surrogate to approximate grid-independent, non-local parametrizations via neural operators.
• Demonstration of the surrogate on a chaotic, coupled, multiscale PDE: multiscale Lorenz96.
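The multiscale Lorenz96 benchmark couples a ring of slow variables X_k to fast variables Y_{j,k}. A minimal sketch of the standard two-scale tendencies (parameter values are common illustrative defaults, not necessarily those used in our experiments) could look like:

```python
import numpy as np

def lorenz96_2scale_tendencies(X, Y, F=10.0, h=1.0, b=10.0, c=10.0):
    """Tendencies of the standard two-scale Lorenz96 system.

    X: (K,) slow variables; Y: (J, K) fast variables, column k coupled to X_k.
    h, b, c are the usual coupling, spatial-scale, and time-scale parameters.
    """
    J = Y.shape[0]
    # Slow scale: advection, damping, forcing, and coupling to the fast scale.
    dX = (np.roll(X, 1) * (np.roll(X, -1) - np.roll(X, 2))
          - X + F - (h * c / b) * Y.sum(axis=0))
    # Fast scale: flatten so fast variables of each sector k are contiguous,
    # forming one ring of J*K variables.
    Yf = Y.flatten(order="F")
    dYf = (c * b * np.roll(Yf, -1) * (np.roll(Yf, 1) - np.roll(Yf, -2))
           - c * Yf + (h * c / b) * np.repeat(X, J))
    return dX, dYf.reshape(Y.shape, order="F")
```

With Y = 0 this reduces to the classic single-scale Lorenz96 tendency, which is a convenient sanity check when wiring up the benchmark.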

2. RELATED WORK

We embed our work in the broader field of physics-informed machine learning and surrogate modeling. We propose the first surrogate that corrects a coarse-grained simulation via learned, grid-independent, non-local parametrizations.

Direct numerical simulation. Despite significant progress in simulating physics numerically, it remains prohibitively expensive to repeatedly solve high-dimensional partial differential equations (PDEs) (Karniadakis et al., 2021). For example, finite difference, element, volume, and (pseudo-)spectral methods have to be re-run for every choice of initial or boundary condition, grid, or parameters (Farlow, 1993; Boyd, 2013). The issue arises if the chosen method does not have optimal runtime, i.e., does not scale linearly with the number of grid points, which renders it infeasibly expensive for calculating ensembles (Boyd, 2013). Select methods have optimal or close-to-optimal runtime, e.g., quasilinear O(N log N), and outperform machine learning-based methods in runtime and accuracy, but their implementation often requires significant problem-specific adaptations; for example, multigrid (Briggs et al., 2000) or spectral methods (Boyd, 2013). We acknowledge the existence of impressive research directions towards optimal and flexible non-ML solvers, such as the spectral solver Dedalus (Burns et al., 2020), but advocate to simultaneously explore easy-to-adapt ML methods to create fast, accurate, and flexible surrogate models.

Surrogate modeling. Surrogate models are approximations, lightweight copies, or reduced-order models of PDE solutions, often fit to data, and used for parameter exploration or uncertainty quantification (Smith, 2013; Quarteroni & Rozza, 2014). Surrogate models via SVD/POD (Chatterjee, 2000), eigendecompositions/KLE (Fukunaga & Koontz, 1970), Koopman



Figure 1: Multiscale neural operator (MNO). Explicitly modeling all scales of Earth's weather is too expensive for traditional and learning-based solvers (Palmer et al., 2019). MNO dramatically reduces the computational cost by modeling the large scale explicitly and learning the effect of fine-scale onto large-scale dynamics, such as turbulence slowing down a river stream. We embed a grid-independent neural operator in the large-scale physical simulations as a "parametrization", conceptually similar to stacking dolls (Snagglebit, 2022).

