HYPERDEEPONET: LEARNING OPERATOR WITH COMPLEX TARGET FUNCTION SPACE USING THE LIMITED RESOURCES VIA HYPERNETWORK

Abstract

Fast and accurate predictions of complex physical dynamics are a significant challenge across many applications, and real-time prediction on resource-constrained hardware is even more demanding in real-world settings. The deep operator network (DeepONet) has recently been proposed as a framework for learning nonlinear mappings between function spaces. However, DeepONet requires many parameters and incurs a high computational cost when learning operators, particularly those with complex (discontinuous or non-smooth) target functions. This study proposes HyperDeepONet, which uses the expressive power of a hypernetwork to learn a complex operator with a smaller set of parameters. DeepONet and its variants can be regarded as methods of injecting input-function information into the target function; from this perspective, they are particular cases of HyperDeepONet. We analyze the complexity of DeepONet and conclude that HyperDeepONet requires relatively lower complexity to reach the desired accuracy for operator learning. HyperDeepONet successfully learned various operators with fewer computational resources than other benchmarks.

1. INTRODUCTION

Operator learning, i.e., learning mappings between infinite-dimensional function spaces, is a challenging problem. It has been used in many applications, such as climate prediction (Kurth et al., 2022) and fluid dynamics (Guo et al., 2016). The computational efficiency of learning such mappings remains important in real-world problems. For complicated dynamical systems, the target function of the operator can be discontinuous or sharp. In this case, balancing model complexity against computational cost is a core problem for real-time prediction on resource-constrained hardware (Choudhary et al., 2020; Murshed et al., 2021).

Many machine learning methods and deep learning-based architectures have been developed to learn a nonlinear mapping from one infinite-dimensional Banach space to another. They focus on learning the solution operator of certain partial differential equations (PDEs), e.g., mapping the initial or boundary condition of a PDE to the corresponding solution. Anandkumar et al. (2019) proposed an iterative neural operator scheme to learn the solution operator of PDEs. Simultaneously, Lu et al. (2019; 2021) proposed the deep operator network (DeepONet) architecture based on the universal operator approximation theorem of Chen & Chen (1995). DeepONet consists of two networks: a branch net, which takes an input function sampled at fixed finite locations, and a trunk net, which takes a query location in the output function's domain. Each network produces p outputs, which are combined via a linear combination (inner product) to approximate the underlying operator: the branch net supplies the coefficients (p-coefficients) and the trunk net supplies the basis functions (p-basis) of the target function.
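The branch/trunk decomposition above can be sketched in a few lines. The following is a minimal, illustrative implementation with randomly initialized (untrained) networks; the sensor count m, width p, and the two-layer MLP shape are assumptions chosen for brevity, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(widths):
    """Random-weight MLP with tanh hidden activations (untrained, for illustration)."""
    Ws = [rng.standard_normal((m, n)) / np.sqrt(m)
          for m, n in zip(widths[:-1], widths[1:])]
    def f(x):
        for W in Ws[:-1]:
            x = np.tanh(x @ W)
        return x @ Ws[-1]  # linear output layer
    return f

m, p = 100, 16            # m fixed sensor locations, p shared outputs (assumed sizes)
branch = mlp([m, 64, p])  # branch net: input function values at the m sensors
trunk  = mlp([1, 64, p])  # trunk net: a query location y in the output domain

def deeponet(u_sensors, ys):
    """G(u)(y) ~= sum_k b_k(u) * t_k(y): branch gives p coefficients, trunk gives p basis values."""
    b = branch(u_sensors[None, :])   # (1, p) coefficients from the input function
    t = trunk(np.atleast_2d(ys))     # (N, p) basis values at N query locations
    return t @ b.T                   # (N, 1) approximation of G(u) at the queries

u = np.sin(np.linspace(0, 2 * np.pi, m))  # example input function, sampled at sensors
ys = np.linspace(0.0, 1.0, 5)[:, None]    # 5 query locations
out = deeponet(u, ys)
print(out.shape)  # (5, 1)
```

The key structural point is that the branch output depends only on the input function u and the trunk output only on the query y; the operator value couples them solely through the final inner product.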

