BI-LEVEL PHYSICS-INFORMED NEURAL NETWORKS FOR PDE CONSTRAINED OPTIMIZATION USING BROYDEN'S HYPERGRADIENTS

Abstract

Deep-learning-based approaches such as physics-informed neural networks (PINNs) and DeepONets have shown promise for solving PDE-constrained optimization (PDECO) problems. However, existing methods are insufficient to handle PDE constraints that have a complicated or nonlinear dependency on the optimization targets. In this paper, we present a novel bi-level optimization framework that resolves this challenge by decoupling the optimization of the targets from that of the constraints. In the inner loop, we adopt PINNs to solve the PDE constraints only. For the outer loop, we design a novel method based on Broyden's method and the Implicit Function Theorem (IFT), which approximates hypergradients efficiently and accurately. We further present theoretical explanations and an error analysis of the hypergradient computation. Extensive experiments on multiple large-scale and nonlinear PDE-constrained optimization problems demonstrate that our method achieves state-of-the-art results compared with strong baselines.

1. INTRODUCTION

PDE-constrained optimization (PDECO) aims at optimizing the performance of a physical system, constrained by partial differential equations (PDEs), toward desired properties. It is a fundamental task in numerous areas of science (Chakrabarty & Hanson, 2005; Ng & Dubljevic, 2012) and engineering (Hicks & Henne, 1978; Chen et al., 2009), with a wide range of important applications including image denoising in computer vision (De los Reyes & Schönlieb, 2013), design of aircraft wings in aerodynamics (Hicks & Henne, 1978), and drug delivery in biology (Chakrabarty & Hanson, 2005). These problems pose numerous inherent challenges due to the diversity and complexity of the physical constraints and practical settings. Traditional numerical methods such as adjoint methods (Herzog & Kunisch, 2010) based on finite element methods (FEMs) (Zienkiewicz et al., 2005) have been studied for decades. They can be divided into continuous and discretized adjoint methods (Mitusch et al., 2019); the former requires complex handcrafted derivation of adjoint PDEs, while the latter is more flexible and more frequently used. However, the computational cost of FEMs grows quadratically to cubically (Xue et al., 2020) with respect to the mesh size. Thus, compared with other constrained optimization problems, it is much more expensive or even intractable to solve high-dimensional PDECO problems with a large search space or mesh size. To mitigate this problem, neural-network methods such as DeepONet (Lu et al., 2019) have recently been proposed as surrogate models for FEMs. DeepONet learns a mapping from control (decision) variables to the solutions of the PDEs and then replaces the PDE constraints with the operator network. But these methods require pretraining a large operator network, which is non-trivial and inefficient. Moreover, their performance may deteriorate if the optimal solution is out of the training distribution (Lanthaler et al., 2022).
Another line of neural methods (Lu et al., 2021; Mowlavi & Nabi, 2021) proposes to use a single PINN (Raissi et al., 2019) to solve the PDECO problem instead of pretraining an operator network. It uses the method of Lagrange multipliers to treat the PDE constraints as regularization terms and thus optimizes the objective and the PDE loss simultaneously. However, such methods introduce a trade-off between the optimization targets and the regularization terms (i.e., the PDE losses) that is crucial for performance (Nandwani et al., 2019). It is generally non-trivial to set proper weights for balancing these terms due to the lack of theoretical guidance, and existing heuristic approaches for selecting the weights often yield an unstable training process. Therefore, it is imperative to develop an effective strategy for handling PDE constraints when solving PDECO problems.

To address the aforementioned challenges, we propose a novel bi-level optimization framework named Bi-level Physics-informed Neural networks with Broyden's hypergradients (BPN) for solving PDE-constrained optimization problems. Specifically, we first present a bi-level formulation of the PDECO problem, which decouples the optimization of the targets from the PDE constraints, thereby naturally addressing the challenge of loss balancing in regularization-based methods. To solve the bi-level optimization problem, we develop an iterative method that optimizes PINNs with the PDE constraints in the inner loop while optimizing the control variables for the objective function in the outer loop using hypergradients. In general, it is non-trivial to compute hypergradients for the control variables in bi-level optimization, especially when the inner-loop optimization is complicated (Lorraine et al., 2020). To address this issue, we further propose a novel strategy based on implicit differentiation using Broyden's method, a scalable and efficient quasi-Newton method in practice (Kelley, 1995; Bai et al., 2020).
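The implicit-differentiation step above reduces to solving a linear system involving the Hessian of the inner loss, which Broyden's method can do using only matrix-vector interactions. Below is a minimal NumPy sketch of the "good" Broyden root-finding iteration applied to such a system; the matrix H and vector g are illustrative stand-ins for the PINN's inner-loss Hessian and the outer-loss gradient, not the paper's actual implementation.

```python
import numpy as np

def broyden_solve(f, x0, max_iter=50, tol=1e-10):
    """Find x with f(x) = 0 via the 'good' Broyden method.

    Maintains an approximation J_inv of the inverse Jacobian via rank-1
    updates, so no Jacobian is ever formed or factorized explicitly.
    """
    x = x0.copy()
    fx = f(x)
    J_inv = np.eye(len(x))              # initial inverse-Jacobian guess
    for _ in range(max_iter):
        if np.linalg.norm(fx) < tol:
            break
        dx = -J_inv @ fx                 # quasi-Newton step
        x_new = x + dx
        fx_new = f(x_new)
        df = fx_new - fx
        # Sherman-Morrison rank-1 update of the inverse Jacobian
        denom = dx @ (J_inv @ df)
        if abs(denom) > 1e-14:
            J_inv = J_inv + np.outer(dx - J_inv @ df, dx @ J_inv) / denom
        x, fx = x_new, fx_new
    return x

# Toy IFT-style system: solve H v = g without inverting H explicitly.
# H plays the role of the inner-loss Hessian, g the outer-loss gradient
# (both are hypothetical values for illustration).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
H = A @ A.T + 5.0 * np.eye(5)           # well-conditioned SPD "Hessian"
g = rng.standard_normal(5)              # "outer gradient"

v = broyden_solve(lambda v: H @ v - g, np.zeros(5))
print(np.linalg.norm(H @ v - g))        # residual near machine precision
```

Because the iteration touches H only through products H @ v, in the PINN setting these can be realized as Hessian-vector products via automatic differentiation, which is what keeps the outer loop scalable relative to forming or inverting the Hessian.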
We then theoretically prove an upper bound on the approximation error of the hypergradients under mild assumptions. Extensive experiments on several benchmark PDE-constrained optimization problems show that our method is more effective and efficient than the alternatives. We summarize our contributions as follows:
• To the best of our knowledge, this is the first attempt to solve general PDECO problems with deep learning using a bi-level optimization framework, which enjoys scalability and a theoretical guarantee.
• We propose a novel and efficient method for hypergradient computation using Broyden's method to solve the bi-level optimization problem.
• We conduct extensive experiments and achieve state-of-the-art results among deep learning methods on several challenging PDECO problems with complex geometry or nonlinear Navier-Stokes equations.
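To make the inner/outer structure concrete, the following toy sketch runs the iterative scheme on a quadratic bi-level problem whose solution is known in closed form. The inner loss F and outer loss L are illustrative placeholders (a real run would train a PINN on the PDE residual in the inner loop, and solve the Hessian system with Broyden's method rather than a direct solve); none of the names below come from the paper's code.

```python
import numpy as np

# Toy bi-level problem with a closed-form check of the IFT hypergradient:
#   inner:  w*(c) = argmin_w F(w, c),  F = 0.5 ||w - c||^2   =>  w*(c) = c
#   outer:  min_c  L(w*(c)),           L = 0.5 ||w - t||^2
# IFT:  dL/dc = dL/dc_direct - H_wc^T  H_ww^{-1}  dL/dw,
# where H_ww = d2F/dw2 and H_wc = d2F/(dw dc) at the inner optimum.
t = np.array([1.0, -2.0, 0.5])          # target state
c = np.zeros(3)                          # control variables

for step in range(100):
    # Inner loop: solve the PDE-like constraint for fixed c.  Here the
    # quadratic inner problem is solvable exactly; a PINN would instead
    # run gradient steps on the residual loss.
    w = c.copy()

    # Outer loop: hypergradient of L w.r.t. c through w*(c).
    grad_w_L = w - t                     # dL/dw at the inner optimum
    H_ww = np.eye(3)                     # inner Hessian d2F/dw2
    H_wc = -np.eye(3)                    # mixed derivative d2F/(dw dc)
    v = np.linalg.solve(H_ww, grad_w_L)  # Broyden's method in practice
    hyper = -H_wc.T @ v                  # direct dL/dc term is zero here
    c = c - 0.1 * hyper                  # outer gradient step

print(c)  # converges toward t
```

In this toy case the hypergradient simplifies to c - t, so the outer loop contracts toward the target; the point of the sketch is only the division of labor: the inner loop enforces the constraint, the outer loop moves the controls using implicit gradients.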

2. RELATED WORK

Neural Network Approaches for PDE Constrained Optimization. Surrogate modeling is an important class of methods for PDE-constrained optimization (Queipo et al., 2005). Physics-informed neural networks (PINNs) are powerful and flexible surrogates for representing the solutions of PDEs (Raissi et al., 2019). hPINN (Lu et al., 2021) treats the PDE constraints as regularization terms and optimizes the control variables and states simultaneously, using the penalty method and the Lagrangian method to adjust the weights of the multipliers. Mowlavi & Nabi (2021) adopt the same formulation but use a line search to find the largest weight for which the PDE error stays within a certain range. The key limitation of these approaches is that heuristic methods for tuning the weights of the multipliers can be sub-optimal and unstable. Another class of methods trains an operator network mapping control variables to the solutions of the PDEs or to the objective functions. Several works (Xue et al., 2020; Sun et al., 2021; Beatson et al., 2020) use mesh-based methods and predict the states on all mesh points from the control variables at once. PI-DeepONet (Wang et al., 2021a;c) adopts the architecture of DeepONet (Lu et al., 2019) and trains the network using physics-informed losses (PDE losses). However, these methods produce unsatisfactory results if the optimal solution is out of the training distribution (Lanthaler et al., 2022).

Bi-level Optimization in Machine Learning. Bi-level optimization is widely used in various machine learning tasks, e.g., neural architecture search (Liu et al., 2018), meta-learning (Rajeswaran et al., 2019), and hyperparameter optimization (Lorraine et al., 2020; Bao et al., 2021). One of the key challenges is to compute hypergradients with respect to the inner-loop optimization (Liu et al., 2021). Some previous works (Maclaurin et al., 2015; Liu et al., 2018) use unrolled optimization or

