LEARNING SPECIALIZED ACTIVATION FUNCTIONS FOR PHYSICS-INFORMED NEURAL NETWORKS

Abstract

At the heart of network architectures lie the non-linear activation functions, the choice of which affects model optimization and task performance. In computer vision and natural language processing, the Rectified Linear Unit is widely adopted across different tasks. However, there is no such default choice of activation function in the context of physics-informed neural networks (PINNs). PINNs exhibit high sensitivity to activation functions due to the varied characteristics of each physical system, which makes choosing a suitable activation function for PINNs a critical issue. Existing works usually choose activation functions in an inefficient trial-and-error manner. To address this problem, we propose to search automatically for the optimal activation function when solving different PDEs. This is achieved by learning an adaptive activation function as a linear combination of a set of candidate functions, whose coefficients can be directly optimized by gradient descent. In addition to its efficient optimization, the proposed method enables the discovery of novel activation functions and the incorporation of prior knowledge about the PDE system. We can further enhance its search space with an adaptive slope. Despite its surprising simplicity, the effectiveness of the proposed adaptive activation function is demonstrated on a series of benchmarks, including the Poisson's equation, convection equation, Burgers' equation, Allen-Cahn equation, Korteweg-de Vries equation, and Cahn-Hilliard equation. The performance gain of the proposed method is further interpreted from the neural tangent kernel perspective. Code will be released.

1. INTRODUCTION

Recent years have witnessed the remarkable progress of physics-informed neural networks (PINNs) in simulating the dynamics of physical systems, in which the activation functions play a significant role in the expressiveness and optimization of models. While the Rectified Linear Unit (Hahnloser et al., 2000; Jarrett et al., 2009; Nair & Hinton, 2010) is widely adopted in most computer vision and natural language processing tasks (Ramachandran et al., 2017), there is no such default choice of activation function in the context of PINNs. In fact, PINNs show great sensitivity to activation functions when applied to different physical systems, since each system has its own characteristics. On one hand, the use of unsuitable activation functions can cause over-parameterization and overfitting. On the other hand, accurate simulations with fast convergence and high precision can be achieved by choosing proper activation functions. For example, the hyperbolic tangent function is shown to suffer from numerical instability when simulating vortex-induced vibrations, while a PINN with a sinusoidal activation can be optimized smoothly (Raissi et al., 2019b). The varied characteristics of different PDE systems make it critical to select proper activation functions in PINNs. The common practice for finding the optimal activation functions is trial-and-error, which requires extensive computational resources and human knowledge. This approach is especially inefficient when solving complex problems, where searching for a set of activation functions is necessary to make accurate predictions. For instance, a combination of the sinusoidal and the exponential function has been demonstrated to be effective for solving the heat transfer equation, whose solution is periodic in space and exponentially decaying in time (Zobeiry & Humfeld, 2021).
In this case, the trial-and-error strategy leads to a combinatorial search problem, which becomes infeasible when the candidate activation function set is large. In this work, we propose a simple and effective physics-informed activation function (PIAC), aiming at the automatic design of activation functions for solving PDE systems with various characteristics. Sharing the same spirit as differentiable neural architecture search (Liu et al., 2018), we relax the categorical selection of one particular activation function into a continuous search space, leading to an end-to-end differentiable problem that can be integrated into the training procedure of PINNs. Specifically, we first define a set of candidate activation functions, which is conceptually composed of simple elementary functions or commonly-used activation functions. Then, the proposed PIAC is learned as a linear combination of the candidate functions with adaptive coefficients. Besides its efficient optimization, this continuous parameterization also enables the discovery of novel activation functions, whose capacity can be further enhanced by applying these learnable functions in a layer-wise or neuron-wise manner. Our method can be regarded as an adaptive activation function built on a predefined candidate function set. Although adaptive activation functions have been explored for PINNs, previous works mainly focus on accelerating the convergence of PINNs by introducing an adaptive slope (Jagtap et al., 2020b;a), while leaving the inefficient selection of activation functions for different PDE systems unexplored. In fact, our method is orthogonal to these methods and can be extended by incorporating the adaptive slope into our search space. Moreover, the candidate function set can be leveraged to embed prior knowledge about the PDE system into the neural network. For example, the sinusoidal function can be added to this set to assist the modeling of periodicity.
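The learned activation described above can be sketched as follows. This is a minimal numpy illustration, not the authors' implementation: the candidate set (tanh, sin, ReLU) and the uniform initialization of the coefficients are assumptions for the example; in the actual method the coefficients are trained jointly with the network weights by gradient descent.

```python
import numpy as np

class PIAC:
    """Sketch of a physics-informed activation function (PIAC):
    a linear combination of candidate activation functions with
    learnable coefficients (candidate set here is illustrative)."""

    def __init__(self, candidates=None):
        # Candidate set: elementary or commonly-used activation functions.
        self.candidates = candidates or [
            np.tanh,
            np.sin,
            lambda x: np.maximum(x, 0.0),  # ReLU
        ]
        # Coefficients initialized uniformly; in the real method they are
        # optimized jointly with the network weights via gradient descent.
        self.coeffs = np.ones(len(self.candidates)) / len(self.candidates)

    def __call__(self, x):
        # Weighted sum of all candidate functions evaluated at x.
        return sum(a * f(x) for a, f in zip(self.coeffs, self.candidates))
```

Setting one coefficient to 1 and the rest to 0 recovers a standard activation, so the continuous parameterization strictly contains the original categorical choices.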
We evaluate the proposed PIAC on a series of PDEs, including the Poisson's equation, convection equation, Burgers' equation, Allen-Cahn equation, Korteweg-de Vries equation, and Cahn-Hilliard equation. Extensive experiments show that the proposed PIAC consistently outperforms standard activation functions. We further explain the performance gain from the neural tangent kernel (NTK) perspective (Jacot et al., 2018; Wang et al., 2022). The main contributions of this paper can be summarized as:
• We investigate the influence of activation functions on PINNs when solving different problems and reveal the high sensitivity of PINNs to the choice of activation function, which can be related to the various characteristics of the underlying PDE system;
• We explore the automatic design of activation functions for PINNs. The proposed method can be efficiently optimized and adapted to the PDE system, while enabling the utilization of prior knowledge about the problem;
• Despite its simplicity, the effectiveness of the proposed PIAC is demonstrated through extensive experiments and interpreted from the perspective of the neural tangent kernel.

2.1. PHYSICS-INFORMED NEURAL NETWORKS

Physics-informed neural networks have emerged as a promising method for solving forward and inverse problems of PDEs (Raissi et al., 2019a; Chen et al., 2020; Lu et al., 2021b; Karniadakis et al., 2021), fractional PDEs (Pang et al., 2019), and stochastic PDEs (Zhang et al., 2019). In this work, we focus on solving the forward problems of PDEs as described in (Raissi et al., 2019a). Specifically, we consider PDEs of the general form

u_t + N[u(t, x); λ] = 0,  x ∈ Ω ⊂ R^d,  t ∈ [0, T],  (1)

subject to the initial and boundary conditions

u(0, x) = u_0(x),  x ∈ Ω,  (2)
B[u] = 0,  x ∈ ∂Ω,  t ∈ [0, T],  (3)

where u(t, x) denotes the solution, N[·; λ] is a differential operator parameterized by λ, B[·] is a boundary operator, and subscripts denote partial differentiation. A physics-informed neural network (PINN) u′(t, x; θ) is optimized to approximate the solution u(t, x) by minimizing the following objective function

L(θ) = L_ic(θ) + L_bc(θ) + L_r(θ),
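The composite objective above can be made concrete with a small sketch. This is an illustrative example, not the paper's setup: the operator N[u] = c·u_x (a convection equation with periodic boundaries), the closed-form surrogate `u_net` standing in for the neural network, and the finite-difference derivatives (real PINNs use automatic differentiation) are all assumptions made to keep the example self-contained.

```python
import numpy as np

def u_net(t, x, theta):
    # Stand-in for a neural network prediction u'(t, x; theta);
    # theta = (amplitude, wave speed) is a toy two-parameter "network".
    return theta[0] * np.sin(x - theta[1] * t)

def pinn_loss(theta, c=1.0, eps=1e-4):
    """L(theta) = L_ic + L_bc + L_r for u_t + c*u_x = 0 with
    u(0, x) = sin(x) and periodic boundaries on [0, 2*pi]."""
    t = np.linspace(0.0, 1.0, 20)
    x = np.linspace(0.0, 2 * np.pi, 20)
    T, X = np.meshgrid(t, x)
    # Residual term L_r: central finite differences approximate u_t and
    # u_x on collocation points (autodiff would be used in practice).
    u_t = (u_net(T + eps, X, theta) - u_net(T - eps, X, theta)) / (2 * eps)
    u_x = (u_net(T, X + eps, theta) - u_net(T, X - eps, theta)) / (2 * eps)
    L_r = np.mean((u_t + c * u_x) ** 2)
    # Initial-condition term L_ic: match u(0, x) = u_0(x) = sin(x).
    L_ic = np.mean((u_net(0.0, x, theta) - np.sin(x)) ** 2)
    # Boundary term L_bc: enforce periodicity u(t, 0) = u(t, 2*pi).
    L_bc = np.mean((u_net(t, 0.0, theta) - u_net(t, 2 * np.pi, theta)) ** 2)
    return L_ic + L_bc + L_r
```

With theta = (1, 1) the surrogate equals the exact solution sin(x − t), so all three loss terms vanish up to finite-difference error; any other wave speed leaves a nonzero residual, which is exactly the signal that gradient descent on L(θ) exploits.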

