SYMBOLIC PHYSICS LEARNER: DISCOVERING GOVERNING EQUATIONS VIA MONTE CARLO TREE SEARCH

Abstract

Nonlinear dynamics is ubiquitous in nature and commonly seen in various science and engineering disciplines. Distilling analytical expressions that govern nonlinear dynamics from limited data remains vital but challenging. To tackle this fundamental issue, we propose a novel Symbolic Physics Learner (SPL) machine to discover the mathematical structure of nonlinear dynamics. The key concept is to interpret mathematical operations and system state variables by computational rules and symbols, establish symbolic reasoning of mathematical formulas via expression trees, and employ a Monte Carlo tree search (MCTS) agent to explore optimal expression trees based on measurement data. The MCTS agent obtains an optimistic selection policy through the traversal of expression trees, featuring the one that maps to the arithmetic expression of underlying physics. Salient features of the proposed framework include search flexibility and enforcement of parsimony for discovered equations. The efficacy and superiority of the SPL machine are demonstrated by numerical examples, compared with state-of-the-art baselines.

1. INTRODUCTION

We usually learn the behavior of a nonlinear dynamical system through its nonlinear governing differential equations. These equations can be formulated as $\dot{y}(t) = dy/dt = F(y(t))$, where $y(t) = \{y_1(t), y_2(t), \ldots, y_{n_s}(t)\} \in \mathbb{R}^{1 \times n_s}$ denotes the system state at time $t$, $F(\cdot)$ a nonlinear function set defining the state motions, and $n_s$ the system dimension. The explicit form of $F(\cdot)$ for some nonlinear dynamics remains underexplored. For example, in a mounted double pendulum system, the mathematical description of the underlying physics might be unclear due to unknown viscous and frictional damping forms. These uncertainties create a critical demand for discovering nonlinear dynamics from observational data. Nevertheless, distilling the analytical form of governing equations from limited noisy data, commonly seen in practice, is an intractable challenge. Ever since the early work on the data-driven discovery of nonlinear dynamics (Džeroski & Todorovski, 1993; Džeroski & Todorovski, 1995), many scientists have stepped into this field of study. During the recent decade, the escalating advances in machine learning, data science, and computing power have enabled several milestone efforts in unearthing the governing equations of nonlinear dynamical systems. Notably, a breakthrough model named SINDy (Sparse Identification of Nonlinear Dynamics) (Brunton et al., 2016) has shed light on tackling this challenge. SINDy determines a sparse solution over a pre-defined basis function library recursively through a sequential threshold ridge regression (STRidge) algorithm. SINDy quickly became one of the state-of-the-art methods and kindled significant enthusiasm in this field of study (Rudy et al., 2017; Long et al., 2018; Champion et al., 2019; Chen et al., 2021; Sun et al., 2021; Rao et al., 2022). However, the success of this sparsity-promoting approach relies on a properly defined candidate function library that requires good prior knowledge of the system.
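To make the sparse-regression idea concrete, the following is a minimal sketch of sequential thresholded ridge regression over a fixed candidate library. It is an illustration under stated assumptions, not the reference SINDy implementation: the function name `stridge`, the parameters `lam`, `tol`, and `n_iter`, the polynomial library, and the toy system dy/dt = -2y + 0.5y^3 are all hypothetical choices made for this example.

```python
import numpy as np

def stridge(theta, dydt, lam=1e-5, tol=0.1, n_iter=10):
    """Sequential thresholded ridge regression (illustrative sketch).

    theta : (n_samples, n_terms) candidate library evaluated on the data
    dydt  : (n_samples,) time derivative of one state variable
    lam   : ridge penalty (assumed value)
    tol   : coefficients with magnitude below tol are pruned (assumed value)
    """
    n_terms = theta.shape[1]
    # Initial ridge solution over the full library.
    xi = np.linalg.solve(theta.T @ theta + lam * np.eye(n_terms), theta.T @ dydt)
    for _ in range(n_iter):
        small = np.abs(xi) < tol
        xi[small] = 0.0              # prune small coefficients
        big = ~small
        if not big.any():            # every term was pruned
            break
        sub = theta[:, big]
        k = int(big.sum())
        # Refit ridge regression on the surviving terms only.
        xi[big] = np.linalg.solve(sub.T @ sub + lam * np.eye(k), sub.T @ dydt)
    return xi

# Toy, noise-free data from the assumed system dy/dt = -2*y + 0.5*y**3.
y = np.linspace(-2.0, 2.0, 200)
dydt = -2.0 * y + 0.5 * y**3

# Hypothetical candidate library: [1, y, y^2, y^3].
theta = np.column_stack([np.ones_like(y), y, y**2, y**3])

xi = stridge(theta, dydt)
print(xi)  # coefficients for [1, y, y^2, y^3]
```

In this sketch the recovered coefficients are nonzero only for the `y` and `y^3` columns. The true terms are recoverable here because they were placed in `theta` by hand, which is the dependence of SINDy on a well-chosen library noted above.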
This sparsity-promoting approach is also restricted by the fact that a linear combination of candidate functions might be insufficient to recover complicated mathematical expressions. Moreover, when the library size is massive, it empirically fails to hold the sparsity constraint. At the same time, attempts have been made to tackle the nonlinear dynamics discovery problems by introducing neural networks with activation functions replaced by commonly seen mathematical

