ENHANCING THE INDUCTIVE BIASES OF GRAPH NEURAL ODE FOR MODELING PHYSICAL SYSTEMS

Abstract

Neural networks with physics-based inductive biases, such as Lagrangian neural networks (LNN) and Hamiltonian neural networks (HNN), learn the dynamics of physical systems by encoding strong inductive biases. Alternatively, neural ODEs with appropriate inductive biases have also been shown to give similar performance. However, these models, when applied to particle-based systems, are transductive in nature and hence do not generalize to large system sizes. In this paper, we present a graph-based neural ODE, GNODE, to learn the time evolution of dynamical systems. Further, we carefully analyze the role of different inductive biases on the performance of GNODE. We show that, similar to LNN and HNN, encoding the constraints explicitly can significantly improve the training efficiency and performance of GNODE. Our experiments also assess the value of additional inductive biases, such as Newton's third law, on the final performance of the model. We demonstrate that inducing these biases can enhance the performance of the model by orders of magnitude in terms of both energy violation and rollout error. Interestingly, we observe that the GNODE trained with the most effective inductive biases, namely MCGNODE, outperforms the graph versions of LNN and HNN, namely Lagrangian graph networks (LGN) and Hamiltonian graph networks (HGN), in terms of energy violation error by 4 orders of magnitude for a pendulum system and 2 orders of magnitude for spring systems. These results suggest that NODE-based systems can give performance competitive with energy-conserving neural networks by employing appropriate inductive biases.

1. INTRODUCTION AND RELATED WORKS

Learning the dynamics of physical systems is a challenging problem that has relevance in several areas of science and engineering, such as astronomy (motion of planetary systems), biology (movement of cells), physics (molecular dynamics), and engineering (mechanics, robotics) (LaValle, 2006; Goldstein, 2011). The dynamics of a system are typically expressed as a differential equation, solutions of which may require the knowledge of abstract quantities such as force, energy, and drag (LaValle, 2006; Zhong et al., 2020; Sanchez-Gonzalez et al., 2020). From an experimental perspective, the real observable for a physical system is its trajectory, represented by the positions and velocities of the constituent particles. Thus, learning the abstract quantities required to solve the equation directly from the trajectory can greatly simplify the problem of learning the dynamics (Finzi et al., 2020). Infusing physical laws as priors has been shown to improve learning in terms of additional properties such as energy conservation and symplectic nature (Karniadakis et al., 2021; Lutter et al., 2019; Liu et al., 2021). To this end, three broad approaches have been proposed, namely, Lagrangian neural networks (LNN) (Cranmer et al., 2020a; Finzi et al., 2020; Lutter et al., 2019), Hamiltonian neural networks (HNN) (Sanchez-Gonzalez et al., 2019; Greydanus et al., 2019; Zhong et al., 2020; 2021), and neural ODE (NODE) (Chen et al., 2018; Gruver et al., 2021). The learning efficiency of LNNs and HNNs has been shown to improve significantly when explicit constraints (Finzi et al., 2020) and their inherent structure (Zhong et al., 2019) are exploited. In addition, it has been shown that the superior performance of HNNs and LNNs is mainly due to their second-order bias, and not due to their symplectic or energy-conserving bias (Gruver et al., 2021).
More specifically, an HNN with separable potential ($V(q)$) and kinetic ($T(q, \dot{q})$) energies is equivalent to a second-order NODE of the form $\ddot{q} = F(q, \dot{q})$. Thus, a NODE with a second-order bias can give performance similar to that of an HNN (Gruver et al., 2021). Recently, a Bayesian-symbolic approach, which assumes the knowledge of Newtonian mechanics along with symbolic features, has been used to learn the dynamics of physical systems (Xu et al., 2021). An important limitation pervading these models is their transductive nature: they need to be trained on each system separately before simulating its dynamics. For instance, a NODE trained on a 5-pendulum can work only for a 5-pendulum. In contrast, an inductive model, such as GNODE, when trained on a 5-pendulum "learns" the underlying function governing the dynamics at the node and edge level. Hence, a GNODE, once trained, can perform inference on systems that are significantly smaller or larger than the one used for training, as we demonstrate later in § 4. This inductive ability is enabled by graph neural networks (Scarselli et al., 2008) through their inherent topology-aware setting, which allows the learned system to generalize naturally. Although the idea of graph-based modeling has been suggested for physical systems (Cranmer et al., 2020b; Greydanus et al., 2019), the inductive biases induced by different graph structures and their consequences on the dynamics remain poorly explored. In this context, a few recent works aim to develop better representations of physical systems through equivariant graph neural networks (Satorras et al., 2021; Han et al., 2022; Huang et al., 2022). However, the present work focuses on the inductive biases for physical systems rather than on better representation learning. Some of the proposed inductive biases may also be a natural consequence of enforcing equivariance. For example, enforcing invariance of the Lagrangian with respect to
space translation and rotation also enforces Newton's third law, which we achieve through an inductive bias (§ 3). In the context of generalizability to unseen systems, there exists work on few-shot learning based on meta-learning (Lee et al., 2021). Meta-learning assumes the availability of a limited amount of training data to adapt to an unseen system (task). This differs from our objective of zero-shot generalization, i.e., inference on an unseen system without any further training.

To summarize, our key contributions are as follows:

1. Topology-aware modeling, where the physical system is modeled using a graph-based NODE (GNODE), enabling zero-shot generalizability to unseen system sizes.
2. Decoupling the dynamics and constraints, where the equation of motion of the GNODE is decoupled into individual terms such as forces, Coriolis-like terms, and explicit constraints.
3. Decoupling the internal and body forces, where the internal forces due to the interaction among particles are decoupled from the forces due to external fields.
4. Newton's third law, where the forces of interacting particles are enforced to be equal and opposite. We theoretically prove that, in the absence of an external field, this inductive bias exactly enforces the conservation of linear momentum for the predicted trajectory.

We analyze the role of these biases on n-pendulum and spring systems. Interestingly, depending on the nature of the system, we show that some of these biases can either significantly improve the performance or have only marginal effects. We also show that the final model with the appropriate inductive biases significantly outperforms the simple version of GNODE with no additional inductive biases, and even the graph versions of LNN and HNN.
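The link between Newton's third law and momentum conservation stated in the fourth contribution can be illustrated with a minimal NumPy sketch. This is not the GNODE architecture: the learned edge function is replaced by a hand-written linear spring, and the names `pairwise_forces`, `rollout`, `k`, and `rest` are hypothetical choices made for this illustration only. Each edge force is applied with opposite signs to its two endpoints, so the internal forces sum to zero and the total linear momentum stays constant, up to floating-point error, when no external field is present.

```python
import numpy as np


def pairwise_forces(pos, senders, receivers, k=1.0, rest=1.0):
    """Sum edge forces on every node, enforcing f_ij = -f_ji on each edge.

    A linear spring stands in for the learned edge function (an assumption
    of this sketch, not the paper's model).
    """
    forces = np.zeros_like(pos)
    for i, j in zip(senders, receivers):
        r = pos[j] - pos[i]
        dist = np.linalg.norm(r)
        f_ij = k * (dist - rest) * r / dist  # force on node i due to node j
        forces[i] += f_ij
        forces[j] -= f_ij  # equal and opposite reaction on node j
    return forces


def rollout(pos, vel, mass, senders, receivers, dt=1e-3, steps=1000):
    """Integrate the second-order dynamics with semi-implicit Euler steps."""
    for _ in range(steps):
        acc = pairwise_forces(pos, senders, receivers) / mass[:, None]
        vel = vel + dt * acc  # update velocities from accelerations
        pos = pos + dt * vel  # update positions with the new velocities
    return pos, vel


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.normal(size=(4, 2))          # 4 particles in 2D
    vel = rng.normal(size=(4, 2))
    mass = rng.uniform(0.5, 2.0, size=4)
    senders, receivers = [0, 1, 2], [1, 2, 3]  # a chain of three springs

    p_before = (mass[:, None] * vel).sum(axis=0)
    _, vel_after = rollout(pos, vel, mass, senders, receivers)
    p_after = (mass[:, None] * vel_after).sum(axis=0)
    print("total momentum before:", p_before)
    print("total momentum after: ", p_after)
```

Because the antisymmetry is built into how the edge forces are accumulated, the conservation holds for any edge function, including a neural network, which is why it can serve as an architectural inductive bias rather than a property that must be learned from data.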

2. BACKGROUND ON DYNAMICAL SYSTEMS

A dynamical system can be defined by ordinary differential equations (ODEs) of the form $\ddot{q} = F(q, \dot{q}, t)$, where $F$ represents the dynamics of the system. Thus, the time evolution of the system can be obtained by integrating this equation of motion. In Lagrangian mechanics, the dynamics of a system are derived based on the Lagrangian of the system, which is defined as (Goldstein, 2011):

$$L(q, \dot{q}, t) = T(q, \dot{q}) - V(q) \tag{1}$$

Once the Lagrangian of the system is computed, the dynamics can be obtained using the Euler-Lagrange (EL) equation as $\frac{d}{dt}\left(\nabla_{\dot{q}} L\right) = \nabla_{q} L$. Note that the EL equation is equivalent to D'Alembert's principle written in terms of the force as $\sum_{i=1}^{n} (N_i - m_i \ddot{q}_i) \cdot \delta q_i = 0$, where $\delta q_i$ represents a virtual displacement consistent with the constraints of the system. Thus, an alternative to the EL equation can be written in terms of the forces and other components as (LaValle, 2006; Murray et al., 2017)

$$M(q)\ddot{q} + C(q, \dot{q})\dot{q} + N(q) + \Upsilon(\dot{q}) + A^{T}\lambda = \Pi \tag{2}$$

where $M(q)$ represents the mass matrix, $C(q, \dot{q})$ the Coriolis-like forces, $N(q)$ the conservative forces, $\Upsilon(\dot{q})$ the non-conservative forces such as drag or friction, and $\Pi$ represents any




