NEURAL DAES: CONSTRAINED NEURAL NETWORKS

Abstract

In this article we investigate the effect of explicitly adding auxiliary trajectory information to neural networks for dynamical systems. We draw inspiration from the fields of differential-algebraic equations and differential equations on manifolds, and implement similar methods in residual neural networks. We discuss enforcing constraints through stabilization as well as projection methods, and show, through experiments involving simulations of multi-body pendulums and molecular dynamics scenarios, when each method is preferable. Several of our methods are easy to implement in existing code and have little impact on training cost while yielding significant gains in accuracy at inference time.

1. INTRODUCTION

Many scientific simulations of dynamical systems have natural invariants that can be expressed as constraints. Such constraints represent the conservation of certain quantities of the system under study. For example, in molecular dynamics, bond lengths between atoms are assumed fixed. Another example is incompressible fluid flow, where the divergence of the velocity field vanishes at every point in space and time. Similarly, in Maxwell's equations, the divergence of the magnetic field vanishes (there is no magnetic charge). Such additional information about the flow can be crucial if the simulations are to remain faithful. As a result, a wealth of techniques has been proposed to conduct simulations that obey the constraints at least approximately (Weiglhofer, 1994; Ascher & Petzold, 1998; Allen et al., 2004).

In recent years, machine-learning-based techniques, and in particular deep neural networks, have been taking a growing role in modelling physical phenomena. In some cases, such techniques are used as inexpensive surrogates of the true physical dynamics, and in others they replace it altogether (see e.g. Wang et al. (2018); Degiacomi (2019); Miyanawala & Jaiman (2017)). These techniques use a wealth of data, either observed or numerically generated, to "learn" the parameters of a neural network so that the data are fit to some accuracy. The network is then expected to perform well on new data outside the training set and to yield simulation results that are accurate and reliable, in many cases at significantly lower computational cost. From classical simulations, we know that it is often as important to accurately satisfy the additional constraint information as it is to accurately satisfy the underlying ordinary/partial differential equation (ODE/PDE) system. Nonetheless, no neural network architecture known to us is designed to honour such constraints or invariants.
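To make the importance of constraint information concrete, consider a minimal numerical sketch (our illustration, not drawn from any of the cited works): a forward-Euler discretization of planar rotation. The exact flow of dx/dt = -y, dy/dt = x conserves the invariant x² + y², yet the discrete trajectory steadily drifts off the invariant circle; any learned surrogate of the dynamics can exhibit the same kind of drift.

```python
import numpy as np

# Toy dynamics: rotation on the plane, dx/dt = -y, dy/dt = x.
# The exact flow conserves the invariant g(x, y) = x^2 + y^2.
def euler_step(state, h):
    x, y = state
    return np.array([x - h * y, y + h * x])

state = np.array([1.0, 0.0])  # starts on the unit circle
h = 0.1
for _ in range(100):
    state = euler_step(state, h)

# Each Euler step scales the radius by sqrt(1 + h^2), so after 100
# steps the trajectory has drifted well off the unit circle.
radius = np.linalg.norm(state)
print(radius)  # ≈ 1.64, far from the conserved value 1
```

The drift here is systematic, not stochastic: it compounds geometrically with the number of steps, which is why fitting trajectories accurately over a short horizon is no guarantee of long-term constraint satisfaction.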
The hope in standard training procedures is that by fitting the data, the network will "learn" the constraints and embed them implicitly in its weights. This, however, has been demonstrated to be insufficient in many cases (Wah & Qian, 2001). As we show in this paper on some very simple examples, neural networks may be able to approximately learn the dynamics, yet still drift off the constraint manifold. This leads to erroneous results that violate simple underlying physical properties. The question then is: how should additional constraint information be incorporated into a neural network architecture so that this physical information is, at least approximately, honoured? The idea of adding constraint information to a network is essentially a continuation of the ongoing process of connecting mathematics with machine learning and explicitly building known information into a neural network rather than having the network learn it implicitly (Willard et al., 2021). Equivariant networks are one example of this, where the symmetry of a problem is explicitly built into the neural network (Thomas et al., 2018). Previous work on adding constraints to a neural network includes Stewart & Ermon (2017) and Xu et al. (2018), which add an auxiliary regularization term to the loss function; the physics-informed networks of Raissi et al. (2019), which shape the output of the neural network to fulfill a particular partial differential equation; and Li & Srikumar (2019), which incorporates first-order logic directly into the neural network.
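One simple way to incorporate such constraint information, in the spirit of projection methods from the differential-algebraic equations literature, is to project the state back onto the constraint manifold after every step. The sketch below (ours, using the same toy rotation dynamics; for a constraint of the form x² + y² = 1 the orthogonal projection reduces to renormalization) shows that this keeps the invariant satisfied to machine precision. The same post-step correction can in principle be applied after each block of a residual network.

```python
import numpy as np

# Toy dynamics: rotation on the plane; exact flow conserves x^2 + y^2.
def euler_step(state, h):
    x, y = state
    return np.array([x - h * y, y + h * x])

def project(state):
    # Orthogonal projection onto the constraint manifold
    # g(x, y) = x^2 + y^2 - 1 = 0; for a circle this is renormalization.
    return state / np.linalg.norm(state)

state = np.array([1.0, 0.0])
for _ in range(100):
    state = project(euler_step(state, 0.1))

# The unprojected scheme drifts off the circle; with projection the
# constraint violation stays at the level of floating-point round-off.
violation = abs(np.linalg.norm(state) - 1.0)
print(violation)
```

Projection changes only the component of the step normal to the manifold, so for small step sizes it perturbs the learned dynamics very little while eliminating the accumulated constraint drift.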

