LEARNING NEURAL EVENT FUNCTIONS FOR ORDINARY DIFFERENTIAL EQUATIONS

Abstract

The existing Neural ODE formulation relies on explicit knowledge of the termination time. We extend Neural ODEs to implicitly defined termination criteria modeled by neural event functions, which can be chained together and differentiated through. Neural Event ODEs are capable of modeling discrete and instantaneous changes in a continuous-time system, without prior knowledge of when these changes should occur or how many such changes should exist. We test our approach in modeling hybrid discrete- and continuous- systems such as switching dynamical systems and collision in multi-body systems, and we propose simulation-based training of point processes with applications in discrete control.

1. INTRODUCTION

Event handling in the context of solving ordinary differential equations (Shampine & Thompson, 2000) allows the user to specify a termination criterion using an event function. One motivation is to introduce discontinuous changes to a system that cannot be modeled by an ODE alone; examples include collisions in physical systems, chemical reactions, and switching dynamics (Ackerson & Fu, 1970). Another motivation is to create discrete outputs from a continuous-time process, as is the case in point processes and event-driven sampling (e.g. Steinbrecher & Shaw (2008); Peters et al. (2012); Bouchard-Côté et al. (2018)). In general, an event function is a tool for monitoring a continuous-time system and performing instantaneous interventions when events occur.

The use of ordinary differential equation (ODE) solvers within deep learning frameworks has allowed end-to-end training of Neural ODEs (Chen et al., 2018) in a variety of settings. Examples include graphics (Yang et al., 2019; Rempe et al., 2020; Gupta & Chandraker, 2020), generative modeling (Grathwohl et al., 2018; Zhang et al., 2018; Chen & Duvenaud, 2019; Onken et al., 2020), time series modeling (Rubanova et al., 2019; De Brouwer et al., 2019; Jia & Benson, 2019; Kidger et al., 2020), and physics-based models (Zhong et al., 2019; Greydanus et al., 2019). However, these existing models are defined with a fixed termination time.

To further expand the applications of Neural ODEs, we investigate the parameterization and learning of a termination criterion, such that the termination time is only implicitly defined and depends on changes in the continuous-time state. For this, we make use of event handling in ODE solvers and derive the gradients necessary for training event functions that are parameterized with neural networks. By introducing differentiable termination criteria in Neural ODEs, our approach allows the model to efficiently and automatically handle state discontinuities.

1.1. EVENT HANDLING

Suppose we have a continuous-time state $z(t)$ that follows an ODE $\frac{dz}{dt} = f(t, z(t), \theta)$, where $\theta$ are the parameters of $f$, with an initial state $z(t_0) = z_0$. The solution at a time value $\tau$ can be written as

$$\mathrm{ODESolve}(z_0, f, t_0, \tau, \theta) \triangleq z(\tau) = z_0 + \int_{t_0}^{\tau} f(t, z(t), \theta)\, dt. \tag{1}$$

In the context of a Neural ODE, $f$ can be defined using a Lipschitz-continuous neural network.

Bouncing ball example. As a motivating example of a system with discontinuous transitions, consider modeling a bouncing ball with classical mechanics. In an environment with constant gravity, a Markov state for representing the ball is a combination of position $x(t) \in \mathbb{R}$ and velocity $v(t) \in \mathbb{R}$,

$$z(t) = [x(t), v(t)], \qquad \frac{dz(t)}{dt} = [v(t), a], \tag{2}$$

where $a$ is a scalar acceleration, in our context a gravitational constant. If we simulate this ODE as is, the ball will eventually pass through the ground (that is, reach $x(t) \le r$, where $r$ is the radius of the ball), whereas a real ball bounces back up when it hits the ground. At the moment of impact, the sign of the velocity is changed instantaneously. However, no such ODE can model this behavior because $v(t)$ needs to change discontinuously at the moment of impact. This simple bouncing ball is an example of a scenario that would be ill-suited for a Neural ODE alone to model.

In order to model this discontinuity in the state, we can make use of event functions. Event functions allow the ODE to be terminated when a criterion is satisfied, at which point we can instantaneously modify the state and then resume solving the ODE with this new state. Concretely, let $g(t, z(t), \phi)$ be an event function with $\phi$ denoting a set of parameters. An ODE solver with event handling capabilities will terminate at the first time the event function crosses zero, i.e. the time $t^*$ such that $g(t^*, z(t^*), \phi) = 0$, conditioned on some initial value. We express this relationship as

$$t^*, z(t^*) = \mathrm{ODESolveEvent}(z_0, f, g, t_0, \theta, \phi). \tag{3}$$

Note that in contrast to eq. (1), there is no predetermined termination time. The time of termination $t^*$ has to be solved alongside the initial value problem, as it depends on the trajectory $z(t)$. Nevertheless, ODESolveEvent strictly generalizes ODESolve, since the event function can simply encode an explicit termination time (for example, $g(t, z(t), \phi) = t - \tau$), in which case it reduces to an ODESolve. The benefits of using ODESolveEvent lie in being able to define event functions that depend on the evolving state.

Going back to the bouncing ball example, we can simply introduce an event function to detect when the ball hits the ground, i.e. $g(t, z(t), \phi) = x(t) - r$. We can then instantaneously modify the state so that $z'(t^*) = [x(t^*), -(1 - \alpha) v(t^*)]$, where $\alpha$ is the fraction of momentum that is absorbed by the contact, and then resume solving the ODE in eq. (2) with this new state $z'(t^*)$.

Figure 1 shows the bouncing ball example being fit by a Neural ODE and a Neural Event ODE where both $f$ and $g$ are neural networks. The Neural ODE model parameterizes a non-linear function for $f$, while the Neural Event ODE parameterizes $f$ and $g$ as linear functions of $z(t)$. We see that the Neural Event ODE can perfectly recover the underlying physics and extrapolate seamlessly. Meanwhile, the Neural ODE has trouble fitting the sudden changes in dynamics when the ball bounces off the ground and, furthermore, does not generalize because the true model requires the trajectory to be discontinuous.
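As a rough illustration of the solve, intervene, and resume loop described for the bouncing ball, the following sketch uses SciPy's `solve_ivp` with a terminal event as a stand-in for ODESolveEvent. It is not the implementation used in this work: here $f$ and $g$ are fixed analytic functions rather than neural networks, no gradients are computed, and the names `ode_solve_event` and `simulate_bounces` as well as the constants `A`, `R`, and `ALPHA` are hypothetical choices made only for this example.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constants (assumptions, not values from the paper).
A = -9.8      # constant acceleration a (gravity)
R = 0.5       # ball radius r
ALPHA = 0.1   # fraction of momentum absorbed on contact


def f(t, z):
    """Dynamics of eq. (2): dz/dt = [v, a]."""
    x, v = z
    return [v, A]


def g(t, z):
    """Event function g = x - r, zero when the ball touches the ground."""
    return z[0] - R


g.terminal = True   # stop integrating at the first event
g.direction = -1    # only trigger when g goes from positive to negative


def ode_solve_event(z0, t0, t_max=100.0):
    """Integrate from (t0, z0) until g crosses zero; return (t*, z(t*)) or None."""
    sol = solve_ivp(f, (t0, t_max), z0, events=g, rtol=1e-9, atol=1e-12)
    if sol.t_events[0].size == 0:
        return None  # no event before t_max
    return sol.t_events[0][0], sol.y_events[0][0]


def simulate_bounces(z0, t0=0.0, n_bounces=3):
    """Chain event solves, applying the instantaneous state update at each impact."""
    events = []
    z, t = np.asarray(z0, dtype=float), t0
    for _ in range(n_bounces):
        out = ode_solve_event(z, t)
        if out is None:
            break
        t, z_star = out
        # Instantaneous update: keep position, flip and damp velocity.
        z = np.array([z_star[0], -(1.0 - ALPHA) * z_star[1]])
        events.append((t, z.copy()))
    return events


if __name__ == "__main__":
    for t_star, z_new in simulate_bounces([5.0, 0.0]):
        print(f"impact at t = {t_star:.4f}, state after update = {z_new}")
```

Restricting the event to downward crossings is a practical choice in this sketch: each restart begins with $g \approx 0$, and the direction filter prevents the solver from treating the ball leaving the ground as a new event.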



Figure 1: Dynamics of a bouncing ball can be recovered by a Neural Event ODE. Meanwhile, a non-linear Neural ODE has trouble modeling sudden changes and performs poorly at extrapolation.

While smooth trajectories can be a desirable property in some settings, trajectories modeled by an ODE can have limited representation capabilities (Dupont et al., 2019; Zhang et al., 2020), and in some applications it is desirable to model discontinuities in the state.

