DISE: DYNAMIC INTEGRATOR SELECTION TO MINIMIZE FORWARD-PASS TIME IN NEURAL ODES

Abstract

Neural ordinary differential equations (Neural ODEs) are appreciated for their ability to significantly reduce the number of parameters when constructing a neural network. On the other hand, they are sometimes criticized for their long forward-pass inference time, which is incurred by solving integral problems. To improve model accuracy, they rely on advanced solvers, such as the Dormand-Prince (DOPRI) method. Solving an integral problem, however, requires at least tens (and sometimes thousands) of steps in many Neural ODE experiments. In this work, we propose to i) directly regularize the step size of DOPRI to make the forward pass faster and ii) dynamically choose a simpler integrator than DOPRI for a carefully selected subset of inputs. Because not every input requires the advanced integrator, we design an auxiliary neural network to choose an appropriate integrator for a given input, decreasing the overall inference time without significantly sacrificing accuracy. We consider the Euler method, the fourth-order Runge-Kutta (RK4) method, and DOPRI as selection candidates. We found that 10-30% of cases can be solved with simple integrators in our experiments. As a result, the overall number of function evaluations (NFE) decreases by up to 78% with improved accuracy.
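The selection idea described above can be sketched in a few lines. This is an illustrative stand-in, not the paper's actual design: the gating rule, the per-integrator NFE costs, and the input set are all assumed for demonstration, where a learned auxiliary network and measured solver costs would be used in practice.

```python
# Assumed per-sample NFE costs: Euler with 10 steps (1 evaluation each),
# RK4 with 10 steps (4 evaluations each), and a typical DOPRI run.
NFE = {"euler": 10, "rk4": 40, "dopri": 120}

def auxiliary_gate(x):
    # Stand-in for the learned auxiliary network; here a fixed rule on
    # the input's magnitude plays the role of the classifier.
    if abs(x) < 0.3:
        return "euler"
    return "rk4" if abs(x) < 0.7 else "dopri"

inputs = [0.1, 0.2, 0.5, 0.6, 0.9, 1.0, 0.25, 0.8, 0.4, 0.95]
avg_nfe = sum(NFE[auxiliary_gate(x)] for x in inputs) / len(inputs)
dopri_only = NFE["dopri"]
# Routing easy inputs to cheap solvers lowers the average NFE
# relative to always running DOPRI.
```

With this toy gate, three inputs go to Euler, three to RK4, and four to DOPRI, so the average NFE falls to 63 versus 120 for DOPRI-only inference.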

1. INTRODUCTION

Neural ordinary differential equations (Neural ODEs) learn time-dependent dynamics that describe continuous residual networks (Chen et al., 2018). It is well known that residual connections are numerically similar to the explicit Euler method, the simplest integrator for solving ODEs. In this regard, Neural ODEs are considered a generalization of residual networks. It is generally agreed that Neural ODEs have two advantages and one disadvantage: i) Neural ODEs can sometimes reduce the required number of neural network parameters, e.g., (Pinckaers & Litjens, 2019); ii) Neural ODEs interpret the neural network layer (or time) as a continuous variable, so a hidden vector at an arbitrary layer can be calculated; iii) however, Neural ODEs' forward-pass inference can sometimes be numerically unstable (i.e., the underflow error of DOPRI's adaptive step size) and/or slow to solve an integral problem (i.e., too many steps in DOPRI) (Zhuang et al., 2020b; Finlay et al., 2020; Daulbaev et al., 2020; Quaglino et al., 2020).

Much work has been actively devoted to addressing the numerically unstable nature of solving integral problems. In this work, however, we are interested in addressing the problem of long forward-pass inference time. To overcome this challenge, we i) directly regularize the numerical errors of the Dormand-Prince (DOPRI) method (Dormand & Prince, 1980), which means we try to learn an ODE that can be quickly solved by DOPRI, and ii) dynamically select an appropriate integrator for each sample rather than relying on only one integrator. In many cases, Neural ODEs use DOPRI, one of the most advanced adaptive-step integrators, for its best accuracy. However, our method allows us to rely on simpler integrators, such as the Euler method or RK4, for a subset of inputs.
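The relationship between the integrators discussed above can be made concrete with a minimal sketch. The dynamics function below is a toy stand-in for a learned ODE function f_theta; the step counts are chosen for illustration. Note that one Euler step has exactly the residual-connection form y + h*f(t, y) at the cost of one function evaluation, while one RK4 step costs four.

```python
import math

def f(t, y):
    # Toy dynamics dy/dt = -y, standing in for a learned ODE function.
    return -y

def euler_step(f, t, y, h):
    # Explicit Euler: y + h*f(t, y), the same form as a residual
    # connection y_{l+1} = y_l + g(y_l); 1 function evaluation per step.
    return y + h * f(t, y)

def rk4_step(f, t, y, h):
    # Fourth-order Runge-Kutta; 4 function evaluations per step.
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def solve(step_fn, f, y0, t0, t1, n_steps):
    # Fixed-step integration from t0 to t1.
    h = (t1 - t0) / n_steps
    y, t = y0, t0
    for _ in range(n_steps):
        y = step_fn(f, t, y, h)
        t += h
    return y

exact = math.exp(-1.0)  # true solution of y' = -y, y(0) = 1, at t = 1
y_euler = solve(euler_step, f, 1.0, 0.0, 1.0, 10)  # NFE = 10
y_rk4 = solve(rk4_step, f, 1.0, 0.0, 1.0, 10)      # NFE = 40
```

With 10 steps, RK4 is accurate to well below 1e-6 on this problem while Euler still carries an error of roughly 2e-2, which is why adaptive high-order solvers like DOPRI are the default despite their higher per-sample NFE.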





