CUBIC SPLINE SMOOTHING COMPENSATION FOR IRREGULARLY SAMPLED SEQUENCES

Abstract

The marriage of recurrent neural networks and neural ordinary differential equations (ODE-RNN) is effective for modeling irregularly sampled sequences. While the ODE produces smooth hidden states between observation intervals, the RNN triggers a hidden-state jump whenever a new observation arrives, causing an interpolation discontinuity problem. To address this issue, we propose cubic spline smoothing compensation, a stand-alone module applied to either the output or the hidden state of ODE-RNN that can be trained end-to-end. We derive its analytical solution and provide a theoretical bound on its interpolation error. Extensive experiments demonstrate its merits over both ODE-RNN and cubic spline interpolation.

1. INTRODUCTION

Recurrent neural networks (RNNs) are commonly used for modeling regularly sampled sequences (Cho et al., 2014). However, a standard RNN processes discrete series without considering the unequal temporal intervals between sample points, so it fails to model the irregularly sampled time series commonly seen in domains such as healthcare (Rajkomar et al., 2018) and finance (Fagereng & Halvorsen, 2017). While some works adapt RNNs to handle such irregular scenarios, they often assume an exponential decay (of either the output or the hidden state) during the interval between observations (Che et al., 2018; Cao et al., 2018), which may not always hold. To remove the exponential-decay assumption and better model the underlying dynamics, Chen et al. (2018) proposed the neural ordinary differential equation (ODE) to model the continuous dynamics of hidden states during observation intervals. Leveraging a learnable ODE parametrized by a neural network, their method offers higher modeling capability and flexibility. However, an ODE determines its trajectory entirely by the initial state and cannot adjust the trajectory according to subsequent observations. A popular way to leverage subsequent observations is ODE-RNN (Rubanova et al., 2019; De Brouwer et al., 2019), which updates the hidden state upon observations using an RNN and evolves the hidden state with an ODE between observations. While the ODE produces smooth hidden states between observation intervals, the RNN triggers a hidden-state jump at each observation point. This inconsistency (discontinuity) is hard to reconcile and jeopardizes continuous time series modeling, especially for interpolation tasks (Fig. 1, top left). We propose a Cubic Spline Smoothing Compensation (CSSC) module to tackle this challenging discontinuity problem; it is especially suitable for continuous time series interpolation.
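The exponential-decay assumption mentioned above can be made concrete with a small sketch. This is a simplified scalar version, not the papers' exact parameterization (GRU-D learns a per-feature decay rate); the name `decay_impute` and the fixed rate `w` are illustrative:

```python
import numpy as np

def decay_impute(x_last, x_mean, dt, w=1.0):
    """Impute a value dt after the last observation (GRU-D-style).

    The estimate relaxes from the last observed value x_last toward the
    empirical mean x_mean as the gap dt grows; gamma in (0, 1] is the
    decay factor and w is the (here fixed, normally learned) decay rate.
    """
    gamma = np.exp(-np.maximum(0.0, w * np.asarray(dt, dtype=float)))
    return gamma * x_last + (1.0 - gamma) * x_mean
```

At dt = 0 the estimate equals the last observation; for large gaps it falls back to the empirical mean. It is exactly this fixed functional form that the neural-ODE line of work replaces with learned dynamics.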
Our CSSC employs the cubic spline as a compensation for the ODE-RNN output to eliminate the jump, as illustrated in Fig. 1, top right. While the latent ODE (Rubanova et al., 2019) with an encoder-decoder structure can also produce continuous interpolations, CSSC further ensures that the interpolated curve passes strictly through the observation points. Importantly, we derive a closed-form solution for CSSC and obtain its interpolation error bound. The bound suggests two key factors for good interpolation: the time interval between observations and the performance of the underlying ODE-RNN. Furthermore, we propose the hidden CSSC, which compensates the hidden state of ODE-RNN (Fig. 1, bottom); it not only assuages the discontinuity problem but is also more efficient when the observations are high-dimensional and continuous only at the semantic level. We conduct extensive experiments and ablation studies to demonstrate the effectiveness of CSSC and hidden CSSC, both of which outperform competing methods.
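The compensation idea can be sketched on a single interval: the raw ODE-RNN output misses the observations at both endpoints, and a smooth correction that interpolates the endpoint residuals removes the jump. The paper's CSSC has an analytical cubic-spline solution with cross-interval smoothness conditions; the zero-slope cubic blend below (`cssc_interval` is an illustrative name, not the paper's formulation) only conveys the idea:

```python
import numpy as np

def cssc_interval(t_grid, y_raw, t0, t1, r0, r1):
    """Compensate raw ODE-RNN outputs y_raw on one interval [t0, t1].

    r0 = observation at t0 minus the raw output there (after the RNN jump);
    r1 = observation at t1 minus the raw output there (before the next jump).
    A cubic blend with zero endpoint slopes carries the correction across
    the interval, so the compensated curve hits both observations exactly.
    """
    s = (np.asarray(t_grid, dtype=float) - t0) / (t1 - t0)
    w = 3 * s**2 - 2 * s**3  # smoothstep: w(0)=0, w(1)=1, w'(0)=w'(1)=0
    return np.asarray(y_raw, dtype=float) + (1 - w) * r0 + w * r1
```

Because the correction equals r0 at t0 and r1 at t1, the left limit at the next observation already matches that observation, which is what absorbs the jump.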

2. RELATED WORK

Spline interpolation is a practical way to construct smooth curves through a set of points (De Boor et al., 1978), even when the points are unequally spaced. Cubic spline interpolation uses piecewise third-order polynomials to avoid Runge's phenomenon (Runge, 1901) and is a classical way to impute missing data (Che et al., 2018).

Recent literature focuses on adapting RNNs to model irregularly sampled time series, given their strong modeling ability. Since standard RNNs process discrete series without considering the unequal temporal intervals between sample points, several improvements have been proposed. One solution is to augment the input with an observation mask or to concatenate it with the time lag ∆t, expecting the network to use the interval information in an unconstrained manner (Lipton et al., 2016; Mozer et al., 2017). While such a flexible structure can achieve good performance under some circumstances (Mozer et al., 2017), a more popular way is to use prior knowledge for missing-data imputation. GRU-D (Che et al., 2018) imputes missing values with a weighted sum of an exponentially decayed previous observation and the empirical mean. Shukla & Marlin (2019) employ a radial basis function kernel to construct an interpolation network. Cao et al. (2018) let the hidden state decay exponentially at non-observed time points and use a bi-directional RNN for temporal modeling.

Another track is probabilistic generative models. Owing to their ability to model the uncertainty of missing data, Gaussian processes (GPs) have been adopted for missing-data imputation (Futoma et al., 2017; Tan et al., 2020; Moor et al., 2019). However, this approach introduces several hyperparameters, such as the covariance function, making it hard to tune in practice. Neural processes (Garnelo et al., 2018) eliminate such constraints by introducing a global latent variable that represents the whole process. Generative adversarial networks have also been adopted for imputation (Luo et al., 2018).

While the ODE evolves smooth hidden states between observation intervals, the RNN update triggers a jump of the hidden state at each observation point, leaving the hidden state discontinuous along the trajectory; this discontinuity is hard to reconcile and jeopardizes the modeling of continuous time series, especially for interpolation tasks. The neural CDE (Kidger et al., 2020) instead drives the hidden state continuously with a controlled differential equation.
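Classical cubic spline interpolation, as referenced above, can be implemented directly. The sketch below builds a natural cubic spline (zero second derivative at both ends) over possibly unequally spaced knots by solving the standard tridiagonal system for the knot second derivatives; `natural_cubic_spline` is our name for this textbook construction, not a function from any library:

```python
import numpy as np

def natural_cubic_spline(xs, ys):
    """Natural cubic spline through (xs, ys); xs may be unequally spaced.

    Returns a callable evaluating the piecewise cubic. Natural boundary
    conditions: the second derivative vanishes at both endpoints.
    """
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    n = len(xs) - 1           # number of intervals
    h = np.diff(xs)           # interval lengths
    # Tridiagonal system for the interior second derivatives M_1..M_{n-1}.
    A = np.zeros((n - 1, n - 1))
    rhs = np.zeros(n - 1)
    for i in range(1, n):
        A[i - 1, i - 1] = 2 * (h[i - 1] + h[i])
        if i > 1:
            A[i - 1, i - 2] = h[i - 1]
        if i < n - 1:
            A[i - 1, i] = h[i]
        rhs[i - 1] = 6 * ((ys[i + 1] - ys[i]) / h[i]
                          - (ys[i] - ys[i - 1]) / h[i - 1])
    M = np.zeros(n + 1)       # M_0 = M_n = 0 (natural boundary)
    M[1:n] = np.linalg.solve(A, rhs)

    def evaluate(x):
        # Locate the interval containing x and evaluate its cubic piece.
        i = int(np.clip(np.searchsorted(xs, x, side='right') - 1, 0, n - 1))
        t0, t1 = xs[i], xs[i + 1]
        return (M[i] * (t1 - x) ** 3 / (6 * h[i])
                + M[i + 1] * (x - t0) ** 3 / (6 * h[i])
                + (ys[i] / h[i] - M[i] * h[i] / 6) * (t1 - x)
                + (ys[i + 1] / h[i] - M[i + 1] * h[i] / 6) * (x - t0))

    return evaluate
```

By construction the spline passes exactly through every knot, the property CSSC also enforces on the compensated output.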

Recently, neural ODEs (Chen et al., 2018) utilize a continuous state-transfer function parameterized by a neural network to learn temporal dynamics. Rubanova et al. (2019) combine an RNN and an ODE to reconcile new observations with latent-state evolution between observations. De Brouwer et al. (2019) update the hidden state at observations with a GRU structure and Bayesian inference.
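The ODE-RNN recursion just described alternates continuous evolution with discrete updates. The numpy sketch below (Euler integration, toy tanh dynamics, illustrative names and fixed random weights throughout, rather than anything trained) makes the hidden-state jump at observation times explicit:

```python
import numpy as np

rng = np.random.default_rng(0)
H, D = 4, 1  # hidden and observation dimensions (illustrative)
Wf = rng.normal(scale=0.1, size=(H, H))      # toy ODE dynamics weights
Wu = rng.normal(scale=0.1, size=(H, H + D))  # toy RNN update weights

def ode_step(h, dt, n=10):
    """Evolve h smoothly with a toy ODE dh/dt = tanh(Wf h), n Euler steps."""
    for _ in range(n):
        h = h + (dt / n) * np.tanh(Wf @ h)
    return h

def rnn_update(h, x):
    """Discrete update at an observation: h <- tanh(Wu [h; x])."""
    return np.tanh(Wu @ np.concatenate([h, x]))

def ode_rnn(ts, xs):
    """Return the hidden state right after each observation (ts, xs)."""
    h, t_prev, hs = np.zeros(H), ts[0], []
    for t, x in zip(ts, xs):
        h = ode_step(h, t - t_prev)  # smooth evolution over the gap
        h = rnn_update(h, x)         # discontinuous jump at the observation
        hs.append(h)
        t_prev = t
    return np.stack(hs)
```

The state just before an observation (the ODE limit) and just after it (the RNN update) generally differ, which is precisely the discontinuity CSSC compensates.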

Figure 1: Illustration of ODE-RNN and our methods. Top left: ODE-RNN, whose interpolation curve jumps at the observation points. Top right: the output smoothed by our CSSC, where the jump is eliminated and the output passes strictly through the observations. Bottom left: the discontinuous output of ODE-RNN caused by the hidden-state discontinuity. Bottom right: our hidden CSSC applied to the hidden state of ODE-RNN, yielding a smooth hidden state and hence a smooth output.

