KOOPMAN NEURAL FORECASTER FOR TIME SERIES WITH TEMPORAL DISTRIBUTION SHIFTS

Abstract

Temporal distributional shifts, in which the underlying dynamics change over time, frequently occur in real-world time series and pose a fundamental challenge for deep neural networks (DNNs). In this paper, we propose a novel deep sequence model for time series forecasting based on Koopman theory: the Koopman Neural Forecaster (KNF), which leverages DNNs to learn the linear Koopman space and the coefficients of chosen measurement functions. KNF imposes appropriate inductive biases for improved robustness against distributional shifts, employing both a global operator to learn shared characteristics and a local operator to capture changing dynamics, as well as a specially-designed feedback loop that continuously updates the learnt operators over time to track rapidly varying behaviors. We demonstrate that KNF achieves superior performance compared to the alternatives on multiple time series datasets that are shown to suffer from distribution shifts.

1. INTRODUCTION

Temporal distribution shifts frequently occur in real-world time-series applications, from forecasting stock prices, to detecting and monitoring sensory measures, to predicting sales driven by fashion trends. Such distribution shifts over time may be due to the data being generated in a highly-dynamic and non-stationary environment, abrupt changes that are difficult to predict, or constantly evolving trends in the underlying data distribution (Gama et al., 2014). Temporal distribution shifts pose a fundamental challenge for time-series forecasting (Kuznetsov & Mohri, 2020). There are two scenarios of distribution shifts. When the distribution shifts occur only between the training and test domains, meta-learning and transfer learning approaches (Jin et al., 2021; Oreshkin et al., 2021) have been developed. The other scenario is much more challenging: distribution shifts occurring continuously over time. This scenario is closely related to "concept drift" (Lu et al., 2018) and non-stationary processes (Dahlhaus, 1997) but has received less attention from the deep learning community. In this work, we focus on the second scenario. To tackle temporal distribution shifts, various statistical estimation methods have been studied, including spectral density analysis (Dahlhaus, 1997), sample reweighting (Bennett & Clarkson, 2022; McCarthy & Jensen, 2016) and Bayesian state-space models (West & Harrison, 2006). However, these methods are limited to low-capacity auto-regressive models and are typically designed for short-horizon forecasting. For large-scale complex time series data, deep learning models (Oreshkin et al., 2021; Woo et al., 2022; Tonekaboni et al., 2022; Zhou et al., 2022) now increasingly outperform traditional statistical methods. Yet, most deep learning approaches are designed for stationary time-series data (under an i.i.d. assumption), such as electricity usage, sales and air quality, that have clear seasonal and trend patterns.
Under distribution shifts, DNNs have been shown to be problematic for forecasting on data with varying distributions (Kouw & Loog, 2018; Wang et al., 2021). DNNs are black-box models and often require a large number of samples to learn. For time series with continuous distribution shifts, the number of samples from any given distribution is small, so DNNs struggle to adapt to the changing distribution. Furthermore, the non-linear dependencies in a DNN are difficult to interpret or manipulate: directly modifying the parameters based on a change in dynamics may lead to undesirable effects (Vlachas et al., 2020). Therefore, if we can reduce non-linearity and simplify dynamics modeling, we can model time series in a much more interpretable and robust manner. Koopman theory (Koopman, 1931) provides convenient tools to simplify dynamics modeling. It states that any nonlinear dynamics can be modeled by a linear Koopman operator acting on the space of measurement functions (Brunton et al., 2021), so the dynamics can be manipulated by simply modifying the Koopman matrix. In this paper, we propose a novel approach based on Koopman theory for accurate forecasting of time series with distribution shifts: the Koopman Neural Forecaster (KNF). Our model has three main features: 1) We combine predefined measurement functions with learnable coefficients to introduce appropriate inductive biases into the model. 2) Our model employs both global and local Koopman operators to approximate the forward dynamics: the global operator learns the shared characteristics, while the local operator captures the local changing dynamics. 3) We also integrate a feedback loop that continuously updates the learnt operators over time based on the current prediction error, to cope with distribution shifts and maintain the model's long-term forecasting accuracy.
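The core Koopman idea — nonlinear dynamics becoming linear on a space of measurement functions — can be illustrated with a small numerical sketch. This is not the paper's architecture: it uses a classic toy system that admits an exact finite-dimensional embedding, and recovers the linear operator with a DMD/EDMD-style least-squares fit; all names and values below are illustrative.

```python
import numpy as np

# Toy nonlinear system with an exact finite-dimensional Koopman embedding:
#   x_{t+1} = a * x_t
#   y_{t+1} = b * y_t + c * x_t**2
# Nonlinear in (x, y), but linear in the measurements g(x, y) = (x, y, x**2),
# since x_{t+1}**2 = a**2 * x_t**2.
a, b, c = 0.9, 0.5, 1.0

def step(x, y):
    return a * x, b * y + c * x**2

def lift(x, y):
    # predefined measurement functions: identity and a polynomial
    return np.array([x, y, x**2])

# Simulate a trajectory and stack the lifted snapshots.
xs, ys = [1.0], [0.3]
for _ in range(50):
    x, y = step(xs[-1], ys[-1])
    xs.append(x)
    ys.append(y)
G = np.stack([lift(x, y) for x, y in zip(xs, ys)])  # shape (51, 3)

# Fit a linear Koopman operator K by least squares: G[t+1] ≈ G[t] @ K.T
M, *_ = np.linalg.lstsq(G[:-1], G[1:], rcond=None)
K = M.T

# One-step prediction in the lifted space matches the true dynamics.
pred = K @ lift(xs[0], ys[0])
true = lift(*step(xs[0], ys[0]))
print(np.allclose(pred, true, atol=1e-6))  # True
```

Because the embedding is exact here, the fitted `K` reproduces the nonlinear dynamics with a single matrix multiply; KNF's contribution is to make the measurement coefficients learnable and the operator adaptive rather than fixed.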
Leveraging Koopman theory brings multiple benefits to time series forecasting with distribution shifts: 1) Using predefined measurement functions (e.g., exponential, polynomial) provides sufficient expressivity for the time series without requiring a large number of samples. 2) Since the Koopman operator is linear, it is much easier to analyze and manipulate. For instance, we can perform spectral analysis and examine its eigenfunctions to better understand the frequencies of oscillation. 3) Our feedback loop makes the Koopman operator adaptive to non-stationary environments. This is fundamentally different from previous works that learn a single, fixed Koopman operator (Han et al., 2020; Takeishi et al., 2017; Azencot et al., 2020). In summary, our major contributions include:
• Proposing a novel deep forecasting model based on Koopman theory for time-series data with temporal distributional shifts.
• An approach that allows the Koopman matrix to both capture the global behaviors and evolve over time to adapt to local changing distributions.
• Demonstrating state-of-the-art performance on highly non-stationary time series datasets, including M4, cryptocurrency return forecasting and sports player trajectory prediction.
• Generating interpretable insights into the model's behavior via eigenvalues and eigenfunctions of the Koopman operators.
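Because the operator is a plain matrix, the spectral analysis mentioned in point 2) amounts to an eigendecomposition: eigenvalue moduli indicate decaying, persistent or growing modes, and eigenvalue angles give oscillation frequencies. A minimal sketch with a hypothetical 2×2 operator (the values are illustrative, not learned by KNF):

```python
import numpy as np

# Hypothetical learned Koopman matrix: a damped rotation,
# i.e. an oscillatory mode that slowly decays.
r, theta = 0.95, 0.3
K = r * np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

eigvals = np.linalg.eigvals(K)   # r * exp(±i * theta)
moduli = np.abs(eigvals)         # <1: decaying, =1: persistent, >1: growing
freqs = np.angle(eigvals)        # radians per step: oscillation frequency
print(moduli, freqs)             # moduli ≈ [0.95, 0.95], freqs ≈ [±0.3]
```

Tracking how these eigenvalues drift as the local operator is updated offers one way to visualize a changing distribution.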

2. RELATED WORK

DNNs for time-series forecasting. DNNs are shown to increasingly outperform traditional statistical methods, such as exponential smoothing (ETS) (Gardner Jr, 1985) and ARIMA (Ariyo et al., 2014), for time series forecasting. For example, Tonekaboni et al. (2022); Wang et al. (2019) proposed to use DNNs to learn the local and global representations of time series separately, showing high accuracy on sales and weather data. Woo et al. (2022) leverages inductive biases in different architectures and a specially-designed contrastive loss to learn disentangled seasonal and trend representations. Sen et al. (2019) utilized a global TCN to avoid normalization before training when there are wide variations in scale, but it focuses mainly on better modeling the relationships between time series rather than on modeling the dynamics over time, as ours does. Transformer-based approaches are particularly effective in time series forecasting, especially on datasets such as electricity and traffic (Zhou et al., 2022; Wu et al., 2021; Zhou et al., 2021), which are relatively stationary and have clear seasonality and trend dynamics.

Robustness against temporal distribution shifts. Non-stationarity poses a great challenge for time series forecasting. To cope with varying distributions, one approach is to stationarize the input data. Kim et al. (2022) proposes a reversible instance normalization technique applied to the data to alleviate the temporal distribution shift problem. Similarly, Passalis et al. (2019) utilizes a DNN to adaptively stationarize input time series. However, these approaches do not improve the generalizability of the DNNs themselves. Liu et al. (2022) proposes a normalization-denormalization technique to stationarize time series, but only for transformer-based models. Arik et al. (2022) proposes test-time adaptation with a self-supervised objective to better adapt against distribution shifts. Another line of work combines DNNs and statistical approaches for better accuracy on non-stationary time series data (Makridakis et al., 2020; Malinin et al., 2021). Smyl (2020) combines ETS with an RNN, where the seasonality and
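The instance-normalization idea behind the stationarization approaches above can be sketched as follows. This is a simplified illustration only: it omits the learnable affine parameters of the actual method of Kim et al. (2022), and `forecast_with_instance_norm` and the placeholder model are hypothetical names.

```python
import numpy as np

def forecast_with_instance_norm(x, model, eps=1e-5):
    """Normalize each input window by its own statistics, forecast in the
    normalized space, then restore the statistics on the output (reversible
    instance normalization, simplified)."""
    mu, sigma = x.mean(), x.std() + eps
    y_norm = model((x - mu) / sigma)  # forecast on the stationarized window
    return y_norm * sigma + mu        # denormalize: reapply shift and scale

# Usage with a trivial placeholder "model" that repeats the last value:
window = np.array([10.0, 12.0, 11.0, 13.0])
pred = forecast_with_instance_norm(window, lambda z: np.repeat(z[-1], 3))
```

Because normalization and denormalization use the same per-window statistics, a shift in level or scale between windows is removed before the forecaster sees the data and restored afterwards; the forecaster itself is unchanged, which is why such methods do not by themselves improve a DNN's generalizability.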

Code availability: https://github.com/google-research/google-research/tree/