DISCRETE GRAPH STRUCTURE LEARNING FOR FORECASTING MULTIPLE TIME SERIES

Abstract

Time series forecasting is an extensively studied subject in statistics, economics, and computer science. Exploring the correlation and causation among the variables in a multivariate time series shows promise in enhancing the performance of a time series model. When deep neural networks are used as forecasting models, we hypothesize that exploiting the pairwise information among multiple (multivariate) time series also improves their forecasts. If an explicit graph structure is known, graph neural networks (GNNs) have been demonstrated to be powerful tools for exploiting it. In this work, we propose learning the structure simultaneously with the GNN when the graph is unknown. We cast the problem as learning a probabilistic graph model by optimizing the mean performance over the graph distribution. The distribution is parameterized by a neural network so that discrete graphs can be sampled differentiably through reparameterization. Empirical evaluations show that our method is simpler, more efficient, and better performing than a recently proposed bilevel learning approach for graph structure learning, as well as a broad array of forecasting models, whether deep or non-deep learning based and whether graph or non-graph based.
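
The idea of sampling discrete graphs differentiably through reparameterization can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each edge is an independent Bernoulli variable and uses the binary Gumbel-softmax (binary concrete) relaxation, which the abstract does not name. The function names, the temperature value, and the use of NumPy are hypothetical choices made for illustration.

```python
import numpy as np


def sample_relaxed_adjacency(edge_logits, temperature=0.5, rng=None):
    """Draw a relaxed (continuous) sample of a Bernoulli adjacency matrix.

    edge_logits[i, j] is the log-odds that edge (i, j) exists. In the
    setting described above these logits would be produced by a neural
    network; here they are simply an input array (an assumption).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Logistic noise equals the difference of two independent Gumbel draws.
    u = rng.uniform(low=1e-9, high=1.0 - 1e-9, size=edge_logits.shape)
    logistic_noise = np.log(u) - np.log1p(-u)
    # The sample is a smooth function of edge_logits: all randomness sits
    # in the noise term, which is what the reparameterization provides.
    return 1.0 / (1.0 + np.exp(-(edge_logits + logistic_noise) / temperature))


def harden(relaxed_adjacency):
    """Threshold a relaxed sample into a discrete 0/1 adjacency matrix."""
    return (relaxed_adjacency > 0.5).astype(np.float64)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(4, 4))  # stand-in for network outputs
    relaxed = sample_relaxed_adjacency(logits, temperature=0.5, rng=rng)
    print(harden(relaxed))
```

In an automatic-differentiation framework, the gradient of a downstream forecasting loss would flow through the relaxed sample back to the logits; lowering the temperature pushes the relaxed entries closer to 0 or 1.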

1. INTRODUCTION

Time series data are widely studied in areas of science and engineering that involve temporal measurements. Time series forecasting is concerned with predicting future values from values observed in the past. It has played important roles in climate studies, market analysis, traffic control, and energy grid management (Makridakis et al., 1997) and has inspired the development of various predictive models that capture the temporal dynamics of the underlying system. These models range from early autoregressive approaches (Hamilton, 1994; Asteriou & Hall, 2011) to recent deep learning methods (Seo et al., 2016; Li et al., 2018; Yu et al., 2018; Zhao et al., 2019). Analysis of univariate time series (a single longitudinal variable) has been extended to multivariate time series and to multiple (univariate or multivariate) time series. Multivariate forecasting models derive strong predictive power from the interdependency (and even causal relationships) among the variables. The vector autoregressive model (Hamilton, 1994) is an example of multivariate analysis, wherein the coefficient magnitudes offer hints about the Granger causality (Granger, 1969) of one variable to another. For multiple time series, pairwise similarities or connections among them have also been explored to improve forecasting accuracy (Yu et al., 2018). An example is a traffic network in which each node denotes a time series recorded by a particular sensor. The spatial connections of the roads offer insights into how traffic dynamics propagate along the network. Several graph neural

