SPECTRANET: MULTIVARIATE FORECASTING AND IMPUTATION UNDER DISTRIBUTION SHIFTS AND MISSING DATA

Abstract

In this work, we tackle two widespread challenges in real-world applications of time-series forecasting that have been largely understudied: distribution shifts and missing data. We propose SpectraNet, a novel multivariate time-series forecasting model that dynamically infers a latent space spectral decomposition to capture current temporal dynamics and correlations in the recently observed history. A Convolutional Neural Network maps the learned representation by sequentially mixing its components and refining the output. Our proposed approach can simultaneously produce forecasts and interpolate past observations and can therefore greatly simplify production systems by unifying the imputation and forecasting tasks in a single model. SpectraNet achieves state-of-the-art (SoTA) performance on both tasks simultaneously on five benchmark datasets, compared to forecasting and imputation models, with up to 92% fewer parameters and comparable training times. In settings with up to 80% missing data, SpectraNet achieves average performance improvements of almost 50% over the second-best alternative.

1. INTRODUCTION

Multivariate time-series forecasting is an essential task in a wide range of domains. Forecasts are a key input to optimize the production and distribution of goods (Böse et al., 2017), predict healthcare patient outcomes (Chen et al., 2015), plan electricity production (Olivares et al., 2022), and build financial portfolios (Emerson et al., 2019), among other examples. Because of these potential benefits, researchers have dedicated considerable effort to improving the capabilities of forecasting models, with breakthroughs in model architectures and performance (Benidis et al., 2022).

The main focus of research in multivariate forecasting has been on accuracy and scalability, to which the present paper contributes. In addition, we identify two widespread challenges for real applications that have been largely understudied: distribution shifts and missing data. By distribution shifts we mean changes in the time-series behavior. In particular, we focus on discrepancies in distribution between the train and test data, which can considerably degrade accuracy (Kuznetsov & Mohri, 2014; Du et al., 2021; Xu et al., 2022; Ivanovic et al., 2022). This problem has grown in recent years with the COVID-19 pandemic, which disrupted all aspects of human activity. Missing values are a pervasive problem as well. Common causes include faulty sensors, the impossibility of gathering data, and misplacement of information. As we demonstrate in our experiments, these challenges hinder the performance of current state-of-the-art (SoTA) models, limiting their use in applications where these problems are predominant.

In this work, we propose SpectraNet, a novel multivariate forecasting model that achieves SoTA performance on benchmark datasets and is also intrinsically robust to distribution shifts and extreme cases of missing data.
SpectraNet achieves its high accuracy and robustness by dynamically inferring a latent vector projected on a temporal basis, a process we name latent space spectral decomposition (LSSD). A series of convolution layers then synthesizes both the reference window, which is used to infer the latent vectors, and the forecast window. To the best of our knowledge, SpectraNet is also the first solution that can simultaneously forecast the future values of a multivariate time series and accurately impute the past missing data.
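The general idea of inferring a latent representation from the observed entries of a reference window, and then synthesizing both that window and the forecast window from it, can be illustrated with a minimal sketch. The sketch is not SpectraNet itself. It assumes a fixed sine/cosine temporal basis, uses a ridge least-squares fit as a stand-in for the model's inference procedure, and omits the convolution layers entirely. The function names, the `period` and `ridge` parameters, and the synthetic data are all hypothetical.

```python
import numpy as np


def temporal_basis(n_steps, n_components, period):
    """Fixed sine/cosine basis over n_steps time points (an assumed choice)."""
    t = np.arange(n_steps)[:, None]
    k = np.arange(1, n_components + 1)[None, :]
    angles = 2 * np.pi * k * t / period
    ones = np.ones((n_steps, 1))
    return np.concatenate([ones, np.cos(angles), np.sin(angles)], axis=1)


def infer_latent(y_ref, mask, basis_ref, ridge=1e-3):
    """Fit per-series basis coefficients using only the observed entries."""
    n_basis = basis_ref.shape[1]
    n_series = y_ref.shape[1]
    z = np.zeros((n_basis, n_series))
    for j in range(n_series):
        obs = mask[:, j]                      # boolean: True where observed
        B = basis_ref[obs]                    # basis rows at observed steps
        A = B.T @ B + ridge * np.eye(n_basis)
        z[:, j] = np.linalg.solve(A, B.T @ y_ref[obs, j])
    return z


def impute_and_forecast(y_ref, mask, horizon, n_components=3, period=24):
    """Reconstruct the reference window and extrapolate the forecast window."""
    n_ref = y_ref.shape[0]
    basis = temporal_basis(n_ref + horizon, n_components, period)
    z = infer_latent(y_ref, mask, basis[:n_ref])
    y_hat = basis @ z                         # one latent code, both windows
    return y_hat[:n_ref], y_hat[n_ref:]


# Synthetic example: two periodic series with roughly half the history missing.
n_ref, horizon = 96, 24
t = np.arange(n_ref + horizon)
truth = np.stack(
    [1 + 2 * np.sin(2 * np.pi * t / 24), 0.5 * np.cos(2 * np.pi * 2 * t / 24)],
    axis=1,
)
rng = np.random.default_rng(0)
mask = rng.random((n_ref, 2)) > 0.5
y_obs = np.where(mask, truth[:n_ref], np.nan)  # missing entries are NaN

imputed, forecast = impute_and_forecast(y_obs, mask, horizon)
```

Because a single set of coefficients generates both windows, imputation and forecasting come from the same fit. In this toy setting the masked entries never enter the fit, which is why they can be stored as NaN without affecting the result.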

