LEARNING FAST AND SLOW FOR ONLINE TIME SERIES FORECASTING

Abstract

Despite the recent success of deep learning for time series forecasting, these methods are not scalable for many real-world applications where data arrives sequentially. Training deep neural forecasters on the fly is notoriously challenging because of their limited ability to adapt to non-stationary environments and remember old knowledge. We argue that the fast adaptation capability of deep neural networks is critical and successful solutions require handling changes to both new and recurring patterns effectively. In this work, inspired by the Complementary Learning Systems (CLS) theory, we propose Fast and Slow learning Network (FSNet) as a novel framework to address the challenges of online forecasting. Particularly, FSNet improves the slowly-learned backbone by dynamically balancing fast adaptation to recent changes and retrieving similar old knowledge. FSNet achieves this mechanism via an interaction between two novel complementary components: (i) a per-layer adapter to support fast learning from individual layers, and (ii) an associative memory to support remembering, updating, and recalling repeating events. Extensive experiments on real and synthetic datasets validate FSNet's efficacy and robustness to both new and recurring patterns.

1. INTRODUCTION

Time series forecasting plays an important role in both research and industry. Correctly forecasting time series can greatly benefit various business sectors such as traffic management and electricity consumption (Hyndman & Athanasopoulos, 2018). As a result, tremendous efforts have been devoted to developing better forecasting models (Petropoulos et al., 2020; Bhatnagar et al., 2021; Triebe et al., 2021), with the recent success of deep neural networks (Li et al., 2019; Xu et al., 2021; Yue et al., 2021; Zhou et al., 2021) thanks to their impressive capability to discover hierarchical latent representations and complex dependencies. However, such studies focus on the batch learning setting, which requires the whole training dataset to be available a priori and implies that the relationship between inputs and outputs remains static throughout. This assumption is restrictive in real-world applications, where data arrives in a stream and the input-output relationship can change over time (Gama et al., 2014). In such cases, re-training the model from scratch could be time-consuming. Therefore, it is desirable to train the deep forecaster online (Anava et al., 2013; Liu et al., 2016) using only new samples to capture the changing dynamics of the environment.

Despite the ubiquity of online learning in many real-world applications, training deep forecasters online remains challenging for two reasons. First, naively training deep neural networks on data streams requires many samples to converge (Sahoo et al., 2018; Aljundi et al., 2019a) because the benefits of offline training, such as mini-batches or training for multiple epochs, are not available. Therefore, when a distribution shift happens (Gama et al., 2014), such cumbersome models would require many samples to learn new concepts with satisfactory results. Overall, deep neural networks, although possessing strong representation learning capabilities, lack a mechanism to facilitate successful learning on data streams.
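The online training protocol described above can be illustrated with a minimal sketch. This is not the paper's FSNet; it is a toy linear autoregressive forecaster updated with a single gradient step per incoming observation, under assumed hyperparameters (window size `H`, learning rate `lr`), purely to make the "predict, observe, update" loop concrete:

```python
import numpy as np

def online_forecast(series, H=4, lr=0.1):
    """Yield one-step-ahead predictions, taking one SGD step per sample.

    A toy illustration of online learning (not the paper's method):
    the model sees each observation exactly once, in arrival order.
    """
    w = np.zeros(H)                      # AR coefficients, learned on the fly
    preds = []
    for t in range(H, len(series)):
        x = series[t - H:t]              # most recent window of observations
        y_hat = w @ x                    # predict the next value
        preds.append(y_hat)
        err = y_hat - series[t]          # ground truth arrives; compute error
        w -= lr * err * x                # single gradient step on squared loss
    return np.array(preds)

# The forecaster gradually tracks the signal without ever revisiting old data.
series = np.sin(0.3 * np.arange(200))
preds = online_forecast(series)
```

Because each sample is used only once, convergence is slow relative to offline multi-epoch training, which is exactly the first difficulty the introduction raises for deep forecasters under distribution shift.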
Second, time series data often exhibit recurrent patterns where one pattern


Our code is publicly available at: https://github.com/salesforce/fsnet/.

