MULTIWAVE: MULTIRESOLUTION DEEP ARCHITECTURES THROUGH WAVELET DECOMPOSITION FOR MULTIVARIATE TIMESERIES FORECASTING AND PREDICTION

Abstract

One of the challenges in multivariate time series modeling is that changes in signals occur with different frequencies, even when the sampling rate is consistent across signals. In the case of multivariate time series prediction, the outcome is also determined by patterns of different frequencies. These encapsulate both long-term and short-term effects, which have so far not been sufficiently leveraged by deep learning time series models. We fill this gap by introducing a framework, called MultiWave, which augments any deep learning time series model with components operating at the intrinsic frequencies of the signals. MultiWave applies wavelet decomposition on each signal to obtain subsignals of different frequencies and groups all subsignals in the same frequency band together to train a component. The output of the components is combined through a gating mechanism that removes irrelevant frequencies for the given predictive task. We show that MultiWave accurately determines the informative frequency bands and that the augmented models including components trained to operate on those bands outperform the original models. We further show that applying MultiWave on top of different deep learning models improves their performance in several real-world applications.

1. INTRODUCTION

Multivariate time series prediction has long been a crucial task in machine learning, with important applications in fields such as healthcare, traffic flow, and economic forecasting. However, the final prediction in these applications can depend on many factors, such as information at different frequencies and both long-term and short-term changes in the input signals. Moreover, in many tasks, observations come from multiple sources and are often collected at different sampling rates. Here, we propose a model-agnostic approach that uses multilevel discrete wavelet decomposition to leverage temporal dependencies at different frequencies and scales in multivariate time series data, including data collected at multiple sampling rates (multirate time series data).

There are two important categories of methods for time series analysis: time-domain methods, which treat the time series as a sequence of points ordered in time, and frequency-domain methods, which use transforms such as the Fourier transform and the Z-transform to analyze the original sequence in the frequency spectrum. Deep learning-based methods introduced into time series analysis, such as recurrent neural networks (Williams & Zipser, 1989), convolutional neural networks (CNNs) (Zheng et al., 2016), and more recently transformers (Wen et al., 2022), achieve state-of-the-art results in many applications (Lai et al., 2022; Tipirneni & Reddy, 2021; Huang et al., 2022). However, they have two notable shortcomings in handling multivariate time series data. subsignals with similar frequencies into separate time series models, and then combine the output of the models to make a prediction.

This framework brings the following improvements to multivariate time series modeling: 1) It is model-agnostic: MultiWave can be applied to any neural network-based time series model. 2) It uses the information available in both the time and frequency domains. 3) It reduces the variation in sampling frequency among multiresolution signals that are modeled together. 4) It provides unique insight into which frequencies of the signals are important for a given task.

2.1. NOTATION

We denote multivariate and multirate time series data with $m$ signals collected before time $T$ as a set of signals $X^{1:T} = \{x_1^{1:T}, x_2^{1:T}, \ldots, x_m^{1:T}\}$, where the signals are collected at initial rates $R = \{r_1, r_2, \ldots, r_m\}$. The length of each signal is proportional to the rate at which it is collected: $\mathrm{Len}_i = T r_i$. The problem is as follows: given $X^{1:T}$, we want to predict a label $y$, which can be continuous (regression) or discrete (classification). In the rest of the paper, we drop the time index and write the set of signals as $X$ and signal $i$ as $x_i$. We denote the sampling rate of a signal $x$ by $f_s(x)$.
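To make the notation concrete, the following minimal sketch builds a multirate set of signals and checks their lengths. It assumes that each rate is expressed in samples per unit time, so that a signal observed up to time T at rate r has T * r samples; the helper names `signal_length` and `make_multirate_set` and the zero-filled placeholder values are hypothetical and used only for illustration.

```python
from typing import Dict, List, Sequence


def signal_length(T: float, rate: float) -> int:
    """Number of samples collected before time T.

    Assumption: `rate` is in samples per unit time, so the length is T * rate.
    """
    return int(round(T * rate))


def make_multirate_set(T: float, rates: Sequence[float]) -> Dict[str, List[float]]:
    """Build a placeholder set X = {x_1, ..., x_m} with one signal per rate.

    The values are zeros; only the lengths matter for this illustration.
    """
    return {
        f"x_{i + 1}": [0.0] * signal_length(T, r)
        for i, r in enumerate(rates)
    }


# Three signals observed over the same window T = 10 at different rates.
X = make_multirate_set(T=10, rates=[1, 4, 0.5])
lengths = {name: len(values) for name, values in X.items()}
```

Under this assumption the three signals share the same time window but have 10, 40, and 5 samples respectively, which is the mismatch in length that a multirate model has to handle.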

2.2. MULTILEVEL DISCRETE WAVELET DECOMPOSITION

We use wavelet decomposition to break down the signals into different frequencies. Wavelet decompositions (Daubechies, 1992) are well-known methods for capturing information in time series in both the time and frequency domains. They have been used successfully as a preprocessing step for neural networks (Liu et al., 2013; Wang et al., 2020a) and as an integral part of them (Subasi et al., 2006; Zhang et al., 1995; Wang et al., 2018; Kumar et al., 2021). Multilevel discrete wavelet decomposition can extract multilevel time-frequency features from a time series by iteratively applying low-pass and high-pass filters derived from wavelets to the signal. The formula for this decomposition is given below:

$$x(t) \approx \sum_k A_{L,k}\,\phi_{L,k}(t) + \sum_k D_{L,k}\,\Psi_{L,k}(t) + \sum_k D_{L-1,k}\,\Psi_{L-1,k}(t) + \ldots + \sum_k D_{1,k}\,\Psi_{1,k}(t)$$

$\Psi_{s,\tau}$ is the mother wavelet with scale $s$ and time $\tau$, and $\phi$ is the father wavelet. This multilevel wavelet decomposition converts the input signal $x(t)$ into signals $A_L = \sum_k A_{L,k}$, which is a coarse general approximation of the signal (low frequency), and the detail coefficients $D_L = \sum_k D_{L,k}$, $D_{L-1} = \sum_k D_{L-1,k}$, $\ldots$, $D_1 = \sum_k D_{1,k}$ that influence the function on various scales. Figure 1 depicts this decomposition. To simplify the notation, we show the decomposition of a signal $x$ as a set



Figure 1: Multilevel discrete wavelet decomposition. The image on the left shows how low-pass and high-pass filters are used to decompose signals in multilevel discrete wavelet decomposition; the image on the right shows a signal being decomposed with the Haar wavelet. The resulting subsignals are zero except for the one matching the true frequency of the original signal.
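As a minimal sketch of the filter cascade depicted in Figure 1, the following pure-Python code applies the Haar low-pass and high-pass filters iteratively to the approximation signal. The function names `haar_step` and `haar_multilevel`, the orthonormal 1/sqrt(2) scaling, and the requirement that the signal length be divisible by 2 to the power of the number of levels are assumptions made for illustration; they are not taken from the MultiWave implementation.

```python
import math
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the Haar transform on an even-length signal.

    Returns (approximation, detail), each half as long as the input.
    The low-pass output is a scaled pairwise sum, the high-pass output
    a scaled pairwise difference.
    """
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_multilevel(signal: Sequence[float], levels: int) -> List[List[float]]:
    """Multilevel Haar decomposition.

    Returns [A_L, D_L, D_{L-1}, ..., D_1], where L = levels.
    """
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    approx = list(signal)
    for _ in range(levels):
        # Only the approximation is decomposed further at the next level.
        approx, detail = haar_step(approx)
        details.append(detail)  # collected as D_1, D_2, ..., D_L
    return [approx] + details[::-1]


# A signal that alternates at the highest representable frequency.
x = [1.0, -1.0] * 4
A3, D3, D2, D1 = haar_multilevel(x, levels=3)
```

For this alternating input, only the finest detail band `D1` is nonzero, while `A3`, `D3`, and `D2` are all zero. This mirrors the behavior described in the caption, where the only nonzero subsignal is the one matching the frequency of the original signal.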

