HIDDEN MARKOV MIXTURE OF GAUSSIAN PROCESS FUNCTIONAL REGRESSION: UTILIZING MULTI-SCALE STRUCTURE FOR TIME-SERIES FORECASTING

Abstract

The mixture of Gaussian process functional regressions (GPFRs) assumes that a batch of time-series or sample curves is generated by independent random processes with different temporal structures. In real situations, however, these structures actually transition from one to another in a random manner at a long time scale, so the assumption of independent curves does not hold in practice. To overcome this limitation, we propose the hidden Markov based GPFR mixture model (HM-GPFR), which describes these curves with both fine level and coarse level temporal structures. Specifically, the temporal structure is described by the Gaussian process model at the fine level and by a hidden Markov process at the coarse level. The whole model can be regarded as a random process with state switching dynamics. To further enhance the robustness of the model, we also place prior distributions on the model parameters and develop the Bayesian hidden Markov based GPFR mixture model (BHM-GPFR). Experimental results demonstrate that the proposed methods have both high prediction accuracy and good interpretability.

1. INTRODUCTION

The time-series considered in this paper have a multi-scale structure consisting of a coarse level and a fine level. We have observations $(y_1, \ldots, y_T)$, where each $y_t = (y_{t,1}, \ldots, y_{t,L})$ is itself a time-series of length $L$. The whole time-series is arranged as

$$y_{1,1}, y_{1,2}, \ldots, y_{1,L},\; y_{2,1}, y_{2,2}, \ldots, y_{2,L},\; \ldots,\; y_{T,1}, y_{T,2}, \ldots, y_{T,L}. \tag{1}$$

The subscripts of $\{y_t\}_{t=1}^{T}$ are called coarse level indices, while the subscripts of $\{y_{t,i}\}_{i=1}^{L}$ are called fine level indices. Throughout this paper, we take the electricity load dataset as a concrete example. The electricity load dataset consists of $T = 365$ consecutive daily records, and each day contains $L = 96$ samples recorded every quarter-hour. In this example, the coarse level indices denote the day, while the fine level indices correspond to a time resolution of 15 minutes. The aim is to forecast both short-term and long-term electricity loads based on historical records. There may be partial observations $y_{T+1,1}, \ldots, y_{T+1,M}$ with $M < L$, so the entire observed time-series has the form

$$y_{1,1}, y_{1,2}, \ldots, y_{1,L},\; y_{2,1}, y_{2,2}, \ldots, y_{2,L},\; \ldots,\; y_{T,1}, y_{T,2}, \ldots, y_{T,L},\; y_{T+1,1}, \ldots, y_{T+1,M}. \tag{2}$$

The task is to predict a future response $y_{t^*,i^*}$, where $t^* \geq T + 1$ and $1 \leq i^* \leq L$ are positive integers.

The coarse level and the fine level provide different structural information about the data generation process. At the coarse level, each $y_t$ can be regarded as a time-series, and there is a certain cluster structure (Shi & Wang, 2008; Wu & Ma, 2018) underlying these time-series $\{y_t\}_{t=1}^{T}$: we can divide $\{y_t\}_{t=1}^{T}$ into groups such that the time-series within each group share a similar evolving trend. In the electricity load dataset, such groups correspond to different electricity consumption patterns. We use $z_t$ to denote the cluster label of $y_t$.
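As a minimal sketch of the two-level indexing described above (an illustration only, not code from the paper), the following Python snippet splits a flat series of the form (2) into T complete curves of length L plus a partial curve of length M, and maps a pair of coarse and fine indices to a position in the flat series. The synthetic data, the value M = 40, and the helper names `split_multiscale` and `flat_index` are assumptions introduced for this example.

```python
import numpy as np

# T and L follow the electricity load example; M = 40 is an arbitrary illustrative choice.
T, L, M = 365, 96, 40

rng = np.random.default_rng(0)
# Synthetic stand-in for the flat series in (2); this is not real load data.
flat = rng.normal(size=T * L + M)


def split_multiscale(series, curve_len):
    """Split a flat series into complete curves of shape (T, L) and a partial curve of shape (M,)."""
    n_complete = len(series) // curve_len
    curves = series[: n_complete * curve_len].reshape(n_complete, curve_len)
    partial = series[n_complete * curve_len:]
    return curves, partial


def flat_index(t, i, curve_len):
    """Return the 0-based flat position of y_{t,i}, where t and i are 1-based."""
    if t < 1 or not (1 <= i <= curve_len):
        raise ValueError("require t >= 1 and 1 <= i <= L")
    return (t - 1) * curve_len + (i - 1)


curves, partial = split_multiscale(flat, L)
# Row t-1 of `curves` is the daily curve y_t; `partial` holds y_{T+1,1}, ..., y_{T+1,M}.
```

Under this layout, a forecasting target y_{t*,i*} with t* ≥ T + 1 corresponds to a flat position at or beyond `T * L`; positions from `T * L` up to `T * L + M - 1` are already covered by the partial observations.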
At the fine level, the observations $\{y_{t,i}\}_{i=1}^{L}$ can be regarded as a realization of a stochastic process whose properties are determined by the cluster label $z_t$. The mixture of Gaussian processes functional regression (mix-GPFR) model (Shi & Wang, 2008; Shi & Choi, 2011) is powerful for analyzing functional data or batch data, and it is applicable to the multi-scale time-series forecasting task. Mix-GPFR assumes there are $K$ Gaussian processes

