SEQUENTIAL LATENT VARIABLE MODELS FOR FEW-SHOT HIGH-DIMENSIONAL TIME-SERIES FORECASTING

Abstract

Modern applications increasingly require learning and forecasting latent dynamics from high-dimensional time-series. Compared to univariate time-series forecasting, this adds the new challenge of reasoning about the latent dynamics of an unobserved abstract state. Sequential latent variable models (SLVMs) present an attractive solution, although existing works either struggle with long-term forecasting or have difficulty learning across diverse dynamics. In this paper, we first present a conceptual framework of SLVMs to unify existing works, contrast their fundamental limitations, and identify an intuitive solution to long-term forecasting for diverse dynamics via meta-learning. We then present a few-shot forecasting framework for high-dimensional time-series: instead of learning a single dynamic function, we leverage data of diverse dynamics and learn to adapt latent dynamic functions to few-shot support series. This is realized via Bayesian meta-learning underpinned by: 1) a latent dynamic function conditioned on knowledge derived from few-shot support series, and 2) a meta-model that learns to extract such dynamic-specific knowledge via feed-forward embedding of the support set. We compared the presented framework with a comprehensive set of baseline models 1) trained globally on a large meta-training set with diverse dynamics, 2) trained individually on single dynamics, with and without fine-tuning to k-shot support series, and 3) extended to few-shot meta-formulations. We demonstrated that the presented framework is agnostic to the latent dynamic function of choice and, at meta-test time, is able to forecast new dynamics given a variable number of support series.

1. INTRODUCTION

In many applications, an ultimate goal is to forecast the future states or trajectories of a dynamic system from its high-dimensional observations, such as series of images. Compared to the relatively well-studied univariate time-series forecasting (Makridakis et al., 2018; Oreshkin et al., 2020; Salinas et al., 2020), high-dimensional time-series forecasting raises new challenges: it requires extracting the dynamics of an abstract latent state that is not directly observed (Botev et al., 2021). Sequential latent variable models (SLVMs) provide an attractive solution that, unlike autoregressive models, abstracts a latent dynamic function z_i = f(z_<i; θ_z) with state z_i and parameter θ_z, along with z_i's emission to observations x_i = g(z_i) (Chung et al., 2015). This pair of learned models can support long-term forecasting given only initial frames of observations, as well as controlled generation of new dynamics. Critical bottlenecks, however, remain in reaching these goals. The earlier formulation of SLVMs relies on a natural extension of static LVMs: as illustrated in Fig. 1A, the latent state z_i is modeled as the latent variable for the generation of x_i, and a sequential encoder is used to facilitate the inference of z_i from current and past observations x_≤i (Chung et al., 2015; Krishnan et al., 2017). Recent works have argued to instead model and infer the parameters of the latent dynamic function, often modeled as time-varying linear coefficients θ_z,i (Karl et al., 2017; Fraccaro et al., 2017; Rangapuram et al., 2018; Klushyn et al., 2021). This results in an LVM formulation as illustrated in Fig. 1B1, where the latent variable θ_z,i is inferred at each step i from observations x_≤i.
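As a purely illustrative sketch of the two-function structure shared by these SLVMs (not the paper's architecture: simple linear-plus-tanh maps stand in for learned neural networks, and all names, shapes, and values below are hypothetical), a latent dynamic function f paired with an emission g can roll out a forecast from an initial latent state alone:

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, obs_dim, horizon = 4, 16, 20

# Hypothetical parameters: theta_z for the latent dynamic function f,
# and emission weights for g. In practice both are learned networks.
W_z = rng.normal(scale=0.5, size=(latent_dim, latent_dim))  # theta_z
W_x = rng.normal(scale=0.5, size=(obs_dim, latent_dim))     # emission weights

def f(z_prev):
    """Latent dynamic function z_i = f(z_<i; theta_z) (first-order here)."""
    return np.tanh(W_z @ z_prev)

def g(z):
    """Emission x_i = g(z_i) from latent state to observation."""
    return W_x @ z

# Long-term forecasting from an initial state alone, with no
# near-term observations: z_0 would be inferred from initial frames.
z = rng.normal(size=latent_dim)
forecast = []
for _ in range(horizon):
    z = f(z)                  # advance the latent state
    forecast.append(g(z))     # emit the high-dimensional observation
forecast = np.stack(forecast)  # shape (horizon, obs_dim)
```

The formulations in Fig. 1A and 1B1 differ in what is inferred per step (z_i or θ_z,i), but both require observations x_≤i at inference time, which is what limits their long-term forecasting.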
While strong at time-series reconstruction and classification, these SLVMs share a fundamental limitation that makes them less suited for long-term forecasting: the latent dynamic function has limited ability to forecast without near-term observations to support the inference of z_i or θ_z,i. This limitation in the mainstream SLVMs raises a natural question: can we relax the assumption of a linear dynamic function and directly infer its θ_z? Works adopting this idea have emerged: as illustrated in Fig. 1B2, by modeling a single θ_z, either deterministic (Rubanova et al., 2019; Botev et al., 2021) or stochastic (Yildiz et al., 2020), f(z_<i; θ_z) can be asked to predict a time sequence using only an inferred initial state. This formulation has shown strong long-term forecasting, although with its own fundamental limitation: it learns a single dynamic function global to all training sequences. This not only requires all training data to share identical latent dynamics, but also makes it difficult to forecast test sequences whose dynamics differ from or are unknown to the training data. In this paper, we answer this important open question of long-term forecasting for diverse dynamics. We first present a conceptual framework of SLVMs to unify existing works, and identify an intuitive solution to the underlying critical gap via meta-learning: instead of learning a single dynamic function, we learn to pool knowledge across datasets of different dynamics and to adapt a dynamic function to few-shot high-dimensional time-series. We then present a Bayesian meta-learning framework as illustrated in Fig. 1C: instead of being a single fixed function as in Fig. 1B2, the latent dynamic function is conditioned on knowledge derived from few-shot support time-series via a feed-forward set-embedding meta-model; given k-shot time-series of a specific dynamics, the model is asked to forecast query time-series using only their initial frames, meta-learned across dynamics. We develop this framework to be agnostic to the latent dynamic function of choice, with the flexibility to forecast with a variable size of k. We evaluated the presented framework on benchmark image sequences with mixed physics, including bouncing balls (Fraccaro et al., 2017), pendulum (Botev et al., 2021), and mass-spring (Botev et al., 2021). We further applied it to forecasting the complex physics of turbulent flow (Wang et al., 2021) and electrical dynamics over 3D geometrical meshes of the heart. We compared the presented work with SLVMs representative of each of the formulations in Fig. 1A-B, along with a recent autoregressive model designed to forecast diverse dynamics (Donà et al., 2020). Each baseline model was trained on 1) the large meta-training set with diverse dynamics, and 2) each dynamics individually, both with and without fine-tuning to k-shot support data. Representative SLVMs were further tested in their feed-forward or optimization-based meta-extensions. Results demonstrated clear margins of improvement by the presented work in forecasting diverse dynamics, with the added ability to recognize clusters of distinct dynamics and to allow controlled time-series generation given only initial conditions.
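The conditioning mechanism described above, a feed-forward set embedding of the k-shot support series that modulates the latent dynamic function, can be sketched minimally as follows. This is an illustrative toy, not the paper's model: linear maps replace learned networks, and all weight names and dimensions are hypothetical. The key properties it illustrates are permutation invariance over the support set and the ability to handle a variable k:

```python
import numpy as np

rng = np.random.default_rng(1)
latent_dim, obs_dim, ctx_dim = 4, 16, 8

# Hypothetical weights; in practice these are learned meta-model networks.
W_embed = rng.normal(scale=0.3, size=(ctx_dim, obs_dim))               # per-frame embedding
W_z = rng.normal(scale=0.3, size=(latent_dim, latent_dim + ctx_dim))   # conditioned dynamics
W_x = rng.normal(scale=0.3, size=(obs_dim, latent_dim))                # emission

def embed_support(support_set):
    """Feed-forward set embedding: encode each support series, then
    mean-pool over frames and over the k shots (permutation-invariant,
    so any number of shots k is accepted)."""
    per_series = [np.tanh(W_embed @ series.T).mean(axis=1) for series in support_set]
    return np.mean(per_series, axis=0)  # dynamic-specific context c

def f_cond(z_prev, c):
    """Latent dynamic function conditioned on the context c."""
    return np.tanh(W_z @ np.concatenate([z_prev, c]))

# k-shot support series of one dynamics: k arrays of shape (T, obs_dim).
support = [rng.normal(size=(10, obs_dim)) for _ in range(3)]  # k = 3
c = embed_support(support)

# Forecast a query series of the same dynamics from an initial state alone.
z = rng.normal(size=latent_dim)
query_forecast = []
for _ in range(15):
    z = f_cond(z, c)
    query_forecast.append(W_x @ z)
query_forecast = np.stack(query_forecast)  # (15, obs_dim)
```

Because the context is produced by a single feed-forward pass rather than per-dynamics optimization, adapting to a new dynamics at meta-test time requires no gradient steps, in contrast to optimization-based meta-learning baselines.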

2. RELATED WORKS & BACKGROUND

Sequential LVMs: Among the first SLVMs is the variational recurrent neural network (VRNN) (Chung et al., 2015), followed by a series of deep state-space models (SSMs) (Krishnan et al., 2017;

Figure 1: Sequential latent-variable models for forecasting high-dimensional sequences.

