CLINICALLY RELEVANT UNSUPERVISED ONLINE REPRESENTATION LEARNING OF ICU WAVEFORMS

Abstract

Univariate high-frequency time series with real-time state changes are prominent in medical, economic, and environmental applications. In the intensive care unit, for example, changes in intracranial pressure waveforms can indicate whether a patient is developing decreased blood perfusion to the brain during a stroke. However, most representation learning to resolve states is conducted in an offline, batch-dependent manner, and in high-frequency time series, high intra-state and inter-sample variability makes offline, batch-dependent learning a relatively difficult task. Hence, we propose Spatially Resolved Temporal Networks (SpaRTeN), a novel composite deep learning model for online, unsupervised representation learning through a spatially constrained latent space. SpaRTeN maps waveforms to states and learns time-dependent representations of each state. Our key contribution is that we generate clinically relevant representations of each state for intracranial pressure waveforms.

1. INTRODUCTION

Univariate high-frequency time series arise in several domains, including economics, medicine, and environmental studies Thorsen-Meyer et al. (2020); Shakeel & Srivastava (2021); Yao et al. (2019). High-frequency time series often exhibit distinct states. For example, in a patient suffering from a stroke, an intracranial pressure (ICP) waveform normally fluctuates somewhat; however, a transition to a state where it is persistently elevated may lead to blindness and other neurological problems Mollan et al. (2016). Early detection of this state transition may enable physicians to intervene appropriately for better outcomes Llwyd et al. (2022). For high-dimensional datasets, unsupervised methods like t-SNE, UMAP, and SOMs can be used to project samples into lower dimensions that preserve spatial relationships Van der Maaten & Hinton (2008); McInnes et al. (2018). In time series, however, dimensionality is proportional to series length. As a result, state determination requires first encoding each time series into a fixed-length vector, followed by a clustering algorithm such as k-means. These methods can capture long-range dependencies but rely on non-differentiable function fitting. They are also typically offline, in that they learn from an entire training dataset at once before being evaluated and deployed. This can be problematic, especially under dataset shift or high inter-sample variability: every time a new batch of data is received, the entire model needs to be retrained. High-frequency time series data like waveforms are often encountered in scenarios more suitable for online learning, wherein a learner attempts to tackle some predictive task by learning from a sequence of data in the order it is received Hoi et al. (2021).
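To make the offline pipeline described above concrete, the following sketch encodes fixed-length windows into feature vectors and clusters them with a bare-bones k-means. The window features (mean, standard deviation, range) and the k-means implementation are illustrative choices for this sketch, not those of any cited method.

```python
import numpy as np

def encode_windows(series, k):
    """Slice a 1-D series into non-overlapping windows of length k and
    summarize each window by simple fixed-length features (mean, std, range)."""
    n = (len(series) // k) * k
    wins = series[:n].reshape(-1, k)
    return np.stack([wins.mean(1), wins.std(1), np.ptp(wins, axis=1)], axis=1)

def kmeans(X, n_clusters, n_iter=50, seed=0):
    """Minimal k-means: assign points to the nearest center, recompute centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for c in range(n_clusters):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(0)
    return labels, centers
```

Because the whole dataset is encoded and clustered in one pass, any new batch of data requires re-running both steps from scratch, which is precisely the offline limitation discussed above.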
We propose Spatially Resolved Temporal Networks (SpaRTeN), a composite, differentiable, unsupervised deep learning network that learns a discrete spatial representation from a high-frequency time series via temporal ensemble learning. We show that our method outperforms temporal fusion transformers (TFTs) with the same number of parameters on benchmarks of online learning tasks. We also note that TFTs can be included within our method due to the flexibility of the composite model.

1.1. CONTRIBUTIONS

• We introduce SpaRTeN, a new framework for online learning of spatial representations from high-frequency time series.
• We demonstrate that introducing a latent space improves, rather than harms, SpaRTeN's ability to forecast and cluster high-frequency data in real time, compared to state-of-the-art models.
• We show that SpaRTeN can generate clinically meaningful representations of medical intracranial pressure waveforms.

2.1. PROBLEM FORMULATION

For the purposes of time series forecasting, we examine the problem of simultaneously learning:

1. a function $S : x_t \mapsto s_t$, which maps a time series $x_t = \{x_{t-k}, x_{t-k+1}, \ldots, x_t\}$ of length $k$ to a discrete state $s_t$, and
2. a set of functions $R_{s_t} : x_t \mapsto \hat{y}_t$ which, for each state $s_t$, map the input time series of length $k$ to a forecast $\hat{y}_t = \{x_{t+1}, x_{t+2}, \ldots, x_{t+w}\}$ of length equal to the prediction window $w$.

$S$ and $R$ are optimized to maximize the probability of assigning the time series $x_t$ to the most suitable function $R_{(\cdot)}$, as determined by an objective function $\mathcal{L}$:

$$\max_{S} \min_{R_{(\cdot)}} \mathbb{E}\left[\mathcal{L}\left(R_{S(x_t)}(x_t)\right)\right]$$

where the expectation $\mathbb{E}[\cdot]$ is taken over the set of all time series. This corresponds to the min-max framework described for GANs in Goodfellow et al. (2014).
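Under this formulation, a single evaluation step selects the sub-network indexed by the argmax of S's output and scores its forecast. The sketch below makes that routing explicit, assuming mean squared error as the objective L; MSE is an illustrative choice for this sketch, not stated in the formulation.

```python
import numpy as np

def select_and_score(density, forecasts, target):
    """Pick the state (i, j) with the highest density from S, then evaluate the
    corresponding sub-network's forecast against the observed continuation.

    density   : (a, b) array, S's output over the discrete state grid
    forecasts : (a, b, w) array, one length-w forecast per R sub-network
    target    : (w,) array, the observed next w samples of the series
    """
    i, j = np.unravel_index(np.argmax(density), density.shape)
    y_hat = forecasts[i, j]
    loss = float(np.mean((y_hat - target) ** 2))  # assumed: L is MSE
    return (i, j), y_hat, loss
```

In a full training loop, this loss would back-propagate through both the chosen sub-network and the state-assignment function, realizing the min-max objective above.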

2.2. MODEL ARCHITECTURE AND THE FORWARD PASS

We implement the above with a composite model architecture depicted in Figure 1, where S and R are depicted as analogously named blocks. The height a and width b, common to the two blocks, represent the two-dimensional discrete state space, which can also be considered the latent space of this model. The S block implements convolutional filters (Appendix A.3) to map an input time series x_t to a density over the discrete two-dimensional space of states (green arrow 1). The R block consists of a spatially arranged ensemble of LSTM sub-networks, each of which makes a forecast for the input x_t with a prediction window of w (blue arrow 1). For each input x_t, the sub-network in R corresponding to the greatest density output by S (green arrow 2) is used to generate the prediction ŷ_t (blue arrows 2, 3). We choose to place s_t in a two-dimensional discrete state space (i, j) because it facilitates easy visualization of the time series corresponding to individual states, which previous methods such as SOM-VAE and TFT currently cannot do. We can parameterize the number of states by adjusting the width and height of the latent space, a and b. The state space width and height are hyper-parameters that should be adjusted depending on a priori assumptions about dataset complexity.
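The forward pass just described can be sketched at the shape level as follows. This is an illustrative simplification: S's convolutional filters are replaced by a random linear map followed by a softmax over the a × b state grid, and each LSTM sub-network in R is replaced by a random linear forecaster. Only the routing logic (the green and blue arrows of Figure 1) is faithful to the text.

```python
import numpy as np

class SpaRTeNForwardSketch:
    """Shape-level sketch of the composite S/R forward pass."""

    def __init__(self, k, w, a, b, seed=0):
        rng = np.random.default_rng(seed)
        self.a, self.b = a, b
        # Stand-in for S's convolutional filters: one logit per grid cell.
        self.W_s = rng.normal(size=(a * b, k)) / np.sqrt(k)
        # Stand-in for the spatially arranged ensemble of LSTM forecasters.
        self.W_r = rng.normal(size=(a, b, w, k)) / np.sqrt(k)

    def forward(self, x):
        # Green arrow 1: map x_t to a density over the a x b state grid.
        logits = self.W_s @ x
        z = np.exp(logits - logits.max())
        density = (z / z.sum()).reshape(self.a, self.b)
        # Green arrow 2: select the state with the greatest density.
        i, j = np.unravel_index(np.argmax(density), density.shape)
        # Blue arrows 2, 3: the chosen sub-network generates the forecast.
        y_hat = self.W_r[i, j] @ x
        return (i, j), density, y_hat
```

Increasing a and b enlarges the discrete state space, trading finer state resolution for more sub-networks to train, which is the hyper-parameter trade-off noted above.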



Many algorithms, including shapelets, hierarchical latent factor models, hidden Markov model-like methods, change point and anomaly detection techniques, and N-BEATS, are dedicated to disentangling time series into their respective subcomponents Li et al. (2021); Grabocka et al. (2015); Oreshkin et al. (2019a); Blazquez-Garcia & Conde (2022); Aminikhanghahi & Cook (2017); Van Den Oord & Vinyals (2017), but few are dedicated to disentangling states within a single time series Franceschi et al. (2019) or to predicting future state transitions.

Extraction of states or state transitions from a high-frequency time series requires online unsupervised representation learning, a relatively understudied field. Fuzzy neural networks create a set of modifiable rules Luo et al. (2019), but successive rule changes make state inference relatively volatile and inconclusive. Another state-of-the-art time series forecasting method is the temporal fusion transformer (TFT), which can provide interpretable risk prediction via attention mechanisms Lim et al. (2021); Kamal et al. This method combines feature attention with sequence attention to generate interpretable forecasts and has shown great promise in time series forecasting.

