HIDDEN MARKOV MODELS ARE RECURRENT NEURAL NETWORKS: A DISEASE PROGRESSION MODELING APPLICATION

Abstract

Hidden Markov models (HMMs) are commonly used for disease progression modeling when the true patient health state is not fully known. Since HMMs may have multiple local optima, performance can be improved by incorporating additional patient covariates to inform estimation. To allow for this, we formulate a special case of recurrent neural networks (RNNs), which we name hidden Markov recurrent neural networks (HMRNNs), and prove that each HMRNN has the same likelihood function as a corresponding discrete-observation HMM. The HMRNN can be combined with any other predictive neural networks that take patient covariate information as input. We show that HMRNN parameter estimates are numerically close to those obtained via the Baum-Welch algorithm, validating their theoretical equivalence. We then demonstrate how the HMRNN can be combined with other neural networks to improve parameter estimation, using an Alzheimer's disease dataset. The HMRNN's solution improves disease forecasting performance and offers a novel clinical interpretation compared with a standard HMM.

1. INTRODUCTION

Hidden Markov models (HMMs; Baum & Petrie, 1966) are commonly used for modeling disease progression because they allow researchers to conceptualize complex (and noisy) clinical measurements as originating from a smaller set of latent health states. Each latent health state is characterized by an emission distribution that specifies the probability of each measurement/observation given that state. This allows HMMs to explicitly account for uncertainty or measurement error, since the system's true state is not fully observable. Because of their intuitive parameter interpretations and flexibility, HMMs have been used to model biomarker changes in HIV patients (Guihenneuc-Jouyaux et al., 2000), Alzheimer's disease progression (Liu et al., 2015), breast cancer screening decisions (Ayer et al., 2012), and patient response to blood anticoagulants (Nemati et al., 2016). Researchers may wish to integrate HMMs with other disease progression models and/or data sources. For instance, Igl et al. (2018) jointly trained parameters for an HMM and a reinforcement learning policy to maximize patient returns. Other researchers have attempted to learn or initialize HMM parameters based on additional sources of patient data (Gupta, 2019; Zhou et al., 2019). Such modifications typically require multiple estimation steps (e.g., Zhou et al., 2019) or changes to parameter interpretation (e.g., Igl et al., 2018). This is because the standard algorithm for fitting HMMs, the Baum-Welch algorithm (Baum & Petrie, 1966), maximizes the likelihood of a data sequence without consideration of additional covariates. We introduce hidden Markov recurrent neural networks (HMRNNs) -- neural networks that mimic the computation of hidden Markov models while allowing for substantial modularity with other predictive networks.
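To make the latent-state setup concrete, the following is a minimal sketch of a disease-progression HMM. The three health states, the two test outcomes, and all probability values are hypothetical illustrations, not parameters from this paper; the point is only that the clinician observes noisy emissions while the latent state evolves by the transition matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state disease-progression HMM: the latent health state is
# never observed directly; each state emits a noisy clinical measurement.
states = ["mild", "moderate", "severe"]
obs_labels = ["normal test", "abnormal test"]

A = np.array([[0.85, 0.10, 0.05],    # transition matrix: P(next state | current state)
              [0.00, 0.80, 0.20],    # disease only worsens in this toy model;
              [0.00, 0.00, 1.00]])   # "severe" is an absorbing state
B = np.array([[0.90, 0.10],          # emission distribution per state:
              [0.40, 0.60],          # P(measurement | latent state)
              [0.05, 0.95]])
pi = np.array([1.0, 0.0, 0.0])       # all patients start in "mild"

def sample_trajectory(T):
    """Sample a latent state path and the noisy observations a clinician sees."""
    s = rng.choice(3, p=pi)
    latent, observed = [s], [rng.choice(2, p=B[s])]
    for _ in range(T - 1):
        s = rng.choice(3, p=A[s])
        latent.append(s)
        observed.append(rng.choice(2, p=B[s]))
    return latent, observed

latent, observed = sample_trajectory(10)
```

Because the emission distributions overlap (a "mild" patient can produce an abnormal test and vice versa), the observed sequence alone does not reveal the latent path, which is exactly the uncertainty the HMM machinery accounts for.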
Unlike past work combining neural networks and HMMs (e.g., Bridle, 1990), HMRNNs are designed to maximize the most commonly used HMM fit criterion -- the likelihood of the data given the parameters. In doing so, our primary contributions are as follows: (1) We prove that recurrent neural networks (RNNs) can be formulated to optimize the same likelihood function as HMMs, with parameters that can be interpreted as HMM parameters (section 3); (2) We empirically demonstrate that our model yields statistically similar parameter solutions compared with the Baum-Welch
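The likelihood criterion referenced above is computed by the classical forward algorithm, whose recursion already has an RNN-like shape: each step is a matrix-vector product followed by an elementwise rescaling by the emission probabilities. The sketch below (with hypothetical two-state parameters, not taken from this paper) shows that recurrence and checks it against brute-force summation over all latent state paths.

```python
import numpy as np
from itertools import product

# Hypothetical 2-state, 2-observation HMM parameters (for illustration only).
pi = np.array([0.6, 0.4])            # initial state distribution
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])           # A[i, j] = P(s_t = j | s_{t-1} = i)
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])           # B[i, k] = P(o_t = k | s_t = i)

def forward_likelihood(obs):
    """Likelihood P(o_1, ..., o_T) via the forward recursion.

    Each update is a linear map of the previous vector followed by an
    elementwise (diagonal) rescaling -- the same recurrence structure as
    an RNN hidden-state update, which is the correspondence the HMRNN
    formulation exploits.
    """
    alpha = pi * B[:, obs[0]]            # alpha_1(i) = pi_i * B[i, o_1]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # alpha_t = (alpha_{t-1} A) diag(B[:, o_t])
    return alpha.sum()

def brute_force_likelihood(obs):
    """Sum P(path, obs) over every latent state path (feasible for small T)."""
    T, total = len(obs), 0.0
    for path in product(range(2), repeat=T):
        p = pi[path[0]] * B[path[0], obs[0]]
        for t in range(1, T):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return total

obs = [0, 1, 1, 0]
print(np.isclose(forward_likelihood(obs), brute_force_likelihood(obs)))  # True
```

Since the forward recursion is differentiable in (pi, A, B), the same likelihood can be maximized by gradient descent once the recursion is expressed as a network, rather than by Baum-Welch's expectation-maximization updates.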

