CONTINUOUS-TIME IDENTIFICATION OF DYNAMIC STATE-SPACE MODELS BY DEEP SUBSPACE ENCODING

Abstract

Continuous-time (CT) modeling has proven to provide improved sample efficiency and interpretability in learning the dynamical behavior of physical systems compared to discrete-time (DT) models. However, even with numerous recent developments, the CT nonlinear state-space (NL-SS) model identification problem remains to be solved in full, considering common experimental aspects such as the presence of external inputs, measurement noise, latent states, and general robustness. This paper presents a novel estimation method that addresses all these aspects and obtains state-of-the-art results on multiple benchmarks with compact fully connected neural networks capturing the CT dynamics. The proposed estimation method, called the subspace encoder approach (SUBNET), achieves these results by efficiently approximating the complete simulation loss: it evaluates short simulations on subsections of the data, uses an encoder function to estimate the initial state of each subsection, and applies a novel state-derivative normalization to ensure stability and good numerical conditioning of the training process. We prove that the use of subsections increases cost function smoothness, derive the necessary requirements for the existence of the encoder function, and show that the proposed state-derivative normalization is essential for reliable estimation of CT NL-SS models.

1. INTRODUCTION

Dynamical systems described by nonlinear state-space models with a state vector x(t) ∈ R^nx are powerful tools in many modern sciences and engineering disciplines for understanding potentially complex dynamical behavior. One can distinguish between Discrete-Time (DT) models, x_{k+1} = f(x_k, u_k), and Continuous-Time (CT) models, dx(t)/dt = f(x(t), u(t)). In general, obtaining DT dynamical models from data is easier than obtaining CT models, since data in computers are represented as discrete elements (e.g., arrays). However, the additional implementation complexity and computational cost associated with identifying CT models can be justified in many cases. First and foremost, from the natural sciences we know that many systems are compactly described by CT dynamics, which makes the continuity prior of CT models a well-motivated regularization/prior (De Brouwer et al., 2019). It has been observed that this regularization can be beneficial for sample efficiency (De Brouwer et al., 2019), a common observation when "including physics" in learning approaches (Karniadakis et al., 2021). Furthermore, the analysis of ODEs is a well-regarded field of study with many powerful results and methods that can further improve model interpretability (Fan et al., 2021), as applied in Bai et al. (2019). Another inherent advantage is that CT models can be used with irregularly sampled or missing data (Rudy et al., 2019). Lastly, in the control community, CT models are generally regarded as desirable for control synthesis tasks, as shaping the behavior of the controller is much more intuitive in CT (Garcia et al., 1989). Hence, developing robust and general CT models and estimation methods would be greatly beneficial. In the identification of physical CT systems, it is common to encounter challenges such as external inputs (u(t)), noisy measurements, latent states, and an unknown measurement function/distribution (e.g.
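The DT/CT distinction above can be made concrete with a small simulation sketch: a DT model is a plain recursion, while a CT model must be advanced between samples by a numerical integrator. The dynamics function f below is purely hypothetical and serves only as an illustration; it is not a system from the paper.

```python
import numpy as np

def f(x, u):
    # hypothetical CT dynamics dx/dt = f(x, u), for illustration only
    return np.array([x[1], -np.sin(x[0]) - 0.5 * x[1] + u])

def rk4_step(f, x, u, dt):
    # one 4th-order Runge-Kutta step, zero-order-hold input between samples
    k1 = f(x, u)
    k2 = f(x + dt / 2 * k1, u)
    k3 = f(x + dt / 2 * k2, u)
    k4 = f(x + dt * k3, u)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate_ct(f, x0, u_seq, dt):
    # CT model: a numerical integrator sits inside the simulation loop
    xs = [x0]
    for u in u_seq:
        xs.append(rk4_step(f, xs[-1], u, dt))
    return np.stack(xs)

def simulate_dt(f_d, x0, u_seq):
    # DT model: a plain recursion x_{k+1} = f_d(x_k, u_k)
    xs = [x0]
    for u in u_seq:
        xs.append(f_d(xs[-1], u))
    return np.stack(xs)

x0 = np.zeros(2)
u_seq = np.ones(50)
x_ct = simulate_ct(f, x0, u_seq, dt=0.05)
x_dt = simulate_dt(lambda x, u: x + 0.05 * f(x, u), x0, u_seq)
```

The extra solver machinery in `simulate_ct` is the implementation and computational overhead of CT identification referred to in the text.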
y(t) = h(x(t))), the need for accurate long-term predictions, and the need for a sufficiently low computational cost. For instance, all these aspects need to be considered for the cascade tank benchmark problem (Schoukens & Noël, 2017).
In contrast to previous work, we present a CT encoder-based method that is a general, robust, and well-performing estimation method for CT state-space model identification. That is, the formulation addresses noise assumptions, external inputs, latent states, and an unknown output function, and it provides state-of-the-art results on multiple benchmarks of real systems. The presented subspace encoder method is summarized in Figure 2. The proposed method evaluates the cost function only on short subsections of the available dataset, which reduces the computational complexity. Furthermore, we show theoretically that considering subsections enhances cost function smoothness and thus optimization stability. The initial states of these subsections are estimated using the encoder function, for which we present necessary requirements for its existence. Lastly, we introduce a normalization of the state and state derivative and show that it is required for proper CT estimation. Moreover, these results are obtained without needing to impose a specific structure on the state-space (such as in Greydanus et al. (2019); Cranmer et al. (2020)), yielding a practically widely applicable method. Our main contributions are the following:
• We formally derive the problem of CT state-space model estimation with latent states, external inputs, and measurement noise.
• We reduce the computational load by proposing a subspace encoder-based identification algorithm that employs short subsections, an encoder function that estimates the initial latent states of these subsections, and a state-derivative normalization term for robustness.
• We make multiple theoretical contributions: (i) we prove that the use of short subsections increases cost function smoothness via a Lipschitz continuity analysis, (ii) we derive necessary conditions for the encoder function to exist, and (iii) we show that a state-derivative normalization term is required for proper CT model estimation.
• We demonstrate that the proposed estimation method obtains state-of-the-art results on multiple benchmarks.
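The ingredients listed above (short subsections, an encoder for the initial state, and state-derivative normalization) can be sketched as a single loss computation. This is a rough illustration under strong simplifying assumptions: the encoder, state network, and output map below are linear stand-ins (in the paper they are fully connected neural networks), and all names (`W_enc`, `f_theta`, `tau`, window lengths `na`/`nb`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical dimensions: state size, past-window lengths, subsection length
nx, na, nb, T, dt, tau = 2, 3, 3, 10, 0.1, 1.0

# linear stand-ins for the learned components of the paper
W_enc = rng.normal(size=(nx, na + nb)) * 0.1  # encoder: past (y, u) -> x0
A = rng.normal(size=(nx, nx)) * 0.1           # state network (stand-in)
B = rng.normal(size=nx) * 0.1
C = rng.normal(size=(1, nx))                  # output map y = h(x)

def encoder(y_past, u_past):
    # subspace encoder: estimate the initial state of one subsection
    return W_enc @ np.concatenate([y_past, u_past])

def f_theta(x, u):
    # normalized state derivative: dx/dt = tau * f(x, u)
    return tau * (A @ x + B * u)

def subsection_loss(y, u, t0):
    # short simulation of length T starting at sample t0
    x = encoder(y[t0 - na:t0], u[t0 - nb:t0])
    loss = 0.0
    for k in range(T):
        y_hat = (C @ x)[0]
        loss += (y_hat - y[t0 + k]) ** 2
        x = x + dt * f_theta(x, u[t0 + k])  # forward-Euler integration step
    return loss / T

# approximate the full simulation loss by averaging over random subsections
y = rng.normal(size=200)
u = rng.normal(size=200)
starts = rng.integers(na, len(y) - T, size=8)
loss = np.mean([subsection_loss(y, u, t0) for t0 in starts])
```

In an actual implementation the loss would be minimized over the parameters of the encoder, state, and output networks by gradient descent; the sketch only shows how one evaluation of the subsection-based cost is assembled.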

2. RELATED WORK

One of the most influential papers in CT model estimation is the introduction of neural ODEs (Chen et al., 2018), which showed that a residual network can be interpreted as an Euler discretization of a continuous-in-depth neural network. Moreover, the authors also showed that one can employ numerical integrators
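The residual-network/Euler correspondence mentioned above can be illustrated with a minimal scalar example; the vector field f is hypothetical and chosen so the exact flow is known in closed form.

```python
import math

def f(x):
    # hypothetical scalar vector field; exact solution is x0 * exp(-t)
    return -x

def resnet_flow(x0, h, depth):
    # each residual block x <- x + h * f(x) is one forward-Euler step,
    # so "depth" blocks approximate the ODE flow up to time t = h * depth
    x = x0
    for _ in range(depth):
        x = x + h * f(x)
    return x

coarse = resnet_flow(1.0, h=0.1, depth=10)   # 10 blocks, t = 1
fine = resnet_flow(1.0, h=0.05, depth=20)    # 20 blocks, same t = 1
exact = math.exp(-1.0)
# halving the step while doubling the depth moves the output toward the
# exact flow, which is the continuous-in-depth limit of the network
```
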



In this work, we consider the problem of estimating continuous-time (CT) state-space models from noisy observations (additive noise), with long-term prediction capabilities, hidden states, and external signals, in a computationally efficient and robust manner, as required, e.g., for the cascade tank benchmark problem (Schoukens & Noël, 2017). These aspects and the considered CT state-space model are summarized in Figure 1. Many of these aspects have been studied independently: Brajard et al. (2020) and Rudy et al. (2019) explicitly addressed the presence of noise on the measurement data, Maulik et al. (2020) and Chen et al. (2018) provided methods for modeling dynamics with latent states, Zhong et al. (2020) considers the presence of known external inputs, and Zhou et al. (2021a) provides a computationally tractable method for accurate long-term sequence modeling. However, formulating models and estimation methods for the combination of multiple or all of these aspects is comparatively underdeveloped, with only a few attempts, such as Forgione & Piga (2021a).
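The problem setting described above (a CT state equation, a latent state, a known external input, and an additively noisy output) can be sketched as a data-generating process. The two-state system and all constants below are hypothetical, chosen only to show which signals are observed and which are latent.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x, u):
    # hypothetical 2-state CT dynamics dx/dt = f(x, u)
    return np.array([x[1], -x[0] - 0.3 * x[1] + u])

def h(x):
    # output map: only a scalar function of the state is measured
    return x[0]

dt, N = 0.05, 400
u = np.sin(0.2 * np.arange(N))          # known external input
x = np.zeros(2)                         # latent state, never observed
y = np.empty(N)
for k in range(N):
    y[k] = h(x) + 0.01 * rng.normal()   # additive measurement noise
    x = x + dt * f(x, u[k])             # forward-Euler (illustration only)

# the identification data set consists only of the pair (u, y)
```
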

