CONTINUOUS-TIME IDENTIFICATION OF DYNAMIC STATE-SPACE MODELS BY DEEP SUBSPACE ENCODING

Abstract

Continuous-time (CT) modeling has proven to provide improved sample efficiency and interpretability in learning the dynamical behavior of physical systems compared to discrete-time (DT) models. However, even with numerous recent developments, the CT nonlinear state-space (NL-SS) model identification problem remains to be solved in full, considering common experimental aspects such as the presence of external inputs, measurement noise, latent states, and general robustness. This paper presents a novel estimation method that addresses all these aspects and that can obtain state-of-the-art results on multiple benchmarks with compact fully connected neural networks capturing the CT dynamics. The proposed estimation method, called the subspace encoder approach (SUBNET), achieves these results by efficiently approximating the complete simulation loss: it evaluates short simulations on subsections of the data, uses an encoder function to estimate the initial state for each subsection, and applies a novel state-derivative normalization to ensure stability and good numerical conditioning of the training process. We prove that the use of subsections increases cost function smoothness, we give the necessary requirements for the existence of the encoder function, and we show that the proposed state-derivative normalization is essential for reliable estimation of CT NL-SS models.
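
To make the idea of approximating the simulation loss on subsections concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes a scalar state, a zero-order-hold input, and a fixed-step Runge-Kutta integrator. The names `rk4_step`, `subsection_loss`, `encoder`, `n_past`, and `tau` are hypothetical and introduced only for illustration; in particular, dividing the derivative by `tau` is one assumed way to picture a normalization of the state derivative.

```python
import math


def rk4_step(f, x, u, dt):
    """One classical Runge-Kutta step for dx/dt = f(x, u), input held constant."""
    k1 = f(x, u)
    k2 = f(x + 0.5 * dt * k1, u)
    k3 = f(x + 0.5 * dt * k2, u)
    k4 = f(x + dt * k3, u)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def subsection_loss(f, h, encoder, u, y, dt, T, n_past, tau=1.0):
    """Mean squared output error over all length-T subsections of the data.

    encoder: maps the n_past most recent inputs and outputs to an initial state.
    tau:     hypothetical scaling of the state derivative, dx/dt = f(x, u) / tau.
    """
    def f_scaled(x, u_k):
        return f(x, u_k) / tau

    total, count = 0.0, 0
    for t in range(n_past, len(y) - T + 1):
        # Estimate the initial state of this subsection from past data only.
        x = encoder(u[t - n_past:t], y[t - n_past:t])
        for k in range(T):
            total += (y[t + k] - h(x)) ** 2
            count += 1
            # Advance the state to the next sample.
            x = rk4_step(f_scaled, x, u[t + k], dt)
    return total / count


# Toy data from a known scalar system dx/dt = -x + u with y = x (noise-free).
dt = 0.1
f_true = lambda x, u: -x + u
h = lambda x: x
u = [math.sin(0.3 * k) for k in range(50)]
x, y = 0.5, []
for k in range(50):
    y.append(h(x))
    x = rk4_step(f_true, x, u[k], dt)

# Hand-written encoder for this toy case: since y = x, one step from the last
# past sample reproduces the state at the start of the subsection.
encoder = lambda u_past, y_past: rk4_step(f_true, y_past[-1], u_past[-1], dt)

loss_true = subsection_loss(f_true, h, encoder, u, y, dt, T=10, n_past=3)
loss_wrong = subsection_loss(lambda x, u: -2.0 * x + u, h, encoder, u, y, dt, T=10, n_past=3)
print(loss_true, loss_wrong)
```

In this toy setting the loss is computed for the true dynamics and for a deliberately wrong model, so the two values can be compared directly. In a learning setting, `f`, `h`, and `encoder` would be parameterized functions (for example neural networks) trained jointly by minimizing this loss.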

1. INTRODUCTION

Nonlinear state-space models with a state vector $x(t) \in \mathbb{R}^{n_x}$ are powerful tools in many modern sciences and engineering disciplines for understanding potentially complex dynamical systems. One can distinguish between discrete-time (DT) models, $x_{k+1} = f(x_k, u_k)$, and continuous-time (CT) models, $\frac{dx(t)}{dt} = f(x(t), u(t))$. In general, obtaining DT dynamical models from data is easier than obtaining CT models, since data in computers is represented as discrete elements (e.g. arrays). However, the additional implementation complexity and computational costs associated with identifying CT models can be justified in many cases. First and foremost, from the natural sciences, we know that many systems are compactly described by CT dynamics, which makes the continuity prior of CT models a well-motivated regularization (De Brouwer et al., 2019). It has been observed that this regularization can be beneficial for sample efficiency (De Brouwer et al., 2019), which is a common observation when "including physics" in learning approaches (Karniadakis et al., 2021). Furthermore, the analysis of ODEs is a well-regarded field of study with many powerful results and methods that could further improve model interpretability (Fan et al., 2021), as applied in Bai et al. (2019). Another inherent advantage is that these models can be used with irregularly sampled or missing data (Rudy et al., 2019). Lastly, in the control community, CT models are generally regarded as desirable for control synthesis tasks, as shaping the behavior of the controller is much more intuitive in CT (Garcia et al., 1989). Hence, developing robust and general CT models and estimation methods would be greatly beneficial. In the identification of physical CT systems, it is common to encounter challenges such as external inputs ($u(t)$), noisy measurements, latent states, an unknown measurement function or distribution (e.g. $y(t) = h(x(t))$), the need for accurate long-term predictions, and the need for a sufficiently low computational cost. For instance, all these aspects need to be considered for the cascade tank benchmark

