LEARNING SYSTEM DYNAMICS FROM SENSORY INPUT UNDER OPTIMAL CONTROL PRINCIPLES

Abstract

Identifying the underlying dynamics of actuated physical systems from sensory input is of high interest in control, robotics, and engineering in general. In the context of control problems, existing approaches decouple the construction of the feature space where the dynamics identification process occurs from the target control tasks, potentially leading to a mismatch between feature and real state spaces: the systems may not be controllable in feature space, and synthesized controls may not be applicable in the state space. Borrowing from the Koopman formalism, we propose instead to learn an embedding of both the states and controls in feature spaces where the dynamics are linear, and to include the target control task in the learning objective in the form of a differentiable and robust optimal control problem. We validate this approach with simulation experiments of systems with non-linear dynamics, demonstrating that the controls obtained in feature space can be used to drive the corresponding physical systems and that the learned model can serve for future state prediction.

1. INTRODUCTION

The study of dynamical systems is a key element in understanding most physical phenomena. Such systems are governed by ordinary differential equations of state variables that contain enough information to describe and determine their behavior, and analytical models of these systems are traditionally derived as solutions of the differential equations in question. However, most real-life phenomena are hard to model mathematically in full, for several reasons: they may exhibit very complex dynamics, with intricate and constantly changing interactions with the environment, and the state of the physical systems involved may be unknown or only partially observable. On the other hand, the physical systems themselves, if not their internal states, can be observed through sensory data that provides implicit information about the underlying (and unknown) states. It is therefore natural to leverage measurement data, and a wide range of approaches indeed build representations of systems from past measurements in the form of feature spaces (Brunton et al., 2016b; Arbabi et al., 2018; Bruder et al., 2019; Brunton et al., 2021). These models are of high practical interest since they yield compact representations compared to the density of the measurements (e.g., when measurements are images). They also make it possible to lift the state of the system to a higher-dimensional space where predictive models can be built. However, even when effective, these estimated models and feature spaces remain largely uninterpretable, and using them to solve control problems remains challenging. Linear models, on the contrary, are easily interpretable and enable exact and effective control when coupled with LQR solvers. In particular, the Koopman operator theory (Koopman, 1931) has attracted a lot of interest recently (Proctor et al., 2016; Brunton et al., 2016b; Abraham et al., 2017; Morton et al., 2018; Korda & Mezić, 2018; Arbabi et al., 2018; Brunton et al., 2021).
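To make the coupling of linear models with LQR solvers concrete, the following is a minimal sketch of finite-horizon discrete-time LQR via the backward Riccati recursion, applied to a hypothetical double-integrator system (the matrices, costs, and horizon here are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def lqr_gain(A, B, Q, R, horizon=200):
    """Finite-horizon discrete-time LQR: backward Riccati recursion
    yielding the feedback gain K for the control law u_t = -K x_t."""
    P = Q.copy()
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Hypothetical double integrator: x = [position, velocity], u = acceleration.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)          # state cost
R = np.array([[0.1]])  # control cost

K = lqr_gain(A, B, Q, R)
x = np.array([1.0, 0.0])
for _ in range(100):
    x = A @ x - B @ (K @ x)
print(np.linalg.norm(x))  # closed loop drives the state toward the origin
```

Because the model is linear, the optimal feedback is exact and cheap to compute; this is the property the paper exploits by seeking feature spaces in which the dynamics are linear.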
It guarantees the existence of a linear (if typically infinite-dimensional) representation of the dynamics of the observables (vector-valued functions) defined over the state space. Finite-dimensional approximations have been proposed, and dynamic mode decomposition (DMD) (Schmid, 2010) is of particular interest in this context. In DMD, an approximation of the Perron-Frobenius operator, adjoint to the Koopman operator, is constructed in the form of a matrix that maps one observation to the next. Proctor et al. (2016) first extended DMD to actuated systems and modeled the system dynamics as a linear function of the state representation and the control. Several works have built upon this approximation (Morton et al., 2018; Li et al., 2020), and various methods for estimating the corresponding operators have been proposed (Morton et al.,
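The actuated linear model just described, x_{t+1} ≈ A x_t + B u_t, can be estimated from snapshot data by least squares. Below is a minimal sketch on synthetic data from a hypothetical linear system (the matrices, dimensions, and trajectory length are illustrative assumptions; in practice the snapshots would come from the feature embeddings of sensor measurements):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth actuated linear system used to generate snapshots.
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [0.5]])

T = 200
X = np.zeros((2, T + 1))
U = rng.standard_normal((1, T))  # random excitation input
for t in range(T):
    X[:, t + 1] = A_true @ X[:, t] + B_true @ U[:, t]

# DMD with control: solve X' ≈ [A B] [X; U] in the least-squares sense
# by regressing successor snapshots on stacked states and controls.
Omega = np.vstack([X[:, :-1], U])
G = X[:, 1:] @ np.linalg.pinv(Omega)
A_hat, B_hat = G[:, :2], G[:, 2:]

print(np.allclose(A_hat, A_true, atol=1e-6))  # exact recovery on noiseless data
```

With noiseless linear data and a sufficiently exciting input, the regression recovers the operators exactly; with real measurements, the same least-squares fit yields the best linear approximation in feature space.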

