LEARNING INSTANCE-SOLUTION OPERATOR FOR OPTIMAL CONTROL

Abstract

Optimal control problems (OCPs) involve finding a control function for a dynamical system such that a cost functional is optimized; such problems are central to physical system research in both academia and industry. In this paper, we propose a novel instance-solution operator learning perspective, which solves OCPs in a one-shot manner with no dependence on the explicit expression of the dynamics or on iterative optimization processes. The design is in principle endowed with a substantial speedup in running time, and model reusability is guaranteed by high-quality in- and out-of-distribution generalization. We theoretically validate this perspective by presenting approximation bounds for instance-solution operator learning. Experiments on 7 synthetic environments and a real-world dataset verify the effectiveness and efficiency of our approach. The source code will be made publicly available.

1. INTRODUCTION

The explosion of data for embedding the physical world is reshaping the ways we understand, model, and control dynamical systems. Though control theory has been classically rooted in a model-based design and solving paradigm, the demands of model reusability and the opacity of complex dynamical systems call for a rapprochement of modern control theory, machine learning, and optimization. Recent years have witnessed emerging trends of control theories with successful applications to engineering and scientific research, such as robotics (Krimsky & Collins, 2020), aerospace technology (He et al., 2019), and economics and management (Lapin et al., 2019), etc.

We consider the well-established formulation of optimal control (Kirk, 2004) in a finite time horizon $T = [t_0, t_f]$. Denote $X$ and $U$ as two vector-valued function sets, representing state functions and control functions respectively. Functions in $X$ (resp. $U$) are defined over $T$ and have their outputs in $\mathbb{R}^{d_x}$ (resp. $\mathbb{R}^{d_u}$). State functions $x \in X$ and control functions $u \in U$ are governed by a differential equation. The optimal control problem (OCP) is targeted at finding a control function that minimizes the cost functional $f$ (Lions, 1992; Kirk, 2004; Vinter & Vinter, 2010; Lewis et al., 2012):

$$\min_{u \in U} \; f(x, u) = \int_{t_0}^{t_f} p(x(t), u(t)) \, dt + h(x(t_f)) \tag{1a}$$
$$\text{s.t.} \quad \dot{x}(t) = d(x(t), u(t)), \tag{1b}$$
$$x(t_0) = x_0, \tag{1c}$$

where $d$ is the dynamics of the differential equation; $p$ evaluates the cost alongside the dynamics and $h$ evaluates the cost at the termination state $x(t_f)$; and $x_0$ is the initial state. We restrict our discussion to differential equation-governed optimal control problems, leaving control problems in stochastic networks (Dai & Gluzman, 2022), inventory management (Abdolazimi et al., 2021), etc. out of the scope of this paper. The analytic solution of Eq. 1 is usually unavailable, especially for complex dynamical systems.
Thus, there has been a wealth of research in recent years towards accurate, efficient, and scalable numerical OCP solvers (Rao, 2009) and neural network based solvers (Kiumarsi et al., 2017). However, both classic and modern numerical OCP solvers face challenges that are especially pronounced in the big data era, which we briefly discuss as follows. 1) Opacity of Dynamical Systems. Existing works (Böhme & Frank, 2017a; Effati & Pakdaman, 2013; Jin et al., 2020) assume the dynamical systems are known a priori and exploit their explicit forms to ease

