CRITICAL SAMPLING FOR ROBUST EVOLUTION BEHAVIOR LEARNING OF UNKNOWN DYNAMICAL SYSTEMS

Abstract

We study the following new and important problem: given an unknown dynamical system, what is the minimum number of samples needed for effective learning of its governing laws and accurate prediction of its future evolution behavior, and how should these critical samples be selected? In this work, we explore this problem through a design approach. Specifically, starting from a small initial set of samples, we adaptively discover and collect critical samples to achieve increasingly accurate learning of the system evolution. One central challenge is that the network modeling error, which is needed for critical sampling, cannot be measured because the ground-truth system state is unknown. To address this challenge, we introduce a multi-step reciprocal prediction network where a forward evolution network and a backward evolution network are designed to learn and predict the temporal evolution behavior in the forward and backward time directions, respectively. We find that the network modeling error is highly correlated with the multi-step reciprocal prediction error. More importantly, this multi-step reciprocal prediction error can be directly computed from the current system state without knowing the ground truth or data statistics. This allows us to perform a dynamic selection of critical samples from regions with high network modeling errors and to develop an adaptive sampling-learning method for dynamical systems. To achieve accurate and robust learning from this small set of critical samples, we introduce a joint spatial-temporal evolution network which incorporates spatial dynamics modeling into the temporal evolution prediction for robust learning of the system evolution operator with few samples.
Our extensive experimental results demonstrate that our proposed method is able to dramatically reduce the number of samples needed for effective learning and accurate prediction of evolution behaviors of unknown dynamical systems by up to hundreds of times, especially for high-dimensional dynamical systems.

1. INTRODUCTION

Learning-based methods for modeling complex dynamical systems have recently become an important area of research in machine learning. The behaviors of dynamical systems in the physical world are governed by their underlying physical laws (Bongard & Lipson, 2007; Schmidt & Lipson, 2009). In many areas of science and engineering, ordinary differential equations (ODEs) and partial differential equations (PDEs) play important roles in describing and modeling these physical laws (Brunton et al., 2016; Raissi, 2018; Long et al., 2018; Chen et al., 2018; Raissi et al., 2019; Qin et al., 2019). For data-driven modeling of unknown physical systems from measurement data, two major approaches have been explored. The first approach typically tries to identify all the potential terms in the unknown governing equations from an a priori dictionary, which includes all possible terms that may appear in the equations (Brunton et al., 2016; Schaeffer & McCalla, 2017; Rudy et al., 2017; Raissi, 2018; Long et al., 2018; Wu & Xiu, 2019; Wu et al., 2020; Xu & Zhang, 2021). The second approach approximates the evolution operator of the underlying equations, instead of identifying the terms in the equations (Qin et al., 2019; Wu & Xiu, 2020; Qin et al., 2021a; Li et al., 2021b).

Figure 1: Illustration of the proposed method of critical sampling for accurately learning the evolution behaviors of unknown dynamical systems.

Many existing data-driven approaches for learning the evolution operator assume the availability of sufficient data, and often require a large set of measurement samples to train the neural network, especially for high-dimensional systems.
For example, to effectively learn a neural network model for the 2D Damped Pendulum ODE system, existing methods typically need more than 10,000 samples to achieve sufficient accuracy (Qin et al., 2019; Wu & Xiu, 2020). This number increases dramatically with the dimension of the system. For the 3D Lorenz system, for instance, the number of samples used in the literature often reaches one million. We recognize that, in practical dynamical systems, such as ocean, cardiovascular and climate systems, it is very costly to collect observation samples. This leads to a new and important research question: what is the minimum number of samples needed for robust learning of the laws of an unknown system and accurate prediction of its future evolution behavior?

Adaptive sample selection for network learning, system modeling and identification has been studied in the areas of active learning and optimal experimental design. Methods have been developed for global optimization of experimental sequences (Llamosi et al., 2014), active data sample generation for time-series learning and modeling (Zimmer et al., 2018), Kriging-based sampling for learning the spatio-temporal dynamics of systems (Huang et al., 2022), adaptive training of physics-informed deep neural networks (Zhang & Shafieezadeh, 2022), and data-collection schemes for system identification (Mania et al., 2022). However, within the context of deep neural network modeling of unknown dynamical systems, the following key challenges have not been adequately addressed: (1) How can the prediction error of the deep neural networks be characterized and estimated? (2) Based on this error modeling, how can the critical samples be adaptively selected and the deep neural networks be successfully trained from these few samples?

Figure 1 illustrates the proposed method of critical sampling for accurately learning the evolution behaviors of unknown dynamical systems.
We start with a small set of initial samples, then iteratively discover and collect critical samples to obtain more accurate network modeling of the system. During critical sampling, the basic rule is to select the samples from regions with high network modeling errors so that these selected critical samples can maximally reduce the overall modeling error. However, the major challenge here is that we do not know the network modeling error, i.e., the difference between the system state predicted by the network and the ground truth, which is not available for unknown systems.

To address this challenge, we establish a multi-step reciprocal prediction framework where a forward evolution network and a backward evolution network are designed to learn and predict the temporal evolution behavior in the forward and backward time directions, respectively. Our hypothesis is that, if the forward and backward prediction models are both accurate, then starting from an original state A, performing the forward prediction K times followed by the backward prediction K times should yield a final prediction Ā that matches the original state A. The error between Ā and A is referred to as the multi-step reciprocal prediction error. We find that the network modeling error is highly correlated with the multi-step reciprocal prediction error. Note that the multi-step reciprocal prediction error can be directly computed from the current system state, without the need to know the ground-truth system state. This allows us to perform a dynamic selection of critical samples from regions with high network modeling errors and to develop an adaptive learning method for dynamical systems. To effectively learn the system evolution from this small set of critical samples, we introduce a joint spatial-temporal evolution network structure which couples spatial dynamics learning with temporal evolution learning.
Our extensive experimental results demonstrate that our proposed method is able to dramatically reduce the number of samples needed for effective learning and accurate prediction of evolution behaviors of unknown dynamical systems. This has practical significance, since collecting samples from real-world dynamical systems can be very costly or limited by resource and labor constraints or by experimental accessibility.
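
The multi-step reciprocal prediction error and the error-driven sample selection described above can be illustrated with a minimal sketch. This is not the networks or the training procedure of the proposed method: the forward and backward models here are hypothetical stand-ins (a damped 2D rotation and its inverse, plus a deliberately imperfect inverse), and the names `reciprocal_error`, `select_critical`, `backward_imperfect`, and all numeric values are assumptions chosen only for illustration.

```python
import numpy as np


def reciprocal_error(forward, backward, states, k):
    """Apply `forward` k times, then `backward` k times, and return the
    Euclidean distance between each round-trip result and its start state."""
    start = np.asarray(states, dtype=float)
    current = start.copy()
    for _ in range(k):
        current = forward(current)
    for _ in range(k):
        current = backward(current)
    return np.linalg.norm(current - start, axis=1)


def select_critical(candidates, errors, n_select):
    """Return the n_select candidates with the largest reciprocal error."""
    idx = np.argsort(errors)[::-1][:n_select]
    return candidates[idx], idx


# Hypothetical stand-in dynamics: one step of a damped 2D rotation.
theta = 0.1
M = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
M_inv = np.linalg.inv(M)


def forward(x):
    # Stand-in for a forward evolution model (assumed exact here).
    return x @ M.T


def backward_exact(x):
    # Stand-in for an accurate backward evolution model.
    return x @ M_inv.T


def backward_imperfect(x):
    # Stand-in for a backward model that degrades far from the origin.
    return x @ M_inv.T + 0.01 * x**2


rng = np.random.default_rng(0)
candidates = rng.uniform(-2.0, 2.0, size=(200, 2))

# With accurate models the round trip returns (numerically) to the start.
err_exact = reciprocal_error(forward, backward_exact, candidates, k=5)

# With an imperfect model the error varies across the state space, and the
# candidates with the largest error are picked as critical samples.
err_imperfect = reciprocal_error(forward, backward_imperfect, candidates, k=5)
critical, idx = select_critical(candidates, err_imperfect, n_select=10)
```

Note that neither function needs the ground-truth next state: the error is computed from the candidate states and the two models alone, which is the property that makes it usable as a selection criterion for unknown systems.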

