OOD-CONTROL: OUT-OF-DISTRIBUTION GENERALIZATION FOR ADAPTIVE UAV FLIGHT CONTROL

Abstract

Data-driven control methods have demonstrated precise and agile control of Unmanned Aerial Vehicles (UAVs) in turbulent environments. However, they remain vulnerable to out-of-distribution (OoD) data: their performance degrades when the deployment environment follows a data distribution different from that of the training set. Many studies have designed algorithms to mitigate the OoD problem, a common but difficult problem in machine learning. To tackle the OoD generalization problem in control, we propose a theoretically guaranteed approach: OoD-Control. We prove that, for any perturbation of the states within a certain range, the control error is upper bounded by a constant. We present the OoD-Control algorithm for online adaptive flight control and instantiate it in two settings. Experiments show that systems trained with the proposed OoD-Control algorithm perform well in environments quite different from those seen in training. The control method is also extensible and broadly applicable across dynamical models. OoD-Control is validated on UAV dynamic models, where it achieves state-of-the-art performance on positioning stability and trajectory tracking problems.

1. INTRODUCTION

UAVs have gained considerable attention and are widely used for various purposes because of their high manoeuvrability and flexibility. For example, quadrotors are widely deployed for inspection, reconnaissance, and rescue. As control strategies evolve, novel scenarios for UAVs, such as aerial grasping, transporting, and bridge inspection (Ruggiero et al., 2018), require more precise trajectory tracking. Especially outdoors, unpredictable and changing wind field conditions pose substantial challenges to the stability of UAVs. Rotor blades are affected by wind-induced airflow, which creates complex and non-stationary aerodynamic interactions (see Appendix B.6.3). From security and policy perspectives, demonstrating that UAVs can operate safely and reliably in unpredictable environments with various distributions is an essential requirement; it is also a prerequisite for medical robots, autonomous cars, and manned aerial vehicles to gain wide acceptance.

Many areas have benefited from data-driven approaches, yet these approaches are susceptible to performance degradation under generalization. Moreover, the majority of deep learning algorithms rely heavily on the i.i.d. assumption on data, which is generally violated in practice due to domain shift (Zhou et al., 2022). As a result, neural networks may lose their robustness when confronted with OoD data; many DNN failures originate from shortcut learning during training (Geirhos et al., 2020). The damage to a UAV is undoubtedly considerable if it cannot adjust to a changing environment, i.e., if it becomes unstable or even crashes in an OoD situation. A central objective of this paper is therefore a control algorithm that enables UAVs to maintain accurate control even under environment domain shifts.

Our Contributions.
UAVs interact with the changing environment, producing complex environment-dependent uncertain aerodynamics, called unknown dynamics, that are difficult to model and significantly impact precise control. Previous data-driven controllers attempt to solve the problem by estimating the unknown dynamics, but the estimation accuracy, and hence controller performance, is limited by environment domain shifts at test time. This paper presents a methodology for adaptive flight control, focusing on enabling UAVs to fly in unknown environments. Compared with previous work, our proposed OoD-Control algorithm provides performance guarantees under domain shifts of the environment distribution. Compared with the previous state of the art (Shi et al., 2021), OoD-Control does not require strong assumptions such as e-ISS stability or a fully actuated system, and it has a greater capacity for generalization. For different distributions of the environment, we show theoretically that the bound on the prediction error of the unknown dynamics remains constant over a certain range of perturbations. Moreover, simulated results under challenging aerodynamic conditions indicate that OoD-Control achieves better control performance than state-of-the-art deep learning algorithms.

2. RELATED WORK

2.1. DATA-DRIVEN CONTROL

Artificial intelligence has triggered a new wave of research in many fields (Jumper et al., 2021; Silver et al., 2017). Data-driven control methods can learn a control strategy directly from the interaction process of the controlled system, allowing them to adapt to new environments. Bansal et al. (2016) validate their proposed deep learning algorithm on a quadrotor testbed. Reinforcement learning, a model-free paradigm, is also widely used for control problems: Koch et al. (2019) present a high-precision flight control system for UAVs based on reinforcement learning.
Moreover, the performance and accuracy of the inner control loop for quadrotor attitude control are analyzed and compared; the results indicate that the neural network generalizes well, learns the quadrotor dynamics accurately, and can be applied to the control system. Underwood & Husain (2010) propose an online parameter estimation method, and their experimental results validate the effectiveness of the adaptive control scheme. O'Connell et al. (2022) combine online adaptive control with representation learning, adapting a DNN to learn a nonlinear representation; however, the diversity of environments is not considered in that work, and adapting to an environment completely different from the training set remains challenging. Inspired by Shi et al. (2021), this study constructs mechanics-based models with learnable dynamics and DNNs for their interpretability and stability. We further investigate whether the robustness of the algorithm can be improved with OoD generalization methods.
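The online-adaptation idea discussed above, a fixed (possibly learned) nonlinear representation whose linear coefficients are updated in flight, can be illustrated with a minimal sketch. The basis `features`, the recursive-least-squares update, and all constants below are illustrative assumptions, not the controllers of the cited works:

```python
import numpy as np

def features(x):
    """Hypothetical nonlinear basis for the residual (unknown) dynamics.
    In representation-learning approaches this would be a trained DNN."""
    return np.array([1.0, x[0], x[1], np.sin(x[0]), x[0] * x[1]])

class OnlineResidualEstimator:
    """Recursive least squares: adapt coefficients `a` online so that
    f_hat(x) = a @ features(x) tracks the unknown dynamics term."""

    def __init__(self, dim, lam=0.99):
        self.a = np.zeros(dim)        # adapted linear coefficients
        self.P = 100.0 * np.eye(dim)  # estimate covariance
        self.lam = lam                # forgetting factor

    def predict(self, x):
        return self.a @ features(x)

    def update(self, x, f_measured):
        phi = features(x)
        err = f_measured - self.a @ phi
        k = self.P @ phi / (self.lam + phi @ self.P @ phi)
        self.a = self.a + k * err
        self.P = (self.P - np.outer(k, phi @ self.P)) / self.lam

# Toy usage: recover a residual force f(x) = 0.5*sin(x0) - 0.2*x1 from samples.
est = OnlineResidualEstimator(dim=5)
rng = np.random.default_rng(0)
for _ in range(500):
    x = rng.uniform(-1.0, 1.0, size=2)
    est.update(x, 0.5 * np.sin(x[0]) - 0.2 * x[1])
```

The forgetting factor lets the estimate track dynamics that drift as the environment changes, which is what makes such schemes "adaptive" rather than purely offline.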

2.2. OUT-OF-DISTRIBUTION GENERALIZATION

Out-of-Distribution (OoD) generalization, which concerns generalizing under shifts of the data distribution, is an active research area. The goal is to learn a prediction model whose performance is maintained when the test distribution differs from the training distribution. Many algorithms have been proposed for OoD generalization, including meta-learning (Li et al., 2019; Zhang et al., 2020), prototypical learning (Dubey et al., 2021), gradient alignment (Rame et al., 2022), domain adversarial learning (Akuzawa et al., 2019; Xu et al., 2020), and kernel methods (Li et al., 2018; Ghifary et al., 2016). The literature has extensively discussed how to deal with domain shift, and OoD generalization is well studied in computer vision (Hsu et al., 2020), natural language processing (Hendrycks et al., 2020), and speech recognition (Shankar et al., 2018), but seldom in the context of online control. Shi et al. (2021) present a multi-task learning method for nonlinear systems that can withstand disturbances and unknown environments. Previous studies, however, neither discuss misspecification of the dynamics model nor address the gap between simulation experiments and reality.
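To make the flavour of these OoD objectives concrete, the sketch below implements one simple variant: penalizing the variance of per-environment risks of a linear predictor, in the spirit of risk extrapolation. The data-generating process, penalty weight, and optimizer are illustrative assumptions rather than any cited algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_env(spurious_coef, n=200):
    """One training environment: x1 causes y, while x2 is spuriously
    correlated with y, with an environment-dependent sign."""
    x1 = rng.normal(size=n)
    y = x1 + 0.1 * rng.normal(size=n)
    x2 = spurious_coef * y + 0.1 * rng.normal(size=n)
    return np.stack([x1, x2], axis=1), y

envs = [make_env(+1.0), make_env(-1.0)]

def objective(w, beta=5.0):
    """Average risk plus a penalty on the variance of per-environment
    risks, encouraging error that is invariant across environments."""
    risks = np.array([np.mean((X @ w - y) ** 2) for X, y in envs])
    return risks.mean() + beta * risks.var()

# Plain gradient descent with a numerical gradient (only 2 parameters).
w = np.zeros(2)
for _ in range(3000):
    g = np.array([(objective(w + 1e-5 * e) - objective(w - 1e-5 * e)) / 2e-5
                  for e in np.eye(2)])
    w -= 0.02 * g
# The weight on the causal feature x1 should dominate the spurious x2.
```

The same "equalize performance across training environments" principle underlies several of the methods cited above, even though their penalties and training procedures differ substantially.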



2.3. FLIGHT CONTROL ALGORITHMS

UAVs have found broad applicability in a variety of fields and have attracted the attention of many researchers. Many published studies describe the significance and efficiency of flight control algorithms, including PID control (Szafranski & Czyba, 2011), LQR control (Priyambodo et al., 2020), sliding mode control (Chen et al., 2016), backstepping control (Labbadi & Cherkaoui, 2019), and robust control (Hasseni & Abdou, 2021). However, most of these control methods suffer from limitations: imprecise system modelling and unmodelled environmental disturbances may result in unacceptable performance or instability.
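As a concrete baseline among the classical methods above, a textbook PID loop for altitude hold can be sketched on a simplified plant; the gains, time step, and unit-mass double-integrator model are illustrative assumptions, not a tuned flight controller:

```python
class PID:
    """Textbook PID controller; gains are illustrative, not tuned
    for any real airframe."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_meas = None

    def step(self, setpoint, measurement):
        err = setpoint - measurement
        self.integral += err * self.dt
        # Differentiating the measurement (not the error) avoids the
        # "derivative kick" when the setpoint changes abruptly.
        deriv = 0.0 if self.prev_meas is None else \
            -(measurement - self.prev_meas) / self.dt
        self.prev_meas = measurement
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# Toy usage: altitude hold for a unit-mass 1-D double integrator.
g, dt = 9.81, 0.01
pid = PID(kp=20.0, ki=5.0, kd=10.0, dt=dt)
z, vz = 0.0, 0.0
for _ in range(2000):                  # 20 s of simulated flight
    thrust = g + pid.step(1.0, z)      # gravity feedforward + PID correction
    vz += (thrust - g) * dt
    z += vz * dt
# z settles near the 1 m setpoint.
```

The limitation noted above is visible even in this sketch: the gravity feedforward assumes an exact plant model, and an unmodelled disturbance (e.g., a wind-induced force) must be absorbed slowly by the integral term rather than compensated directly.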

