SAFETY VERIFICATION OF MODEL BASED REINFORCEMENT LEARNING CONTROLLERS

Abstract

Model-based reinforcement learning (RL) has emerged as a promising tool for developing controllers for real-world systems (e.g., robotics, autonomous driving). However, real systems often have constraints imposed on their state space that must be satisfied to ensure the safety of the system and its environment. Developing a verification tool for RL algorithms is challenging because the nonlinear structure of neural networks impedes the analytical verification of the learned models and controllers. To this end, we present a novel safety verification framework for model-based RL controllers based on reachable set analysis. The proposed framework can efficiently handle models and controllers that are represented by neural networks. Additionally, if a controller fails to satisfy the safety constraints in general, the proposed framework can also identify the subset of initial states from which the controller can be safely executed.

1. INTRODUCTION

One of the primary reasons for the growing application of reinforcement learning (RL) algorithms in developing optimal controllers is that RL does not assume a priori knowledge of the system dynamics. Model-based RL explicitly learns a model of the system dynamics from observed samples of state transitions. This learnt model is then used with a planning algorithm to develop optimal controllers for different tasks. Thus, any uncertainties in the system, including environment noise, friction, and air drag, can also be captured by the modeled dynamics. However, the performance of the controller is directly related to how accurately the learnt model represents the true system dynamics. Due to the discrepancy between the learnt model and the true model, the developed controller can behave unexpectedly when deployed on a real physical system, e.g., land robots, UAVs, etc. (Benbrahim & Franklin, 1997; Endo et al., 2008; Morimoto & Doya, 2001). This unexpected behavior may result in the violation of constraints imposed on the system, thereby violating its safety requirements (Moldovan & Abbeel, 2012). Thus, it is necessary to have a framework which can ensure that the controller will satisfy the safety constraints before it is deployed on a real system. This raises the primary question of interest: Given a set of safety constraints imposed on the state space, how do we determine whether a given controller is safe?

In the literature, there have been several works that focus on the problem of ensuring safety. Most of these works incorporate safety constraints in the learning phase to train a controller (policy) to satisfy certain desired specifications or constraints. However, to achieve this goal, some works make the strong assumption of complete and accurate knowledge of the system dynamics (Zheng & Ratliff, 2020; Hasanbeig et al., 2020), which can be difficult to obtain.
Further, to incorporate safety during learning, some works approximate the original problem to represent the safety constraints in a tractable form (Fu et al., 2018; Avni et al., 2019), which reduces the performance of the final trained controller (Fu et al., 2018; Eriksson & Dimitrakakis, 2019; Junges et al., 2016; Könighofer et al., 2020). On the other hand, some works aim at finding a safe controller under the assumption of a known baseline safe policy (Hans et al., 2008; Garcia & Fernández, 2012; Berkenkamp et al., 2017; Thomas et al., 2015; Laroche et al., 2019; Zheng & Ratliff, 2020), or several known safe policies (Perkins & Barto, 2002). However, such safe policies may not be readily available in general. Alternatively, Akametalu et al. (2014) used reachability analysis to develop safe model-based controllers, under the
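To give a concrete sense of the reachable set analysis referred to above, the following is a minimal, hypothetical sketch (not the paper's actual algorithm): it over-approximates the states reachable under a learned neural dynamics model by propagating an axis-aligned interval box through the network's layers, and declares the controller unsafe for the given initial set if any over-approximation leaves a safe box. The network shapes, horizon, and safe-set representation here are illustrative assumptions.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    # Propagate the box [lo, hi] through the affine map W x + b.
    # Positive weights carry lower bounds to lower bounds; negative
    # weights swap them. This yields a sound over-approximation.
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def interval_relu(lo, hi):
    # ReLU is monotone, so it can be applied to the bounds directly.
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def reach_step(lo, hi, layers):
    # layers: list of (W, b) pairs; ReLU between layers, linear output.
    # Returns a box containing all possible next states x_{t+1} = f(x_t).
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = interval_relu(lo, hi)
    return lo, hi

def verify(x_lo, x_hi, dyn_layers, horizon, safe_lo, safe_hi):
    # Over-approximate the reachable set for `horizon` steps and check
    # that it stays inside the safe box. Returning True is a proof of
    # safety; False only means safety could not be certified, since the
    # boxes over-approximate the true reachable states.
    for _ in range(horizon):
        x_lo, x_hi = reach_step(x_lo, x_hi, dyn_layers)
        if np.any(x_lo < safe_lo) or np.any(x_hi > safe_hi):
            return False
    return True
```

Because the interval box is a conservative enclosure, this style of analysis can certify safety but can also be overly pessimistic; tighter set representations (e.g., zonotopes or symbolic bounds) trade computation for precision.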

