PIPS: PATH INTEGRAL STOCHASTIC OPTIMAL CONTROL FOR PATH SAMPLING IN MOLECULAR DYNAMICS

Abstract

We consider the problem of sampling transition paths: given two metastable conformational states of a molecular system, e.g. a folded and an unfolded protein, we aim to sample the most likely transition path between the two states. Sampling such a transition path is computationally expensive because of the high free energy barriers that separate the two states. To circumvent this, previous work has focused on simplifying the trajectories so that they occur along specific molecular descriptors called Collective Variables (CVs). However, finding CVs is nontrivial and requires chemical intuition. For larger molecules, where intuition is not sufficient, CV-based methods bias the transition along possibly irrelevant dimensions. In this work, we propose a method for sampling transition paths that considers the entire geometry of the molecules. We achieve this by relating the problem to recent work on the Schrödinger bridge problem and stochastic optimal control. Using this relation, we construct a path integral method that incorporates important characteristics of molecular systems such as second-order dynamics and invariance to rotations and translations. We demonstrate our method on commonly studied protein structures like Alanine Dipeptide, and also consider larger proteins such as Polyproline and Chignolin.

1. INTRODUCTION

Modeling non-equilibrium systems in the natural sciences involves analyzing dynamical behaviour that occurs with very low probability, known as rare events, i.e. particular instances of the dynamical system that are atypical. The kinetics of many important molecular processes, such as phase transitions, protein folding, conformational changes, and chemical reactions, are all dominated by these rare events. One way to sample these rare events is to follow the time evolution of the underlying dynamical system using Molecular Dynamics (MD) simulations until a reasonable number of events has been observed. However, this is computationally highly inefficient because of the long time-scales involved, which are typically related to the presence of high energy or entropy barriers between the metastable states. Thus, the main problem is: how can we efficiently sample trajectories between metastable states that give rise to these rare but interesting transition events? Numerous enhanced sampling methods, such as steered MD (Jarzynski, 1997), umbrella sampling (Torrie and Valleau, 1977), constrained MD (Carter et al., 1989), transition path sampling (Dellago and Bolhuis, 2009), and many more, have been developed to deal with the problem of rare events in molecular simulation. Most of these methods bias the dynamical system with well-chosen geometric descriptors of the transition (analogous to lower-dimensional features), called collective variables (CVs), that allow the system to overcome high-energy transition barriers and sample these rare events. The performance of these enhanced sampling techniques depends critically on the choice of CVs. However, choosing appropriate CVs for all but the simplest molecular systems is fraught with difficulty, as it relies on human intuition, insights about the molecular system, and trial and error.
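
To make the cost of unbiased simulation concrete, the following is a minimal sketch, not part of the method proposed here, of a brute-force search for a single transition in a toy system. It assumes a one-dimensional double-well potential U(x) = barrier * (x^2 - 1)^2 and overdamped Langevin dynamics integrated with the Euler-Maruyama scheme; the function name `first_passage_steps` and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def first_passage_steps(barrier, kT=1.0, dt=1e-3, max_steps=200_000, seed=0):
    """Count Euler-Maruyama steps until a particle started in the left well
    (x = -1) of U(x) = barrier * (x**2 - 1)**2 first reaches the right well
    (x >= 1). Returns None if no transition occurs within max_steps.

    Toy model (assumption): overdamped Langevin dynamics with unit friction,
    dx = -U'(x) dt + sqrt(2 kT dt) * xi, with xi a standard normal variable.
    """
    rng = np.random.default_rng(seed)
    # Pre-draw all thermal noise increments for the whole trajectory.
    noise = np.sqrt(2.0 * kT * dt) * rng.standard_normal(max_steps)
    x = -1.0
    for step in range(max_steps):
        force = -4.0 * barrier * x * (x * x - 1.0)  # -dU/dx
        x += force * dt + noise[step]
        if x >= 1.0:
            return step + 1
    return None


if __name__ == "__main__":
    # Higher barriers (in units of kT) typically require far more steps
    # before the first transition is observed, if it is observed at all.
    for barrier in (1.0, 3.0, 5.0):
        print(barrier, first_passage_steps(barrier))
```

In this toy setting the expected waiting time grows roughly exponentially with the ratio of barrier height to kT, so almost all of the simulation is spent fluctuating inside the initial well rather than on the transition itself. This is the inefficiency that enhanced sampling methods aim to remove.
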
A key alternative for sampling these rare transition paths is to model an alternative dynamical system that allows these rare trajectories to be sampled in an optimal manner (Ahamed et al., 2006; Jack, 2020; Todorov, 2009), or to learn an optimal RL policy for such a transition system (Rose et al., 2021). In this paper, we consider the problem of sampling rare transition paths by developing an alternative dynamical system using path integral stochastic optimal control (Kappen, 2005; 2007; Kappen and Ruiz, 2016; Theodorou et al., 2010). Our method models this alternative dynamics of the system

