LEARNING NEURAL GENERATIVE DYNAMICS FOR MOLECULAR CONFORMATION GENERATION

Abstract

We study how to generate molecular conformations (i.e., 3D structures) from a molecular graph. Traditional methods, such as molecular dynamics, sample conformations via computationally expensive simulations. Recently, machine learning methods have shown great potential by training on large collections of conformation data. Challenges arise from the limited model capacity for capturing complex distributions of conformations and the difficulty of modeling long-range dependencies between atoms. Inspired by recent progress in deep generative models, we propose a novel probabilistic framework to generate valid and diverse conformations given a molecular graph. Our method combines the advantages of both flow-based and energy-based models, enjoying (1) a high model capacity to estimate the multimodal conformation distribution and (2) the ability to explicitly capture complex long-range dependencies between atoms in the observation space. Extensive experiments demonstrate the superior performance of the proposed method on several benchmarks, including conformation generation and distance modeling tasks, with a significant improvement over existing generative models for molecular conformation sampling.

1. INTRODUCTION

Recently, we have witnessed the success of graph-based representations for molecular modeling in a variety of tasks such as property prediction (Gilmer et al., 2017) and molecule generation (You et al., 2018; Shi et al., 2020). However, a more natural and intrinsic representation of a molecule is its 3D structure, commonly known as the molecular geometry or conformation, which represents each atom by its 3D coordinates. The conformation of a molecule determines its biological and physical properties, such as charge distribution, steric constraints, and interactions with other molecules. Furthermore, large molecules tend to contain a number of rotatable bonds, which may induce flexible conformation changes and a large number of feasible conformations in nature. Generating valid and stable conformations of a given molecule therefore remains very challenging. Experimentally, such structures are determined by expensive and time-consuming crystallography. Computational approaches based on Markov chain Monte Carlo (MCMC) or molecular dynamics (MD) (De Vivo et al., 2016) are also computationally expensive, especially for large molecules (Ballard et al., 2015).

Machine learning methods have recently shown great potential for molecular conformation generation by training on a large collection of data to model the probability distribution of conformations R given the molecular graph G, i.e., p(R|G). For example, Mansimov et al. (2019) proposed a Conditional Variational Graph Autoencoder (CVGAE) for molecular conformation generation: a graph neural network (Gilmer et al., 2017) is first applied to the molecular graph to obtain atom representations, from which 3D coordinates are then generated. One limitation of this approach is that, by generating the 3D coordinates of atoms directly, it fails to respect the rotational and translational invariance of molecular conformations.
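The invariance issue can be checked numerically: pairwise inter-atomic distances are unchanged under any rigid motion, while raw 3D coordinates are not. The toy NumPy illustration below (the `pairwise_distances` helper and the 5-atom point cloud are ours, not part of any cited model) makes this concrete:

```python
import numpy as np

def pairwise_distances(R):
    """All inter-atomic distances for coordinates R of shape (n_atoms, 3)."""
    diff = R[:, None, :] - R[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(0)
R = rng.normal(size=(5, 3))          # a toy 5-atom "conformation"

# Apply a random rigid motion: an orthogonal rotation (via QR) plus a translation.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
t = rng.normal(size=3)
R_moved = R @ Q.T + t

print(np.allclose(pairwise_distances(R), pairwise_distances(R_moved)))  # True
print(np.allclose(R, R_moved))                                          # False
```

A model that outputs coordinates directly must learn this invariance from data, whereas a model that outputs distances gets it for free.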
To address this issue, instead of generating 3D coordinates directly, Simm & Hernández-Lobato (2020) recently proposed to first model the molecule's distance geometry (i.e., the distances between atoms), which is rotationally and translationally invariant, and then generate the molecular conformation from the distance geometry through a post-processing algorithm (Liberti et al., 2014). Similar to Mansimov et al. (2019), a few layers of graph neural networks are applied to the molecular graph to learn representations of the edges, which are then used to generate the distances of different edges independently. This approach produces valid molecular conformations more often.

Although these approaches have made significant progress, the problem remains very challenging and far from solved. First, each molecule may have multiple stable conformations around a number of thermodynamically stable states. In other words, the distribution p(R|G) is complex and multimodal, and models with high capacity are required to capture it. Second, existing approaches usually apply a few layers of graph neural networks to learn node (or edge) representations and then generate the 3D coordinates (or distances) independently from these representations. Because the coordinates or distances are sampled independently, such approaches are limited to capturing a single mode of p(R|G) and cannot model multimodal joint distributions; moreover, the form of the graph neural network computation makes it difficult to capture long-range dependencies between atoms, especially in large molecules. Inspired by recent progress in deep generative models, this paper proposes a novel and principled probabilistic framework for molecular geometry generation that addresses both limitations.
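For intuition on the distance-to-conformation step, one classical route is multidimensional scaling, which recovers coordinates (up to a rigid motion) from a full Euclidean distance matrix via the centered Gram matrix. The sketch below is a generic illustration of that idea, not the specific algorithm of Liberti et al. (2014):

```python
import numpy as np

def coords_from_distances(D, dim=3):
    """Recover coordinates (up to rigid motion) from a full Euclidean
    distance matrix D via classical multidimensional scaling."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    G = -0.5 * J @ (D ** 2) @ J              # Gram matrix of centered coords
    w, V = np.linalg.eigh(G)                 # eigh returns ascending eigenvalues
    w, V = w[::-1][:dim], V[:, ::-1][:, :dim]
    return V * np.sqrt(np.clip(w, 0.0, None))

# Round trip: coordinates -> distance matrix -> coordinates -> distance matrix.
rng = np.random.default_rng(1)
R = rng.normal(size=(6, 3))
D = np.linalg.norm(R[:, None] - R[None, :], axis=-1)
R_rec = coords_from_distances(D)
D_rec = np.linalg.norm(R_rec[:, None] - R_rec[None, :], axis=-1)
print(np.allclose(D, D_rec))  # True
```

The recovered coordinates `R_rec` generally differ from `R` by a rotation/reflection and translation, but they reproduce the same distance geometry, which is all that the invariant representation pins down.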
Our framework combines the advantages of normalizing flows (Dinh et al., 2014) and energy-based approaches (LeCun et al., 2006): it has strong capacity for modeling complex distributions, is flexible enough to model long-range dependencies between atoms, and enjoys efficient sampling and training procedures. Similar to Simm & Hernández-Lobato (2020), we first learn the distribution of distances d given the graph G, i.e., p(d|G), and define another distribution of conformations R given the distances, i.e., p(R|d, G). Specifically, we propose a novel Conditional Graph Continuous Flow (CGCF) for distance geometry (d) generation conditioned on the molecular graph G. Given a molecular graph G, CGCF defines an invertible mapping between a base distribution (e.g., a multivariate normal) and the molecular distance geometry, using a virtually infinite number of graph transformation layers on atoms parameterized by a Neural Ordinary Differential Equations architecture (Chen et al., 2018). This gives the model high flexibility for capturing the complex distribution of distance geometry. Once the distance geometry d is generated, we further generate the 3D coordinates R by searching from the probability p(R|d, G). Although CGCF has a high capacity for modeling complex distributions, the distances of different edges are still updated independently within the transformations, which limits its ability to model long-range dependencies between atoms during sampling. We therefore propose an additional unnormalized probability function, i.e., an energy-based model (EBM) (Hinton & Salakhutdinov, 2006; LeCun et al., 2006; Ngiam et al., 2011), which acts as a tilting term of the flow-based distribution and directly models the joint distribution of R. Specifically, the EBM trains an energy function E(R, G), approximated by a neural network.
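To make the invertible-mapping idea concrete, the toy sketch below Euler-integrates a fixed (untrained, unconditioned) ODE forward to transform base-distribution noise and backward to approximately invert the map. A real CGCF would use learned dynamics conditioned on G and an adaptive ODE solver; the `tanh` dynamics and weight matrix here are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(scale=0.5, size=(4, 4))   # fixed (untrained) toy parameters

def f(z, t):
    """Toy ODE dynamics dz/dt = tanh(z @ W); a trained CGCF would condition
    these dynamics on the molecular graph G."""
    return np.tanh(z @ W)

def integrate(z, t0, t1, steps=1000):
    """Fixed-step Euler integration of the flow from time t0 to t1."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        z = z + h * f(z, t)
        t += h
    return z

z0 = rng.normal(size=(3, 4))       # samples from the base distribution
d = integrate(z0, 0.0, 1.0)        # forward: base noise -> "distance" samples
z0_rec = integrate(d, 1.0, 0.0)    # backward: run the same ODE in reverse

print(np.abs(z0 - z0_rec).max() < 1e-2)  # approximately invertible
```

Running the same dynamics with time reversed recovers the base samples up to discretization error, which is what makes exact likelihood training of continuous flows tractable.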
The flow- and energy-based models are combined in a novel way for joint training and mutual enhancement. First, energy-based methods are usually difficult to train due to the slow sampling process. In addition, the distribution of conformations is usually highly multimodal, and sampling procedures based on Gibbs sampling or Langevin dynamics (Bengio et al., 2013a;b) tend to get trapped around modes, making it difficult to mix between them (Bengio et al., 2013a). Here we use the flow-based model as a proposal distribution for the energy model, which is capable of generating diverse samples for training the EBM. Second, the flow-based model lacks the capacity to explicitly model long-range dependencies between atoms, which we find can be effectively modeled by an energy function E(R, G). Our sampling process can therefore be viewed as a two-stage dynamic system: we first use the flow-based model to quickly synthesize realistic conformations and then use the learned energy E(R, G) to refine them through Langevin dynamics.
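The refinement stage can be sketched with a generic Langevin update. The quadratic energy, step size, and step count below are illustrative stand-ins; in the actual framework E(R, G) is a learned neural network and its gradient is obtained by backpropagation:

```python
import numpy as np

def langevin_refine(R, energy_grad, steps=200, step_size=1e-3, seed=0):
    """Refine an initial conformation R by Langevin dynamics on an energy E:
    R <- R - (step_size / 2) * grad E(R) + sqrt(step_size) * noise."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        noise = rng.normal(size=R.shape)
        R = R - 0.5 * step_size * energy_grad(R) + np.sqrt(step_size) * noise
    return R

# Toy quadratic energy E(R) = 0.5 * ||R - R_star||^2 with known minimum R_star;
# its gradient is simply R - R_star.
R_star = np.zeros((5, 3))
grad = lambda R: R - R_star

R_init = np.full((5, 3), 5.0)            # stand-in for a flow-model sample
R_refined = langevin_refine(R_init, grad)
print(np.linalg.norm(R_refined - R_star) < np.linalg.norm(R_init - R_star))  # True
```

Because the chain starts from a flow sample that is already near a mode, a short Langevin run suffices to sharpen it, sidestepping the mixing problems of sampling the EBM from scratch.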

Work was done during Shitong's internship at Mila.

