EQUIVARIANT ENERGY-GUIDED SDE FOR INVERSE MOLECULAR DESIGN

Abstract

Inverse molecular design is critical in material science and drug discovery, where the generated molecules should satisfy certain desirable properties. In this paper, we propose equivariant energy-guided stochastic differential equations (EEGSDE), a flexible framework for controllable 3D molecule generation under the guidance of an energy function in diffusion models. Formally, we show that EEGSDE naturally exploits the geometric symmetry in 3D molecular conformation, as long as the energy function is invariant to orthogonal transformations. Empirically, under the guidance of designed energy functions, EEGSDE significantly improves the baseline on QM9, in inverse molecular design targeted to quantum properties and molecular structures. Furthermore, EEGSDE is able to generate molecules with multiple target properties by combining the corresponding energy functions linearly.

1. INTRODUCTION

The discovery of new molecules with desired properties is critical in many fields, such as drug and material design (Hajduk & Greer, 2007; Mandal et al., 2009; Kang et al., 2006; Pyzer-Knapp et al., 2015). However, brute-force search in the overwhelming molecular space is extremely challenging. Recently, inverse molecular design (Zunger, 2018) has provided an efficient way to explore the molecular space by directly predicting promising molecules that exhibit desired properties. A natural approach to inverse molecular design is to train a conditional generative model (Sanchez-Lengeling & Aspuru-Guzik, 2018). Formally, such a model learns a distribution of molecules conditioned on certain properties from data, and new molecules are predicted by sampling from the distribution with the condition set to the desired properties. Among such models, equivariant diffusion models (EDM) (Hoogeboom et al., 2022) leverage the current state-of-the-art diffusion models (Ho et al., 2020), which involve a forward process to perturb data and a reverse process to generate 3D molecules conditionally or unconditionally. While EDM generates stable and valid 3D molecules, we argue that a single conditional generative model is insufficient for generating accurate molecules that exhibit desired properties (see Table 1 and Table 3 for an empirical verification). In this work, we propose equivariant energy-guided stochastic differential equations (EEGSDE), a flexible framework for controllable 3D molecule generation under the guidance of an energy function in diffusion models. EEGSDE formalizes the generation process as an equivariant stochastic differential equation, and plugs in energy functions to improve the controllability of generation. Formally, we show that EEGSDE naturally exploits the geometric symmetry in 3D molecular conformation, as long as the energy function is invariant to orthogonal transformations. We apply EEGSDE to various applications by carefully designing task-specific energy functions.
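The symmetry claim rests on a standard calculus fact: the gradient of an invariant function is equivariant. A brief sketch follows, assuming that the energy E is differentiable and that x stacks the atom coordinates into a single vector acted on by an orthogonal matrix R. This notation is illustrative and is not taken from the formal statement of the result.

```latex
% Assumptions (illustrative): E is differentiable, R is orthogonal (R R^\top = I).
\begin{align}
E(Rx) &= E(x)
  && \text{(invariance, for all orthogonal } R\text{)} \\
\nabla_x \big[ E(Rx) \big] &= R^\top (\nabla E)(Rx)
  && \text{(chain rule applied to the left-hand side)} \\
R^\top (\nabla E)(Rx) &= \nabla E(x)
  && \text{(differentiate both sides of the first line)} \\
(\nabla E)(Rx) &= R \, \nabla E(x)
  && \text{(multiply by } R \text{ and use } R R^\top = I\text{)}
\end{align}
```

The last line states that rotating the input rotates the gradient in the same way, which is the property the energy guidance needs in order to respect the geometric symmetry.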
When targeted to quantum properties, EEGSDE generates more accurate molecules than EDM, e.g., reducing the mean absolute error by more than 30% on the dipole moment property. When targeted to specific molecular structures, EEGSDE captures the structure information in molecules better than EDM, e.g., improving the similarity to target structures by more than 10%. Furthermore, EEGSDE is able to generate molecules targeted to multiple properties by combining the corresponding energy functions linearly. These results demonstrate that EEGSDE enables flexible and controllable generation of molecules, providing a smart way to explore the chemical space.

Figure 1: Overview of our EEGSDE. EEGSDE iteratively generates molecules with desired properties (represented by the condition c) by adopting the guidance of energy functions in each step. As the energy function is invariant to rotational transformation R, its gradient (i.e., the energy guidance) is equivariant to R, and therefore the distribution of generated samples is invariant to R.

2. RELATED WORK

1988) and 2D graphs of molecules. These include variational autoencoders (Kusner et al., 2017; Dai et al., 2018; Jin et al., 2018; Simonovsky & Komodakis, 2018; Liu et al., 2018), normalizing flows (Madhawa et al., 2019; Zang & Wang, 2020; Luo et al., 2021), generative adversarial networks (Bian et al., 2019; Assouel et al., 2018), and autoregressive models (Popova et al., 2019; Flam-Shepherd et al., 2021). There are also methods on generating torsion angles in molecules. For

Diffusion models were initially proposed by Sohl-Dickstein et al. (2015). Recently, they have become better understood in theory through connections to score matching and stochastic differential equations (SDE) (Ho et al., 2020; Song et al., 2020). Since then, diffusion models have shown strong empirical performance in many applications (Dhariwal & Nichol, 2021; Ramesh et al., 2022; Chen et al., 2020; Kong et al., 2020). There are also variants proposed to improve or accelerate diffusion models (Nichol & Dhariwal, 2021; Vahdat et al., 2021; Dockhorn et al., 2021; Bao et al., 2022b;a; Salimans & Ho, 2022; Lu et al., 2022).

Guidance is a technique to control the generation process of diffusion models. Initially, Song et al. (2020) and Dhariwal & Nichol (2021) use classifier guidance to generate samples belonging to a class. The guidance is then extended to CLIP (Radford et al., 2021) for text-to-image generation, and to semantic-aware energy (Zhao et al., 2022) for image-to-image translation. Prior guidance methods focus on image data and are nontrivial to apply to molecules, since they do not consider the geometric symmetry. In contrast, our work proposes a general guidance framework for 3D molecules, where an invariant energy function is employed to leverage the geometric symmetry of molecules.
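To make the role of an invariant energy concrete, the following toy NumPy sketch checks numerically that a guidance step built from a linear combination of two invariant energies commutes with an orthogonal transformation. Everything here is an assumption made for illustration: the energies `energy_spread` and `energy_radius`, the weights, the step size, and the finite-difference gradient (standing in for automatic differentiation) are hypothetical and are not the energy functions or the sampler used in this work.

```python
import numpy as np

def energy_spread(x):
    """Toy invariant energy: half the sum of squared pairwise distances.
    x has shape (n_atoms, 3); distances are unchanged by rotations."""
    diff = x[:, None, :] - x[None, :, :]
    return 0.5 * (diff ** 2).sum()

def energy_radius(x, target=1.0):
    """Toy invariant energy: squared gap between the mean squared norm and a target."""
    return (np.mean((x ** 2).sum(-1)) - target) ** 2

def combined_energy(x, weights=(1.0, 0.5)):
    """Linear combination of the two toy energies (hypothetical weights)."""
    return weights[0] * energy_spread(x) + weights[1] * energy_radius(x)

def num_grad(energy, x, eps=1e-5):
    """Central finite-difference gradient of a scalar energy w.r.t. every coordinate."""
    g = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        g[idx] = (energy(x + step) - energy(x - step)) / (2 * eps)
    return g

def guidance_step(x, step_size=0.01):
    """One deterministic descent step along the negative energy gradient."""
    return x - step_size * num_grad(combined_energy, x)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))                    # 5 atoms in 3D
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix

# Rows are atoms, so applying R to every atom is x @ R.T.
invariance_gap = abs(combined_energy(x) - combined_energy(x @ R.T))
equivariance_gap = np.abs(
    num_grad(combined_energy, x) @ R.T - num_grad(combined_energy, x @ R.T)
).max()
step_gap = np.abs(guidance_step(x) @ R.T - guidance_step(x @ R.T)).max()

print(invariance_gap, equivariance_gap, step_gap)
```

All three gaps should be close to zero up to floating-point and finite-difference error. Because each toy energy is invariant, any weighted sum of them is invariant as well, which is why the combined guidance stays equivariant.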

