EXPLORING CHEMICAL SPACE WITH SCORE-BASED OUT-OF-DISTRIBUTION GENERATION

Abstract

A well-known limitation of existing molecular generative models is that the generated molecules closely resemble those in the training set. To generate truly novel molecules with completely different structures that may have even better properties than known molecules for de novo drug discovery, more powerful exploration of the chemical space is necessary. To this end, we propose Molecular Out-Of-distribution Diffusion (MOOD), a novel score-based diffusion scheme that incorporates out-of-distribution (OOD) control in the generative stochastic differential equation (SDE) through simple control of a hyperparameter, and thus requires no additional computational cost, unlike existing methods (e.g., RL-based methods). However, some novel molecules may be chemically implausible, or may not meet the basic requirements of real-world drugs. Thus, MOOD performs conditional generation by utilizing the gradients from a property prediction network that guides the reverse-time diffusion process to high-scoring regions according to multiple target properties such as protein-ligand interactions, drug-likeness, and synthesizability. This allows MOOD to search for novel and meaningful molecules rather than generating unseen yet trivial ones. We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore those found with existing methods, and even the top 0.01% of the original training pool.

1. INTRODUCTION

Finding novel molecules with desired chemical properties is the primary goal of drug discovery. However, the chemical space is vast, and it is infeasible to examine all possible molecules to find those satisfying a target molecule profile. Recently, deep molecule generation models that can automatically generate candidate molecules have emerged as promising substitutes (Gómez-Bombarelli et al., 2016; Lim et al., 2018; Schwalbe-Koda & Gómez-Bombarelli, 2019) for conventional experimental drug discovery, which relies on trial and error and human effort. However, most existing molecule generation models have two limitations that restrict their practical impact.

First of all, the common pitfall of the models based on distributional learning is that the exploration is confined to the training distribution, and the generated molecules closely resemble those in the training set. For example, Walters & Murcko (2020) point out that the top-scoring molecule found by the model of Zhavoronkov et al. (2019) exhibits "striking similarity" to known active molecules included in the training set (see Figure 1 (Left; a1, a2)). This severely limits the applicability of such models to de novo drug discovery, which aims to find completely new molecules rather than slight variations of existing ones, and emphasizes the need for a generation strategy that can generate out-of-distribution (OOD) molecules with desired properties.

Secondly, there exists a discrepancy between the target chemical properties of the molecule generation models and those in real-world scenarios. The most common properties utilized by the molecule generation models are penalized logP and quantitative estimate of drug-likeness (QED) (Jin et al., 2018; You et al., 2018; Shi et al., 2019; Zang & Wang, 2020; Luo et al., 2021c; Liu et al., 2021). However, as criticized by Coley (2020), Cieplinski et al. (2020), and Xie et al. (2020), optimization of these scores may not lead to the discovery of useful drugs. For example, the top-scoring molecule found in terms of penalized logP in the state-of-the-art model is a trivial long chain of the maximum number of carbons (Luo et al., 2021c), since penalized logP prefers large molecules. To overcome such a limitation of conventional property objectives, a few recent works have adopted the docking score, a binding affinity score based on the three-dimensional simulation of a target protein and a drug candidate (Cieplinski et al., 2020). However, the docking score alone is still insufficient as a proxy for drug activity, since heavy molecules with high docking scores are likely to be false positives due to the dependency of the docking score on molecular weight (Pan et al., 2003).

Furthermore, real-world drug discovery involves searching for molecules that meet multiple requirements, for example, protein-ligand interactions, drug-likeness, and synthesizability. Unfortunately, the poor explorability of most existing drug discovery methods makes it difficult to accomplish such multi-objective tasks. As the number of chemical requirements increases, fewer molecules in the training set satisfy the given constraints, and generating molecules that meet all the requirements becomes a harder optimization problem. Thus, to generate molecules that score highly with respect to multiple chemical properties and that are applicable to the real world, we need a method that can explore the chemical space more effectively.

To this end, we propose a novel de novo drug discovery framework for generating OOD molecules that are completely different from those in the training set, but nonetheless satisfy the given constraints. Specifically, we first propose a score-based generative model for OOD generation, by deriving a novel OOD-controlled reverse-time diffusion process that can control the amount of deviation from the data distribution. However, since naïve OOD generation can yield molecules that are chemically implausible, difficult to synthesize, or lacking desired properties, we further extend our framework to perform conditional generation for property optimization. Our Molecular Out-Of-distribution Diffusion (MOOD) framework utilizes the gradient of a property prediction network to guide the sampling process to domains that are highly likely to satisfy the given constraints, while leveraging the proposed OOD control to explore beyond the space of known molecules. MOOD is able to generate molecules that lie beyond the training distribution without additional computational costs, unlike existing methods (e.g., RL-based exploration methods).

We experimentally validate the proposed MOOD on the molecule optimization task, on which MOOD outperforms state-of-the-art molecule generation methods by generating novel molecules with high docking scores while satisfying QED and synthetic accessibility (SA) conditions, demonstrating its ability to effectively explore the chemical space and find chemical optima of multiple requirements. Notably, MOOD discovered a novel molecule (Figure 1 (Left; b1)) with a higher docking score than the top 0.01% of the training dataset.

We summarize our contributions as follows:

• We devise a novel score-based generative model for OOD generation, which overcomes the limited explorability of previous generative models by leveraging our proposed OOD-controlled reverse-time diffusion process that can control the amount of deviation from the data distribution.
• We propose a novel score-based generative framework for molecule optimization which leverages the gradients of the property prediction network to guide the generation process, while extending the exploration space with the OOD control.
• We experimentally demonstrate that our proposed conditional OOD molecule generation framework can generate novel molecules that are drug-like, synthesizable, and have high docking scores on five protein targets, outperforming existing molecule generation methods, and even discovering novel molecules that outscore the top molecules in the original dataset.
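
To make the two ingredients described above concrete, the following is a minimal one-dimensional sketch, not the MOOD implementation. It assumes a toy "training distribution" that is a standard normal (so its score is known in closed form), a hypothetical property predictor f(x) = x, and a variance-preserving SDE with constant noise rate. The names `toy_score`, `toy_property_grad`, `sample`, `lam`, and `guidance` are illustrative, and scaling the data score by (1 - lam) is only one assumed way to expose a single OOD hyperparameter; it should not be read as the derivation used in this paper.

```python
import math
import random
import statistics


def toy_score(x):
    """Score (gradient of the log-density) of a standard normal.

    Assumption: stands in for a learned score network on molecular graphs.
    """
    return -x


def toy_property_grad(x):
    """Gradient of a hypothetical property predictor f(x) = x.

    Assumption: stands in for the gradient of a property prediction network.
    """
    return 1.0


def sample(lam=0.0, guidance=0.0, n_samples=1000, n_steps=250,
           beta=1.0, horizon=5.0, seed=0):
    """Euler-Maruyama simulation of a reverse-time VP-SDE in one dimension.

    lam      : OOD control in [0, 0.5); 0 recovers in-distribution sampling.
               (Illustrative scaling, assumed for this sketch.)
    guidance : weight on the property gradient added to the score.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)  # start from the Gaussian prior
        for _ in range(n_steps):
            # Down-weighted data score plus property-gradient guidance.
            score = (1.0 - lam) * toy_score(x) + guidance * toy_property_grad(x)
            # One step backward in time for dx = -0.5*beta*x dt + sqrt(beta) dw.
            drift = 0.5 * beta * x + beta * score
            x += drift * dt + math.sqrt(beta * dt) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def summarize(samples):
    """Return (mean, variance) of a list of samples."""
    return statistics.fmean(samples), statistics.pvariance(samples)
```

Calling `summarize(sample())`, `summarize(sample(lam=0.3))`, and `summarize(sample(guidance=0.5))` compares the three regimes: in this toy setting, a positive `lam` weakens the pull toward the training distribution and widens the spread of the samples, while a positive `guidance` shifts them toward higher values of the hypothetical property. Neither knob adds training or extra network evaluations beyond the sampling loop itself.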



Figure 1: (Left) The molecules found by GENTRL (Zhavoronkov et al., 2019) and MOOD, together with the training-set molecules most similar to them. Unlike GENTRL, MOOD discovered a novel molecule that differs from every training molecule and has a higher docking score than the top 0.01% of the training set. (Right) Illustration of the reverse-time diffusion process of MOOD. MOOD leverages the OOD-controlled diffusion to extend the exploration boundary and generate OOD samples in the low-density region, while using the property prediction network to guide the sampling process to the high-property region, thereby discovering molecules with desired target properties that lie beyond the training distribution. MOOD-ID is the variant of MOOD that only utilizes the property prediction network without the OOD control.

