DIFFDOCK: DIFFUSION STEPS, TWISTS, AND TURNS FOR MOLECULAR DOCKING

Abstract

Predicting the binding structure of a small molecule ligand to a protein, a task known as molecular docking, is critical to drug design. Recent deep learning methods that treat docking as a regression problem have decreased runtime compared to traditional search-based methods but have yet to offer substantial improvements in accuracy. We instead frame molecular docking as a generative modeling problem and develop DIFFDOCK, a diffusion generative model over the non-Euclidean manifold of ligand poses. To do so, we map this manifold to the product space of the degrees of freedom (translational, rotational, and torsional) involved in docking and develop an efficient diffusion process on this space. Empirically, DIFFDOCK obtains a 38% top-1 success rate (RMSD < 2 Å) on PDBBind, significantly outperforming the previous state-of-the-art of traditional docking (23%) and deep learning (20%) methods. Moreover, while previous methods are not able to dock on computationally folded structures (maximum accuracy 10.4%), DIFFDOCK maintains significantly higher precision (21.7%). Finally, DIFFDOCK has fast inference times and provides confidence estimates with high selective accuracy.

1. INTRODUCTION

The biological functions of proteins can be modulated by small molecule ligands (such as drugs) binding to them. Thus, a crucial task in computational drug design is molecular docking: predicting the position, orientation, and conformation of a ligand when bound to a target protein, from which the effect of the ligand (if any) might be inferred. Traditional approaches for docking [Trott & Olson, 2010; Halgren et al., 2004] rely on scoring functions that estimate the correctness of a proposed structure or pose, and an optimization algorithm that searches for the global maximum of the scoring function. However, since the search space is vast and the landscape of the scoring functions rugged, these methods tend to be too slow and inaccurate, especially for high-throughput workflows.

Recent works [Stärk et al., 2022; Lu et al., 2022] have developed deep learning models to predict the binding pose in one shot, treating docking as a regression problem. While these methods are much faster than traditional search-based methods, they have yet to demonstrate significant improvements in accuracy. We argue that this may be because the regression-based paradigm corresponds imperfectly with the objectives of molecular docking, which is reflected in the fact that standard accuracy metrics resemble the likelihood of the data under the predictive model rather than a regression loss. We thus frame molecular docking as a generative modeling problem: given a ligand and target protein structure, we learn a distribution over ligand poses.

To this end, we develop DIFFDOCK, a diffusion generative model (DGM) over the space of ligand poses for molecular docking. We define a diffusion process over the degrees of freedom involved in docking: the position of the ligand relative to the protein (locating the binding pocket), its orientation in the pocket, and the torsion angles describing its conformation.
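As a rough illustration of these three groups of degrees of freedom, the following Python sketch (using NumPy) applies a translation, a rotation about the ligand centroid, and a single torsion update to a set of atom coordinates. The function names, the axis-angle parameterization, and the toy four-atom chain are assumptions made for illustration only; they are not taken from the DIFFDOCK implementation, and the sketch ignores any coupling between torsional and rigid-body updates.

```python
import numpy as np

def rotation_matrix(rotvec):
    """Rodrigues' formula: axis-angle vector (unit axis * angle in radians) -> 3x3 matrix."""
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        return np.eye(3)
    k = rotvec / angle
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def apply_rigid(coords, translation, rotvec):
    """Rotate the ligand about its centroid, then translate it."""
    centroid = coords.mean(axis=0)
    R = rotation_matrix(rotvec)
    return (coords - centroid) @ R.T + centroid + translation

def apply_torsion(coords, bond, moving_atoms, angle):
    """Rotate `moving_atoms` by `angle` (radians) about the axis through bond (i, j)."""
    i, j = bond
    axis = coords[j] - coords[i]
    axis = axis / np.linalg.norm(axis)
    R = rotation_matrix(axis * angle)
    out = coords.copy()
    out[moving_atoms] = (coords[moving_atoms] - coords[j]) @ R.T + coords[j]
    return out

# Hypothetical four-atom chain (coordinates in angstroms, chosen arbitrarily).
ligand = np.array([[0.0, 0.0, 0.0],
                   [1.5, 0.0, 0.0],
                   [2.0, 1.4, 0.0],
                   [3.5, 1.4, 0.0]])

# One pose update: twist atom 3 about the 1-2 bond, then move the whole ligand.
twisted = apply_torsion(ligand, bond=(1, 2), moving_atoms=[3], angle=np.pi)
pose = apply_rigid(twisted,
                   translation=np.array([1.0, 2.0, 3.0]),
                   rotvec=np.array([0.3, -0.2, 0.5]))
```

In this parameterization, the rigid-body part leaves all interatomic distances unchanged, while the torsion changes the ligand conformation without altering bond lengths along the rotated bond.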
DIFFDOCK samples poses by running the learned (reverse) diffusion process, which iteratively transforms an uninformed, noisy prior distribution over ligand poses into the learned model distribution (Figure 1). Intuitively, this process can be viewed as the progressive refinement of random poses via updates of their translations, rotations, and torsion angles.

* Equal contribution. Correspondence to {gcorso, hstark, bjing}@mit.edu.
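The idea of progressively refining random poses can be illustrated with a toy example restricted to the translational component. The Python sketch below runs a discretized reverse-time, variance-exploding diffusion in three dimensions, starting from a broad Gaussian prior. Everything specific here is an assumption for illustration: the target distribution is a hypothetical Gaussian around a made-up pocket center `mu`, its exact score replaces the learned score model, and the noise schedule and step count are arbitrary. Rotations and torsion angles are not handled.

```python
import numpy as np

def toy_score(x, sigma, mu, s0):
    """Exact score of N(mu, (s0^2 + sigma^2) I); a stand-in for a learned score model."""
    return -(x - mu) / (s0**2 + sigma**2)

def reverse_diffusion(mu, s0=0.5, sigma_max=10.0, sigma_min=0.01,
                      n_steps=50, n_samples=2000, seed=0):
    """Refine random translations toward the toy target by reverse diffusion."""
    rng = np.random.default_rng(seed)
    # Geometric noise schedule, from the largest to the smallest noise level.
    sigmas = np.geomspace(sigma_max, sigma_min, n_steps)
    # Uninformed prior: broad Gaussian centered at the origin.
    x = rng.normal(scale=sigma_max, size=(n_samples, 3))
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        step = s_cur**2 - s_next**2  # decrease in noise variance over this step
        noise = rng.normal(size=x.shape)
        # Drift along the score, plus freshly injected noise scaled to the step.
        x = x + step * toy_score(x, s_cur, mu, s0) + np.sqrt(step) * noise
    return x

# Hypothetical pocket center (angstroms); the values are arbitrary.
mu = np.array([3.0, -1.0, 2.0])
samples = reverse_diffusion(mu)
```

Under these assumptions, the samples start spread over tens of angstroms and end concentrated around `mu`, which mirrors the refinement picture described above in the simplest possible setting.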

