E3BIND: AN END-TO-END EQUIVARIANT NETWORK FOR PROTEIN-LIGAND DOCKING

Abstract

In silico prediction of the ligand binding pose to a given protein target is a crucial but challenging task in drug discovery. This work focuses on flexible blind self-docking, where we aim to predict the positions, orientations and conformations of docked molecules. Traditional physics-based methods usually suffer from inaccurate scoring functions and high inference costs. Recently, data-driven methods based on deep learning techniques have attracted growing interest thanks to their efficiency during inference and promising performance. These methods usually either adopt a two-stage approach, first predicting the distances between proteins and ligands and then generating the final coordinates from the predicted distances, or directly predict the global roto-translation of ligands. In this paper, we take a different route. Inspired by the resounding success of AlphaFold2 for protein structure prediction, we propose E3Bind, an end-to-end equivariant network that iteratively updates the ligand pose. E3Bind models the protein-ligand interaction through careful consideration of the geometric constraints in docking and the local context of the binding site. Experiments on standard benchmark datasets demonstrate the superior performance of our end-to-end trainable model compared to traditional and recently proposed deep learning methods.

1. INTRODUCTION

For nearly a century, small molecules, or organic compounds with small molecular weight, have been the major weapon of the pharmaceutical industry. They take effect by ligating (binding) to their target, usually a protein, to alter the molecular pathways of diseases. The structure of the protein-ligand interface holds the key to understanding the potency, mechanisms and potential side effects of small molecule drugs. Despite huge efforts made for protein-ligand complex structure determination, there are so far only some 10^4 protein-ligand complex structures available in the Protein Data Bank (PDB) (Berman et al., 2000), a number dwarfed by the enormous combinatorial space of possible complexes between 10^60 drug-like molecules (Hert et al., 2009; Reymond & Awale, 2012) and at least 20,000 human proteins (Gaudet et al., 2017; Consortium, 2019), highlighting the urgent need for in silico protein-ligand docking methods. Furthermore, a fast and accurate docking tool capable of predicting binding poses for molecules yet to be synthesized would empower mass-scale virtual screening (Lyu et al., 2019), a vital step in modern structure-based drug discovery (Ferreira et al., 2015). It also provides pharmaceutical scientists with an interpretable, information-rich result. While crucial, predicting the docked pose of a ligand is also challenging. Traditional docking methods (Halgren et al., 2004; Morris et al., 1996; Trott & Olson, 2010; Coleman et al., 2013) rely on physics-inspired scoring functions and extensive conformation sampling to obtain the predicted binding pose. Some deep learning methods focus on learning a more accurate scoring function (McNutt et al., 2021; Méndez-Lucio et al., 2021), but at the cost of even lower inference speed due to their adoption of the sampling-scoring framework.
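To make the sampling-scoring framework concrete, the following is a minimal toy sketch, not the procedure of any cited docking program. It assumes a rigid ligand, rotations restricted to the z-axis, and a hypothetical contact/clash score; the names `toy_score` and `sample_and_score` and all thresholds are illustrative assumptions.

```python
import math
import random

def rotate_z(p, angle):
    """Rotate a 3D point about the z-axis (a deliberately restricted rotation)."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def apply_pose(ligand, angle, shift):
    """Apply a rigid transform: rotate each atom, then translate it."""
    out = []
    for p in ligand:
        x, y, z = rotate_z(p, angle)
        out.append((x + shift[0], y + shift[1], z + shift[2]))
    return out

def toy_score(ligand, protein, contact=4.0, clash=2.0):
    """Hypothetical score (lower is better): reward contacts, penalize clashes."""
    score = 0.0
    for a in ligand:
        for b in protein:
            d = math.dist(a, b)
            if d < clash:
                score += 10.0 * (clash - d)  # steric clash penalty
            elif d < contact:
                score -= 1.0                 # favorable contact
    return score

def sample_and_score(ligand, protein, n_samples=2000, box=8.0, seed=0):
    """Draw random rigid poses and keep the best-scoring one."""
    rng = random.Random(seed)
    best_pose = apply_pose(ligand, 0.0, (0.0, 0.0, 0.0))
    best_score = toy_score(best_pose, protein)
    for _ in range(n_samples):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        shift = tuple(rng.uniform(-box, box) for _ in range(3))
        pose = apply_pose(ligand, angle, shift)
        s = toy_score(pose, protein)
        if s < best_score:
            best_pose, best_score = pose, s
    return best_pose, best_score
```

Every sampled pose requires a full evaluation of the score, so the cost grows with the number of samples times the number of ligand-protein atom pairs. This is the inference burden the text attributes to sampling-based methods, and replacing the toy score with a learned one does not remove the outer sampling loop.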
Distinct from the above methods, TankBind (Lu et al., 2022) drops the burden of conformation sampling by predicting the protein-ligand distance map, then converting the distance map to a docked pose using gradient descent. The

