CONDITIONAL ANTIBODY DESIGN AS 3D EQUIVARIANT GRAPH TRANSLATION

Abstract

Antibody design is valuable for therapeutic usage and biological research. Existing deep-learning-based methods encounter several key issues: 1) incomplete context for the generation of Complementarity-Determining Regions (CDRs); 2) inability to capture the entire 3D geometry of the input structure; 3) inefficient prediction of the CDR sequences in an autoregressive manner. In this paper, we propose the Multi-channel Equivariant Attention Network (MEAN) to co-design the 1D sequences and 3D structures of CDRs. Specifically, MEAN formulates antibody design as a conditional graph translation problem by importing extra components, including the target antigen and the light chain of the antibody. MEAN then uses E(3)-equivariant message passing, together with a proposed attention mechanism, to better capture the geometrical correlation between different components. Finally, it outputs both the 1D sequences and the 3D structure via a multi-round progressive full-shot scheme, which is more efficient and precise than previous autoregressive approaches. Our method significantly surpasses state-of-the-art models in sequence and structure modeling, antigen-binding CDR design, and binding affinity optimization. Specifically, the relative improvement over baselines is about 23% in antigen-binding CDR design and 34% in affinity optimization.

1. INTRODUCTION

Antibodies are Y-shaped proteins used by our immune system to capture specific pathogens. They show great potential in therapeutic usage and biological research because of their strong specificity: each type of antibody usually binds to a unique kind of protein, called an antigen (Basu et al., 2019). The binding areas are mainly located at the so-called Complementarity-Determining Regions (CDRs) of antibodies (Kuroda et al., 2012). Therefore, the critical problem of antibody design is to identify CDRs that bind to a given antigen with desirable properties such as high affinity and colloidal stability (Tiller & Tessier, 2015). Considerable effort has gone into antibody design with deep generative models (Saka et al., 2021; Jin et al., 2021). Traditional methods focus on modeling only the 1D CDR sequences, while a recent work (Jin et al., 2021) proposes to co-design the 1D sequences and 3D structures via a Graph Neural Network (GNN). Despite this progress, existing approaches are still weak in modeling the spatial interaction between antibodies and antigens. For one thing, the context information is insufficiently considered. Existing works (Liu et al., 2020; Jin et al., 2021) only characterize the relation between CDRs and the backbone context of the same antibody chain, without involving the target antigen or the other antibody chains, and may therefore lack the clues needed to reflect certain important properties for antibody design, such as binding affinity. For another, they are still incapable of capturing the entire 3D geometry of the input structures. One vital property of 3D biology is that each structure (molecule, protein, etc.) should be independent of the observation view, exhibiting E(3)-equivariance¹. To fulfill

