CONFORMATION-GUIDED MOLECULAR REPRESENTATION WITH HAMILTONIAN NEURAL NETWORKS

Abstract

Well-designed molecular representations (fingerprints) are vital for combining medicinal chemistry and deep learning. Whereas incorporating the 3D geometry of molecules (i.e., conformations) into their representations appears beneficial, current 3D algorithms are still in their infancy. In this paper, we propose a novel molecular representation algorithm that preserves the 3D conformations of molecules with a Molecular Hamiltonian Network (HamNet). In HamNet, implicit positions and momenta of atoms in a molecule interact in the Hamiltonian Engine following the discretized Hamiltonian equations. These implicit coordinates are supervised with real conformations via translation- and rotation-invariant losses, and are further used as inputs to the Fingerprint Generator, a message-passing neural network. Experiments show that the Hamiltonian Engine preserves molecular conformations well, and that the fingerprints generated by HamNet achieve state-of-the-art performance on MoleculeNet, a standard molecular machine learning benchmark.

1. INTRODUCTION

The past several years have seen a growing intersection between medicinal chemistry and deep learning. Remarkable progress has been made in various applications on small molecules, ranging from generation (Jin et al., 2018; You et al., 2018) and property prediction (Gilmer et al., 2017; Cho & Choi, 2019; Klicpera et al., 2020) to protein-ligand interaction analysis (Lim et al., 2019; Wang et al., 2020), yet all these tasks rely on well-designed numerical representations, or fingerprints, of molecules. These fingerprints encode molecular structures and serve as the inputs to downstream tasks. Early work on molecular fingerprints (Morgan, 1965; Rogers & Hahn, 2010) started from encoding the two-dimensional (2D) structures of molecules, i.e., the chemical bonds between atoms, often stored as atom-bond graphs. More recently, a trend of incorporating molecular geometry into the representations has arisen (Axen et al., 2017; Cho & Choi, 2019). Molecular geometry refers to the conformation of a molecule, i.e., the three-dimensional (3D) coordinates of its atoms, which contains chemical information of wide interest such as bond lengths and angles, and is thus vital for determining the physical, chemical, and biomedical properties of the molecule. Although incorporating the 3D geometry of molecules does seem beneficial, 3D fingerprints, especially in combination with deep learning, are still in their infancy. Their use is limited by pragmatic considerations including i) calculation costs, ii) translational and rotational invariance, and iii) the availability of conformations, especially for the generated ligand candidates in drug discovery tasks. Furthermore, compared with current 3D algorithms, mature 2D fingerprints (Rogers & Hahn, 2010; Gilmer et al., 2017; Xiong et al., 2020) are generally more popular, with equivalent or even better performance in practice.
For example, as a 2D approach, Attentive Fingerprints (Attentive FP) (Xiong et al., 2020) has become the de facto state of the art. To push the boundaries of leveraging 3D geometry in molecular fingerprints, we propose HamNet (Molecular Hamiltonian Network). HamNet simulates the process of molecular dynamics (MD) to model the conformations of small molecules, based on which final fingerprints are calculated similarly to Xiong et al. (2020). To address the potential lack of labeled conformations, HamNet does not regard molecular conformations as always-available inputs. Instead, a Hamiltonian Engine is designed to reconstruct known conformations and generalize to unknown ones. Encoded from atom features, implicit positions and momenta of atoms interact in the engine following the discretized Hamiltonian equations with learnable energy and dissipation functions. The final positions are supervised with real conformations and further used as inputs to a Message-Passing Neural Network (MPNN) (Gilmer et al., 2017) to generate the fingerprints. Novel loss functions with translational and rotational invariance are proposed to supervise the Hamiltonian Engine, and the architecture of the Fingerprint Generator is elaborated to better incorporate the output quantities of the engine. Our conformation-reconstruction experiments show that the proposed Hamiltonian Engine predicts molecular conformations better than conventional geometric approaches as well as common neural structures (MPNNs). We also evaluate HamNet on several datasets with different targets collected in MoleculeNet (Wu et al., 2017), a standard molecular machine learning benchmark, all following the same experimental setups. HamNet demonstrates state-of-the-art performance, outperforming baselines including both 2D and 3D approaches.

2. PRELIMINARIES

Notations. Given a molecule with $n$ atoms, we use $v_i$ to denote the features of atom $i$, and $e_{ij}$ those of the chemical bond between $i$ and $j$ (if it exists). Bold upper-case letters denote matrices, and lower-case letters vectors. All vectors in this paper are column vectors, and $\cdot^\top$ stands for the transpose operation. We use $\oplus$ for the concatenation of vectors. The position and momentum of atom $i$ are denoted $q_i$ and $p_i$, and the set of all atom positions in a molecule is denoted $Q = (q_1, \cdots, q_n)$. $N(v)$ refers to the neighborhood of node $v$ in some graph.

Graph Convolutional Networks (GCNs). Given an attributed graph $G = (V, A, X)$, where $V = \{v_1, \cdots, v_n\}$ is the set of vertices, $A \in \mathbb{R}^{n \times n}$ the (weighted) adjacency matrix, and $X \in \mathbb{R}^{n \times d}$ the attribute matrix, GCNs (Kipf & Welling, 2017) calculate the hidden states of graph nodes as
$$\mathrm{GCN}^{(L)}(X) \equiv H^{(L)}, \quad H^{(l+1)} = \sigma\!\left(\hat{A} H^{(l)} W^{(l)}\right), \quad H^{(0)} = X, \quad l = 0, 1, \cdots, L-1. \tag{1}$$
Here, $H = (h_{v_1}, \cdots, h_{v_n})$ are the hidden representations of nodes, $\hat{A} = D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ is the normalized adjacency matrix, $D$ with $D_{ii} = \sum_j A_{ij}$ is the diagonal matrix of node degrees, and the $W^{(l)}$ are network parameters.

Message-Passing Neural Networks (MPNNs). MPNN (Gilmer et al., 2017) introduced a general framework for Graph Neural Networks (GNNs). In the $t$-th layer of a typical MPNN, messages $m^t$ are generated between connected nodes $(i, j)$ based on the hidden representations $h^t$ of both nodes and the edge in between. Nodes then receive the messages and update their own hidden representations. A readout function over the final node representations $h^T$ derives the graph-level representation. In formulas, the calculation follows
$$m_v^{t+1} = \sum_{w \in N(v)} M_t(h_v^t, h_w^t, e_{v,w}), \quad h_v^{t+1} = U_t(h_v^t, m_v^{t+1}), \quad \hat{y} = R(\{h_v^T \mid v \in V\}),$$
where $M_t$, $U_t$, $R$ are the message, update, and readout functions.

Hamiltonian Equations. The Hamiltonian equations recast Newton's laws of motion as first-order differential equations. Considering a system of $n$ particles with positions $(q_1, \cdots, q_n)$ and momenta $(p_1, \cdots, p_n)$, the dynamics of the system follow
$$\dot{q}_i \equiv \frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \quad \dot{p}_i \equiv \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}, \tag{2}$$
where $H$ is the Hamiltonian of the system and equals the total system energy. Generally, the Hamiltonian comprises the kinetic energies of all particles and the potential energy: $H = \sum_{i=1}^{n} T_i + U$.
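The GCN layer above can be sketched in a few lines of NumPy. This is a minimal illustration of Eq. (1) only, not part of HamNet: the ReLU nonlinearity, the toy path graph with self-loops, and the layer sizes are all illustrative assumptions.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = sigma(A_hat @ H @ W), with ReLU as sigma."""
    d = A.sum(axis=1)                      # node degrees D_ii
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt    # A_hat = D^{-1/2} A D^{-1/2}
    return np.maximum(A_hat @ H @ W, 0.0)  # ReLU nonlinearity

# Toy 3-node path graph with self-loops added (a common preprocessing choice)
A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
X = np.eye(3)                              # one-hot node attributes, d = 3
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))            # layer parameters, hidden size 4
H1 = gcn_layer(A, X, W)
print(H1.shape)                            # (3, 4)
```

Stacking such layers with fresh parameter matrices gives the L-layer network GCN^(L)(X).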
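A minimal concrete instance of the message-passing framework might look as follows. The specific choices here (tanh-linear message and update functions, sum aggregation over undirected edges, sum readout) are hypothetical simplifications for illustration; the framework admits many variants.

```python
import numpy as np

def mpnn_step(h, edges, e_feat, Wm, Wu):
    """One layer: m_v = sum_{w in N(v)} M(h_v, h_w, e_vw); h_v <- U(h_v, m_v).
    M and U are simple tanh-linear maps in this sketch."""
    m = np.zeros_like(h)
    for (v, w), e in zip(edges, e_feat):
        m[v] += np.tanh(Wm @ np.concatenate([h[w], e]))
        m[w] += np.tanh(Wm @ np.concatenate([h[v], e]))  # undirected edge
    return np.tanh(np.concatenate([h, m], axis=1) @ Wu.T)

def readout(h):
    """Permutation-invariant graph-level representation."""
    return h.sum(axis=0)

rng = np.random.default_rng(0)
d, de = 4, 2
h = rng.standard_normal((3, d))          # 3 atoms, hidden size 4
edges = [(0, 1), (1, 2)]                 # bonds of a 3-atom chain
e_feat = rng.standard_normal((2, de))    # bond features
Wm = rng.standard_normal((d, d + de))
Wu = rng.standard_normal((d, 2 * d))
h = mpnn_step(h, edges, e_feat, Wm, Wu)
y = readout(h)
print(y.shape)                           # (4,)
```

The sum aggregation and sum readout make the graph representation invariant to atom reordering, which is essential for molecules.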
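Discretizing Eq. (2) yields an update rule that can be iterated numerically. The sketch below uses the semi-implicit (symplectic) Euler scheme on a unit-mass harmonic oscillator, whose fixed potential stands in for the learnable energy functions described later; the scheme, potential, and step size are illustrative assumptions, not the paper's exact engine.

```python
import numpy as np

def symplectic_euler(q, p, grad_U, mass, dt, steps):
    """Iterate the discretized Hamiltonian equations:
    p_dot = -dH/dq = -dU/dq,   q_dot = dH/dp = p / m."""
    for _ in range(steps):
        p = p - dt * grad_U(q)   # momentum update from the potential
        q = q + dt * p / mass    # position update from the new momentum
    return q, p

# Unit-mass harmonic oscillator: U(q) = q^2 / 2, so H = p^2/2 + q^2/2
q0, p0 = np.array([1.0]), np.array([0.0])
q, p = symplectic_euler(q0, p0, grad_U=lambda q: q, mass=1.0,
                        dt=0.01, steps=1000)
H = 0.5 * p @ p + 0.5 * q @ q
print(abs(H - 0.5) < 0.01)  # True: energy stays near its initial value 0.5
```

Symplectic integrators are the standard choice in molecular dynamics precisely because they keep the energy H nearly conserved over long trajectories, unlike plain explicit Euler.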




