FORCENET: A GRAPH NEURAL NETWORK FOR LARGE-SCALE QUANTUM CHEMISTRY SIMULATION

Abstract

Machine Learning (ML) has the potential to dramatically accelerate large-scale physics-based simulations. However, practical models for large-scale, complex problems remain out of reach. Here we present ForceNet, a model for accurate and fast quantum chemistry simulations to accelerate catalyst discovery for renewable energy applications. ForceNet is a graph neural network that uses the surrounding 3D molecular structure to estimate per-atom forces, a central capability for performing atomic simulations. The key challenge is to accurately capture the highly complex and non-linear quantum interactions of atoms in 3D space, on which the forces depend. To this end, ForceNet adopts (1) an expressive message-passing architecture, (2) a careful choice of basis and non-linear activation functions, and (3) model scaling in terms of network depth and width. We show that ForceNet reduces the estimation error of atomic forces by 30% compared to existing ML models and generalizes well to out-of-distribution structures. Finally, we apply ForceNet to the large-scale catalyst dataset OC20 and use it to perform quantum chemistry simulations, where it achieves a 4× higher success rate than existing ML models. Overall, we demonstrate the potential for ML-based simulations to achieve practical usefulness while being orders of magnitude faster than physics-based simulations.

1. INTRODUCTION

Learning models for simulating complex physical systems has attracted much recent attention (Sanchez-Gonzalez et al., 2020; Bapst et al., 2020; Kipf et al., 2018; Battaglia et al., 2016; Gilmer et al., 2017; Schütt et al., 2017; Klicpera et al., 2020). The premise is that once an accurate ML-based simulator is obtained, it can perform inference orders of magnitude faster than the original underlying physics-based simulator. Many existing works have focused on relatively small-scale, simple domains where physics-based simulation is cheap (i.e., on the order of seconds), such as simulating springs and oscillators (Battaglia et al., 2016; Kipf et al., 2018), fluids and rigid solids (Sanchez-Gonzalez et al., 2020) that follow classical Newtonian dynamics, and glassy systems (Bapst et al., 2020) that follow the basic Lennard-Jones potential. Other approaches have explored complex domains with smaller systems, such as small organic molecules (Ramakrishnan et al., 2014; Schütt et al., 2017; Klicpera et al., 2020). An important open question is whether ML-based simulation is effective in larger and more complex quantum chemistry domains. If successful, ML could be applied to problems such as catalyst discovery, which is key to solving many societal and energy challenges including solar fuels synthesis, long-term energy storage, and renewable fertilizer production (Seh et al., 2017; Jouny et al., 2018; Whipple & Kenis, 2010). Currently, Density Functional Theory (DFT) (Parr, 1980) is a popular and reliable, but computationally expensive, approach to quantum chemistry simulation tasks such as atomic structure relaxation (Figure 1, left) and molecular dynamics. Approximating DFT with ML models is an exceptionally challenging task given a simulation's sensitivity to the atoms' elements and to subtle changes in the atoms' positions.
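To make the structure-relaxation task concrete, the loop below sketches how a learned force model can drive a relaxation: predicted forces move the atoms downhill in energy until the largest per-atom force falls below a threshold. This is a minimal steepest-descent sketch, not the paper's optimizer (the experiments use L-BFGS), and `predict_forces` is a placeholder for any trained model.

```python
import numpy as np

def relax_structure(positions, predict_forces, step_size=0.01, n_steps=200, fmax=0.05):
    """Steepest-descent relaxation driven by a learned force model.

    `predict_forces` is a placeholder for any model (e.g. a trained GNN)
    mapping an (N, 3) array of atom positions to an (N, 3) array of forces.
    """
    pos = positions.copy()
    for _ in range(n_steps):
        forces = predict_forces(pos)
        # Stop once the largest per-atom force norm falls below the threshold.
        if np.linalg.norm(forces, axis=1).max() < fmax:
            break
        # Atoms move along the predicted forces (downhill in energy).
        pos = pos + step_size * forces
    return pos
```

With a toy harmonic force model (forces pulling every atom toward the origin), the loop converges to positions near zero; in practice the force model would be a GNN such as ForceNet and the update rule a quasi-Newton optimizer.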
However, if successful, accurate and fast ML-based models may lead to significant practical impact by accelerating simulations from O(hours-days) to O(ms-s), which in turn accelerates applications such as catalyst discovery. A key component of any ML model used for atomic simulations is predicting the non-linear and complex forces applied on atoms by their neighbors. These forces are highly sensitive to the element types and small changes in the placement of neighboring atoms. One approach that is well suited to modeling these local interactions is Graph Neural Networks (GNNs) (Gilmer et al., 2017), in which nodes represent atoms and messages passed along edges represent atom interactions. In this paper, we present ForceNet, which demonstrates a significant improvement in quantum simulation performance and provides strong evidence that GNN-based simulation is effective in practically relevant and realistic domains. We build on the recent framework of Graph Network-based Simulators (GNS) (Sanchez-Gonzalez et al., 2020; Bapst et al., 2020), where node movements (i.e., atomic forces in our application) are predicted from the node embeddings of a GNN. We address the key challenge of accurately capturing the local 3D atomic structure using three approaches:

Conditional filter convolution: To model the complex atom interactions in 3D space, we extend the continuous filter convolution (Schütt et al., 2017) to make it much more expressive; we condition the convolution filter not only on the Euclidean distance between atoms, but also on the embeddings of source and target nodes as well as the (x, y, z) directional differences.
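The conditional filter convolution described above can be sketched in PyTorch as follows. Unlike SchNet's continuous-filter convolution, whose filter depends only on the interatomic distance, this filter is also conditioned on the source/target node embeddings and the (x, y, z) displacement. Layer names, dimensions, and the residual update are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class ConditionalFilterConv(nn.Module):
    """Sketch of a filter convolution conditioned on distance, displacement,
    and both endpoint embeddings (names and sizes are hypothetical)."""

    def __init__(self, hidden_dim=64):
        super().__init__()
        # Filter input: h_src ++ h_dst ++ distance (1) ++ displacement (3).
        self.filter_net = nn.Sequential(
            nn.Linear(2 * hidden_dim + 4, hidden_dim),
            nn.SiLU(),                     # Swish activation
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.update = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, h, pos, edge_index):
        src, dst = edge_index                      # each of shape (E,)
        disp = pos[src] - pos[dst]                 # (E, 3) directional difference
        dist = disp.norm(dim=-1, keepdim=True)     # (E, 1) Euclidean distance
        filt = self.filter_net(torch.cat([h[src], h[dst], dist, disp], dim=-1))
        msg = filt * self.update(h[src])           # filter modulates source features
        # Sum incoming messages at each target node.
        out = torch.zeros_like(h).index_add_(0, dst, msg)
        return h + out                             # residual node update
```

Conditioning the filter on the node embeddings and the full displacement vector, rather than on distance alone, is what lets the message function distinguish directionally different neighbor configurations at the same distance.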

Carefully-chosen basis and activation functions:

We demonstrate that carefully selecting the non-linear activation and basis functions in the message-passing architecture is critical for effectively capturing complex non-linear dynamics. Through comprehensive empirical studies, we identify a particularly effective design choice based on spherical harmonics and the Swish activation (Ramachandran et al., 2017).

Scaling: Scaling the model size is necessary for capturing the complexity of the forces. We scale the model in both depth and width, which significantly improves its performance.

We apply our model to the new large-scale quantum chemistry dataset OC20 (Anonymous, 2020), which contains 200+ million samples from atomic relaxations relevant to the discovery of new catalysts for renewable energy storage and other energy applications. The result of our work is a simple and scalable ForceNet model that achieves state-of-the-art performance in predicting quantum force fields, reducing the MAE of force predictions by 30% compared to existing GNN models. Similar performance gains are obtained on out-of-distribution tasks, for which similar atomic structures are not seen during training. Finally, we use ForceNet as a surrogate for DFT to perform quantum chemistry simulation; specifically, we calculate structure relaxations of complex systems, a task of high practical value. We show that ForceNet estimates relaxed structures 4× more accurately than the previous state of the art, while being multiple orders of magnitude faster than DFT (Figure 1).
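To illustrate what a basis expansion and the Swish activation look like in practice, the snippet below expands scalar interatomic distances into a smooth Gaussian radial basis. Note this is a common, simpler stand-in for the spherical-harmonics basis identified in the paper, shown only to make the concept concrete; the number of basis functions and the cutoff are arbitrary.

```python
import torch

def gaussian_rbf(dist, n_basis=16, cutoff=6.0):
    """Expand scalar distances (any shape) into a Gaussian radial basis,
    adding a trailing dimension of size `n_basis`. Parameters are illustrative."""
    centers = torch.linspace(0.0, cutoff, n_basis)          # (n_basis,)
    width = cutoff / n_basis
    # Broadcasting: (..., 1) against (n_basis,) gives (..., n_basis).
    return torch.exp(-((dist.unsqueeze(-1) - centers) ** 2) / (2 * width ** 2))

def swish(x):
    """Swish activation (Ramachandran et al., 2017): x * sigmoid(x)."""
    return x * torch.sigmoid(x)
```

The smooth basis expansion turns a single raw distance into a feature vector that downstream linear layers can combine non-linearly, and Swish (available in PyTorch as `torch.nn.SiLU`) avoids the dead-gradient regions of ReLU, which matters when forces must vary smoothly with atomic positions.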

2. RELATED WORK

Message-passing GNNs. ForceNet is based on message-passing GNNs (Gilmer et al., 2017), which iteratively update node embeddings based on messages passed from neighboring nodes. In its most general form, the message function depends on the two node embeddings as well as edge



Figure 1: (left) Illustrative example of atomic structure energies during relaxations performed using DFT and ML-based (SchNet, GNS, ForceNet) approaches. 3D renderings of the structures along the relaxation trajectory are shown, where small atoms are adsorbates and larger atoms are catalysts. The ForceNet (ours) model reaches a low energy similar to DFT. (right) Same plot, but with the x-axis showing time rather than iteration steps. All ML models are more than 10³× faster (GPU acceleration of DFT is only 2-5× (Maintz & Wetzstein, 2018)), while only ForceNet finds an energy similar to that of DFT in this example. Note that different optimizers were used for DFT (conjugate gradient) and ForceNet (L-BFGS), which may have led them to reach the same energy by different trajectories.

