A SELF-ATTENTION ANSATZ FOR AB-INITIO QUANTUM CHEMISTRY

Abstract

We present a novel neural network architecture using self-attention, the Wavefunction Transformer (Psiformer), which can be used as an approximation (or Ansatz) for solving the many-electron Schrödinger equation, the fundamental equation for quantum chemistry and material science. This equation can be solved from first principles, requiring no external training data. In recent years, deep neural networks like the FermiNet and PauliNet have been used to significantly improve the accuracy of these first-principles calculations, but they lack an attention-like mechanism for gating interactions between electrons. Here we show that the Psiformer can be used as a drop-in replacement for these other neural networks, often dramatically improving the accuracy of the calculations. On larger molecules especially, the ground state energy can be improved by dozens of kcal/mol, a qualitative leap over previous methods. This demonstrates that self-attention networks can learn complex quantum mechanical correlations between electrons, and are a promising route to reaching unprecedented accuracy in chemical calculations on larger systems.

1. INTRODUCTION

The laws of quantum mechanics describe the nature of matter at the microscopic level, and underpin the study of chemistry, condensed matter physics and material science. Although these laws have been known for nearly a century (Schrödinger, 1926), the fundamental equations are too difficult to solve analytically for all but the simplest systems. In recent years, tools from deep learning have been used to great effect to improve the quality of computational quantum physics (Carleo & Troyer, 2017). For the study of chemistry in particular, it is the quantum behavior of electrons that matters, which imposes certain constraints on the possible solutions. The use of deep neural networks for successfully computing the quantum behavior of molecules was introduced almost simultaneously by several groups (Pfau et al., 2020; Hermann et al., 2020; Choo et al., 2020), and has since led to a variety of extensions and improvements (Hermann et al., 2022). However, follow-up work has mostly focused on applications and iterative improvements to the neural network architectures introduced in the first set of papers.

At the same time, neural networks using self-attention layers, like the Transformer (Vaswani et al., 2017), have had a profound impact on much of machine learning. They have led to breakthroughs in natural language processing (Devlin et al., 2018), language modeling (Brown et al., 2020), image recognition (Dosovitskiy et al., 2020), and protein folding (Jumper et al., 2021). The basic self-attention layer is also permutation equivariant, a useful property for applications to chemistry, where physical quantities should be invariant to the ordering of atoms and electrons (Fuchs et al., 2020). Despite the manifest successes in other fields, no one has yet investigated whether self-attention neural networks are appropriate for approximating solutions in computational quantum mechanics.
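The permutation-equivariance property noted above can be checked directly: permuting the rows of the input to a self-attention layer permutes the rows of its output in exactly the same way, because attention weights depend only on pairwise inner products, not on position. A minimal single-head sketch in NumPy (toy random weights, no positional encoding, purely illustrative):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over rows of X (no positional encoding)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    # Numerically stable row-wise softmax.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 5, 8  # e.g. 5 "electrons", 8 features each
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

perm = rng.permutation(n)
out = self_attention(X, Wq, Wk, Wv)
out_perm = self_attention(X[perm], Wq, Wk, Wv)

# Permuting the inputs permutes the outputs identically (equivariance).
assert np.allclose(out[perm], out_perm)
```

Equivariance of the features is exactly what is needed to build a wavefunction that is antisymmetric (or invariant, for observables) under electron exchange, since a fixed symmetrizing or antisymmetrizing operation can be applied on top of the equivariant layer outputs.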
In this work, we introduce a new self-attention neural network, the Wavefunction Transformer (Psiformer), which can be used as an approximate numerical solution (or Ansatz) for the fundamental equations of the quantum mechanics of electrons. We test the Psiformer on a wide variety of benchmark systems for quantum chemistry and find that it is significantly more accurate than existing neural network Ansatzes of roughly the same size. The increase in accuracy is more pronounced the larger the system is (as much as 75 times the normal standard for "chemical accuracy"), suggesting that the Psiformer is a particularly attractive approach for scaling neural network Ansatzes to larger systems.
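For concreteness, the fundamental equation referred to here is the time-independent Schrödinger equation for the electronic wavefunction. The following is the standard textbook form in the Born-Oppenheimer (fixed-nuclei) approximation and atomic units, stated for reference rather than quoted from any particular derivation in this work:

```latex
% Time-independent Schrodinger equation for N electrons:
%   \hat{H}\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N) = E\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N)
% with the Born--Oppenheimer Hamiltonian (atomic units; i,j index
% electrons at positions r_i, and I,J index fixed nuclei with
% charges Z_I at positions R_I):
\hat{H} = -\frac{1}{2}\sum_{i}\nabla_i^{2}
          + \sum_{i>j}\frac{1}{\lvert\mathbf{r}_i-\mathbf{r}_j\rvert}
          - \sum_{i,I}\frac{Z_I}{\lvert\mathbf{r}_i-\mathbf{R}_I\rvert}
          + \sum_{I>J}\frac{Z_I Z_J}{\lvert\mathbf{R}_I-\mathbf{R}_J\rvert}
```

The constraint on possible solutions mentioned earlier is Fermi-Dirac antisymmetry: the wavefunction must change sign whenever the coordinates of any two same-spin electrons are exchanged, which is what an Ansatz such as the Psiformer must build in by construction.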

