TOWARDS ANTISYMMETRIC NEURAL ANSATZ SEPARATION

Abstract

We study separations between two fundamental models (or Ansätze) of antisymmetric functions, that is, functions f of the form f(x_σ(1), . . . , x_σ(N)) = sign(σ) f(x_1, . . . , x_N), where σ is any permutation. These arise in the context of quantum chemistry, and are the basic modeling tool for wavefunctions of Fermionic systems. Specifically, we consider two popular antisymmetric Ansätze: the Slater representation, which leverages the alternating structure of determinants, and the Jastrow ansatz, which multiplies Slater determinants by an arbitrary symmetric function. We construct an antisymmetric function that can be efficiently expressed in Jastrow form, yet provably cannot be approximated by Slater determinants unless exponentially many (in N^2) terms are used. This is the first explicit quantitative separation between these two Ansätze.

1. INTRODUCTION

Neural networks have proven very successful at parametrizing non-linear approximation spaces in high dimensions, thanks to the ability of neural architectures to leverage the physical structure and symmetries of the problem at hand while preserving universal approximation. The successes cover many areas of engineering and computational science, from computer vision (Krizhevsky et al., 2017) to protein folding (Jumper et al., 2021). In each case, modifying the architecture (e.g. by adding layers, adjusting the activation function, etc.) has intricate effects on the approximation, statistical, and optimization errors. An important piece of this puzzle is to first understand the approximation abilities of a given neural architecture against a class of target functions with an assumed symmetry (LeCun et al., 1995; Cohen et al., 2018). For instance, symmetric functions that are permutation-invariant, i.e. f(x_σ(1), . . . , x_σ(N)) = f(x_1, . . . , x_N) for all x_1, . . . , x_N and all permutations σ : {1, . . . , N} → {1, . . . , N}, can be universally approximated by several neural architectures, e.g. DeepSets (Zaheer et al., 2017) or Set Transformers (Lee et al., 2019); their approximation properties (Zweig & Bruna, 2022) thus offer a first glimpse of their efficiency across different learning tasks.

In this work, we focus on quantum chemistry applications, namely characterizing ground states of many-body quantum systems. These are governed by the fundamental Schrödinger equation, an eigenvalue problem of the form HΨ = λΨ, where H is the Hamiltonian associated to a particle system defined over a product space Ω^⊗N, and Ψ is the wavefunction, a complex-valued function Ψ : Ω^⊗N → C whose squared modulus |Ψ(x_1, . . . , x_N)|^2 describes the probability of encountering the system in the state (x_1, . . . , x_N) ∈ Ω^⊗N. A particularly important task is to compute the ground state, the eigenfunction associated with the smallest eigenvalue of H.
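The permutation invariance above can be checked concretely. Below is a minimal sketch (pure Python; the per-element embedding `phi` and readout `rho` are illustrative choices, not from any cited architecture) of a DeepSets-style function f(x_1, . . . , x_N) = ρ(Σ_i φ(x_i)): the sum pools elements in an order-independent way, so every permutation of the inputs yields the same output.

```python
import itertools
import math

def deepsets(xs):
    # DeepSets form: rho( sum_i phi(x_i) ).
    # phi and rho are hypothetical illustrative choices.
    phi = lambda x: (x, x * x)                      # per-element embedding in R^2
    s = tuple(sum(c) for c in zip(*map(phi, xs)))   # symmetric (sum) pooling
    rho = lambda s: math.tanh(s[0]) + s[1]          # readout on the pooled vector
    return rho(s)

xs = [0.3, -1.2, 0.7]
# evaluate on all 3! orderings; up to float round-off, all values coincide
vals = {round(deepsets(p), 12) for p in itertools.permutations(xs)}
```

Any function of this sum-pooled form is permutation-invariant by construction; the universality results cited above show the converse, that every symmetric function can be approximated this way.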
On Fermionic systems, the wavefunction satisfies an additional property, derived from Pauli's exclusion principle: the wavefunction is antisymmetric, meaning that

Ψ(x_σ(1), . . . , x_σ(N)) = sign(σ) Ψ(x_1, . . . , x_N) .    (1)
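Property (1) is satisfied by construction by a Slater determinant Ψ(x_1, . . . , x_N) = det[φ_i(x_j)]_{i,j}: exchanging two particles exchanges two columns of the matrix, which flips the sign of the determinant. A minimal numerical check (pure Python; the single-particle orbitals sin, cos, exp are hypothetical choices for illustration):

```python
import itertools
import math

def det(M):
    # Leibniz formula: det(M) = sum_sigma sign(sigma) * prod_i M[i][sigma(i)]
    n = len(M)
    total = 0.0
    for sigma in itertools.permutations(range(n)):
        # sign(sigma) = (-1)^(number of inversions)
        inv = sum(sigma[i] > sigma[j] for i in range(n) for j in range(i + 1, n))
        term = float((-1) ** inv)
        for i in range(n):
            term *= M[i][sigma[i]]
        total += term
    return total

def slater(orbitals, xs):
    # Slater ansatz: Psi(x_1, ..., x_N) = det[ phi_i(x_j) ]
    return det([[phi(x) for x in xs] for phi in orbitals])

orbitals = [math.sin, math.cos, math.exp]   # hypothetical orbitals
xs = [0.1, 0.5, 0.9]
swapped = [xs[1], xs[0], xs[2]]             # transpose particles 1 and 2
a, b = slater(orbitals, xs), slater(orbitals, swapped)
# antisymmetry: b == -a, since an odd permutation flips the sign
```

The same cancellation is what forces a Slater determinant to vanish when two particles coincide, the functional expression of Pauli exclusion.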

