ON REPRESENTING (ANTI)SYMMETRIC FUNCTIONS

Abstract

Permutation-invariant, -equivariant, and -covariant functions and anti-symmetric functions are important in quantum physics, computer vision, and other disciplines. (Anti)symmetric neural networks have recently been developed and applied with great success. A few theoretical approximation results have been proven, but many questions are still open, especially for particles in more than one dimension and for the anti-symmetric case, which this work focusses on. More concretely, we derive natural polynomial approximations in the symmetric case, and approximations based on a single generalized Slater determinant in the anti-symmetric case. Unlike some previous super-exponential and discontinuous approximations, these seem a more promising basis for future tighter bounds. In the supplementary material we also provide a complete and explicit universality proof of the Equivariant MultiLayer Perceptron, which implies universality of symmetric MLPs and the FermiNet.

1. INTRODUCTION

Neural Networks (NN), or more precisely, Multi-Layer Perceptrons (MLP), are universal function approximators [Pin99] in the sense that every (say) continuous function can be approximated arbitrarily well by a sufficiently large NN. The true power of NNs, though, stems from the fact that they apparently have a bias towards functions we care about and that they can be trained by local gradient descent or variations thereof. For many problems we have additional information about the function, e.g. symmetries under which the function of interest is invariant or covariant. Here we consider functions that are covariant① under permutations.② Of particular interest are functions that are invariant③, equivariant④, or antisymmetric⑤ under permutations.

Definition 1 ((Anti)symmetric and equivariant functions) A function $\varphi : X^n \to \mathbb{R}$ in $n \in \mathbb{N}$ variables is called symmetric iff $\varphi(x_1, \ldots, x_n) = \varphi(x_{\pi(1)}, \ldots, x_{\pi(n)})$ for all $x_1, \ldots, x_n \in X$ and all permutations $\pi \in S_n$, where $S_n := \{\pi : \{1:n\} \to \{1:n\} \wedge \pi \text{ is a bijection}\}$ is called the symmetric group and $\{1:n\}$ is short for $\{1, \ldots, n\}$. Similarly, a function $\psi : X^n \to \mathbb{R}$ is called anti-symmetric (AS) iff $\psi(x_1, \ldots, x_n) = \sigma(\pi)\,\psi(x_{\pi(1)}, \ldots, x_{\pi(n)})$, where $\sigma(\pi) = \pm 1$ is the parity or sign of permutation $\pi$. A function $\phi : X^n \to X^n$ is called equivariant under permutations iff $\phi(S_\pi(x)) = S_\pi(\phi(x))$, where $x \equiv (x_1, \ldots, x_n)$ and $S_\pi(x_1, \ldots, x_n) := (x_{\pi(1)}, \ldots, x_{\pi(n)})$.

Of course (anti)symmetric functions are also just functions, hence an NN of sufficient capacity can also represent (anti)symmetric functions, and if trained on an (anti)symmetric target could converge to an (anti)symmetric function. But NNs that can represent only (anti)symmetric functions are desirable for multiple reasons.

Equivariant MLPs (EMLP) are the basis for constructing symmetric functions by simply summing the output of the last layer, and for anti-symmetric (AS) functions by

① In full generality, a function $f : X \to Y$ is covariant under group operations $g \in G$ if $f(R^X_g(x)) = R^Y_g(f(x))$, where $R^X_g : X \to X$ and $R^Y_g : Y \to Y$ are representations of group (element) $g \in G$.
② The symmetric group $G = S_n$ is the group of all permutations = bijections $\pi : \{1, \ldots, n\} \to \{1, \ldots, n\}$.
③ $R^Y_g$ = Identity. Permutation-invariant functions are also called 'totally symmetric functions' or simply 'symmetric functions'.
④ General $Y$ and $X$, often $Y = X$ and $R^Y_g = R^X_g$; also called covariant.
⑤ $R^Y_g = \pm 1$ for even/odd permutations.
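
The three properties in Definition 1 can be spot-checked numerically for small n by enumerating all of S_n. The following is a minimal sketch in plain Python, assuming X = R (one-dimensional particles) and 0-based indices; every name in it (`parity`, `permute`, `is_symmetric`, `is_antisymmetric`, `is_equivariant`, and the example functions `phi`, `psi`, `g`) is hypothetical and chosen only for illustration. A check at a single point can refute a property but does not prove it.

```python
import itertools


def parity(perm):
    """Sign sigma(pi) in {+1, -1}, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def permute(x, perm):
    """S_pi(x) = (x_pi(1), ..., x_pi(n)), here with 0-based indices."""
    return tuple(x[i] for i in perm)


def is_symmetric(f, x, tol=1e-9):
    """Check f(x) == f(S_pi(x)) for every pi in S_n at the point x."""
    perms = itertools.permutations(range(len(x)))
    return all(abs(f(x) - f(permute(x, p))) <= tol for p in perms)


def is_antisymmetric(f, x, tol=1e-9):
    """Check f(x) == sigma(pi) * f(S_pi(x)) for every pi in S_n at x."""
    perms = itertools.permutations(range(len(x)))
    return all(abs(f(x) - parity(p) * f(permute(x, p))) <= tol for p in perms)


def is_equivariant(g, x, tol=1e-9):
    """Check g(S_pi(x)) == S_pi(g(x)) for every pi in S_n at x."""
    for p in itertools.permutations(range(len(x))):
        lhs = g(permute(x, p))
        rhs = permute(g(x), p)
        if any(abs(a - b) > tol for a, b in zip(lhs, rhs)):
            return False
    return True


# Illustrative examples (assumptions, not taken from the paper):
def phi(x):
    """Symmetric: sum of squares."""
    return sum(v * v for v in x)


def psi(x):
    """Anti-symmetric: Vandermonde product over pairs i < j."""
    out = 1.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            out *= x[j] - x[i]
    return out


def g(x):
    """Equivariant: each coordinate shifted by the symmetric mean."""
    mean = sum(x) / len(x)
    return tuple(v + mean for v in x)


x0 = (0.3, -1.2, 2.5, 0.7)
print(is_symmetric(phi, x0), is_antisymmetric(psi, x0), is_equivariant(g, x0))
# Summing the outputs of an equivariant map gives a symmetric function.
print(is_symmetric(lambda y: sum(g(y)), x0))
```

The last line mirrors the construction mentioned above, where a symmetric function is obtained by summing the output of an equivariant map. Enumerating S_n costs n! evaluations, so this kind of check is only practical for small n.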

