EXPRESSIVE POWER OF INVARIANT AND EQUIVARIANT GRAPH NEURAL NETWORKS

Abstract

Various classes of Graph Neural Networks (GNN) have been proposed and shown to be successful in a wide range of applications with graph-structured data. In this paper, we propose a theoretical framework able to compare the expressive power of these GNN architectures. The current universality theorems only apply to intractable classes of GNNs. Here, we prove the first approximation guarantees for practical GNNs, paving the way for a better understanding of their generalization. Our theoretical results are proved for invariant GNNs computing a graph embedding (a permutation of the nodes of the input graph does not affect the output) and for equivariant GNNs computing an embedding of the nodes (a permutation of the input permutes the output). We show that Folklore Graph Neural Networks (FGNN), which are tensor-based GNNs augmented with matrix multiplication, are the most expressive architectures proposed so far for a given tensor order. We illustrate our results on the Quadratic Assignment Problem (an NP-hard combinatorial problem) by showing that FGNNs are able to learn how to solve the problem, leading to much better average performance than existing algorithms (based on spectral methods, SDP relaxations, or other GNN architectures). On the practical side, we also implement masked tensors to handle batches of graphs of varying sizes.
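The masked-tensor idea mentioned above can be sketched as follows. This is a hypothetical NumPy illustration, not the paper's implementation: adjacency matrices are padded to the size of the largest graph in the batch, and a boolean node mask lets reductions ignore the padding. The names `batch_graphs` and `masked_mean_degree` are illustrative only.

```python
import numpy as np

def batch_graphs(adjs):
    """Pad adjacency matrices to a common size and build node masks.

    Hypothetical sketch: graphs of different sizes share one dense
    batch tensor, and mask[i, u] records whether node u of graph i
    is a real node (True) or padding (False).
    """
    sizes = [a.shape[0] for a in adjs]
    n_max = max(sizes)
    batch = np.zeros((len(adjs), n_max, n_max))
    mask = np.zeros((len(adjs), n_max), dtype=bool)
    for i, (a, n) in enumerate(zip(adjs, sizes)):
        batch[i, :n, :n] = a
        mask[i, :n] = True
    return batch, mask

def masked_mean_degree(batch, mask):
    """Average node degree per graph, ignoring padded nodes."""
    deg = batch.sum(axis=-1)                 # (B, n_max) node degrees
    return (deg * mask).sum(-1) / mask.sum(-1)
```

Any permutation-invariant reduction (sum, mean, max) over nodes can be restricted to real nodes in the same way, which is what makes batches of varying graph sizes compatible with dense tensor operations on GPUs.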

1. INTRODUCTION

Graph Neural Networks (GNN) are designed to deal with graph-structured data. Since a graph is not changed by a permutation of its nodes, GNNs should be either invariant, if they return a result that must not depend on the representation of the input (typically when building a graph embedding), or equivariant, if the output must be permuted when the input is permuted (typically when building an embedding of the nodes). More fundamentally, incorporating symmetries in machine learning is a central problem, as it reduces the number of degrees of freedom to be learned.

Deep learning on graphs. This paper focuses on learning deep representations of graphs with network architectures, namely GNNs, designed to be invariant or equivariant to permutations. From a practical perspective, various message passing GNNs have been proposed; see Dwivedi et al. (2020) for a recent survey and benchmark on learning tasks. In this paper, we study three architectures: Message passing GNN (MGNN), which is probably the most popular architecture used in practice; order-k Linear GNN (k-LGNN), proposed in Maron et al. (2018); and order-k Folklore GNN (k-FGNN), first introduced by Maron et al. (2019a). MGNN layers are local and thus highly parallelizable on GPUs, which makes them scalable to large sparse graphs. k-LGNN and k-FGNN deal with representations of graphs as tensors of order k, which makes them of little practical use for k ≥ 3. In order to compare these architectures, the separating power of these networks has been related to a hierarchy of graph invariants developed for the graph isomorphism problem. Namely, for k ≥ 2, k-WL(G) are invariants based on the Weisfeiler-Lehman tests (described in Section 4.1). For each k ≥ 2, (k + 1)-WL has strictly more separating power than k-WL (in the sense that there is a pair of non-isomorphic graphs distinguishable by (k + 1)-WL but not by k-WL). GIN (which are invariant MGNN), introduced in Xu et al. (2018), are shown to be as powerful as 2-WL. In Maron et al. (2019a), Geerts (2020b) and Geerts (2020a), k-LGNN are shown to be as powerful as k-WL, and 2-FGNN is shown to be as powerful as 3-WL. In this paper, we extend this last result about k-FGNN to general values of k. So in terms of separating power, when restricted to tensors of order k, k-FGNN is the

