N-WL: A NEW HIERARCHY OF EXPRESSIVITY FOR GRAPH NEURAL NETWORKS

Abstract

The expressive power of Graph Neural Networks (GNNs) is fundamental for understanding their capabilities and limitations, i.e., what graph properties can or cannot be learnt by a GNN. Since the expressive power of standard GNNs has been shown to be upper-bounded by the Weisfeiler-Lehman (1-WL) algorithm, recent attempts have concentrated on developing more expressive GNNs in terms of the k-WL hierarchy, a well-established framework for graph isomorphism tests. In this work we show that, contrary to the widely accepted view, the k-WL hierarchy is not well-suited for measuring the expressive power of GNNs. This is due to limitations inherent to high-dimensional WL algorithms, such as the lack of a natural interpretation and high computational costs, which make it difficult to draw any firm conclusions about the expressive power of GNNs beyond 1-WL. Thus, we propose a novel hierarchy of graph isomorphism tests, namely Neighbourhood WL (N-WL), and establish a new theorem on the equivalence of expressivity between induced connected subgraphs and induced subgraphs within this hierarchy. Further, we design a GNN model based on N-WL, Graph Neighbourhood Neural Network (G3N), and empirically verify its expressive power on synthetic and real-world benchmarks.

1. INTRODUCTION

Graph-theoretic algorithms are a powerful source of inspiration for Graph Neural Networks (GNNs). The best-known result is that the expressive power of standard GNNs is upper-bounded by the Weisfeiler-Lehman (1-WL) algorithm (Weisfeiler & Leman, 1968; Xu et al., 2019; Morris et al., 2019). In pursuit of more expressive GNNs, various attempts have been made to leverage existing results in graph theory such as high-dimensional WL algorithms (Azizian & Lelarge, 2021; Maron et al., 2019a; Morris et al., 2020b), substructure counting (Bouritsas et al., 2022; Barceló et al., 2021), and individualisation (Dupty et al., 2022). The expressivity of these GNNs is measured in terms of the k-WL hierarchy, a well-established framework for graph isomorphism testing (Grohe, 2017). However, the k-WL hierarchy exhibits several theoretical and practical limitations as a measure of expressivity for GNNs. Theoretically, it is highly non-trivial to tell if and when k-WL algorithms can distinguish two particular graphs (Kiefer, 2020). Deciding which graph properties are important for distinguishing graphs is even harder, if not impossible. A complete description of all subgraph patterns whose counts and occurrence are k-WL invariant is only available for k = 1 (Arvind et al., 2020). Even at high computational cost, the power of k-WL algorithms in recognising graph properties still seems limited, and some negative results are known, e.g., 3-WL cannot identify any k-cliques with k > 3 (Fürer, 2017). These issues hamper the practical applicability of high-dimensional WL algorithms for solving real-world tasks on graph-structured data (Chen et al., 2020; Garg et al., 2020). This raises a question: is the k-WL hierarchy a good yardstick for the expressivity of GNNs? In the search for an answer to this question, we observe several disparities between (standard) GNNs and the k-WL hierarchy.
First, GNNs encode structural information into nodes as an efficient and practical way of learning on graphs. This, however, goes against the spirit of the k-WL hierarchy, which increases expressive power by moving up to higher-order objects, i.e., k-tuples, rather than just nodes (Cai et al., 1992; Grohe, 2017). Second, GNNs are built upon a natural notion of local

