IMPROVING GRAPH NEURAL NETWORK EXPRESSIVITY VIA SUBGRAPH ISOMORPHISM COUNTING

Abstract

While Graph Neural Networks (GNNs) have achieved remarkable results in a variety of applications, recent studies have exposed important shortcomings in their ability to capture the structure of the underlying graph. It has been shown that the expressive power of standard GNNs is bounded by the Weisfeiler-Lehman (WL) graph isomorphism test, from which they inherit proven limitations such as the inability to detect and count graph substructures. On the other hand, there is significant empirical evidence, e.g. in network science and bioinformatics, that substructures are often informative for downstream tasks, suggesting that it is desirable to design GNNs capable of leveraging this important source of information. To this end, we propose a novel topologically-aware message passing scheme based on substructure encoding. We show that our architecture allows incorporating domain-specific inductive biases and that it is strictly more expressive than the WL test. Importantly, in contrast to recent works on the expressivity of GNNs, we do not attempt to adhere to the WL hierarchy; this allows us to retain multiple attractive properties of standard GNNs, such as locality and linear network complexity, while being able to disambiguate even hard instances of graph isomorphism. We extensively evaluate our method on graph classification and regression tasks and show state-of-the-art results on multiple datasets, including molecular graphs and social networks.

1. INTRODUCTION

The field of graph representation learning has undergone rapid growth in the past few years. In particular, Graph Neural Networks (GNNs), a family of neural architectures designed for irregularly structured data, have been successfully applied to problems ranging from social networks and recommender systems (Ying et al., 2018a) to bioinformatics (Fout et al., 2017; Gainza et al., 2020), chemistry (Duvenaud et al., 2015; Gilmer et al., 2017; Sanchez-Lengeling et al., 2019) and physics (Kipf et al., 2018; Battaglia et al., 2016), to name a few. Most GNN architectures are based on message passing (Gilmer et al., 2017), where at each layer the nodes are updated with information aggregated from their neighbours. A crucial difference from traditional neural networks operating on grid-structured data is the absence of a canonical ordering of the nodes in a graph. To address this, the aggregation function is constructed to be invariant to neighbourhood permutations and, as a consequence, to graph isomorphism. This kind of symmetry is not always desirable, and thus different inductive biases that disambiguate the neighbours have been proposed. For instance, in geometric graphs, such as 3D molecular graphs and meshes, directional biases are usually employed in order to model the positional information of the nodes (Masci et al., 2015; Monti et al., 2017; Bouritsas et al., 2019; Klicpera et al., 2020; de Haan et al., 2020b); for proteins, ordering information is used to disambiguate amino-acids at different positions in the sequence (Ingraham et al., 2019); in multi-relational knowledge graphs, a different aggregation is performed for each relation type (Schlichtkrull et al., 2018). The structure of the graph itself does not usually take part explicitly in the aggregation function. In fact, most models rely on multiple message passing steps as a means for each node to discover the global structure of the graph.
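To make the above concrete, the following is a minimal sketch (not the paper's proposed scheme) of one message-passing step with a permutation-invariant sum aggregator, in plain Python; the graph representation, feature values, and weights `w_self`/`w_neigh` are illustrative assumptions.

```python
# Illustrative sketch of standard message passing: each node is updated
# by combining its own feature with the SUM of its neighbours' features.
# Summation does not depend on the order in which neighbours are visited,
# so the layer is invariant to neighbourhood permutations.

def message_passing_step(adj, feats, w_self=0.5, w_neigh=0.5):
    """adj: dict node -> list of neighbours; feats: dict node -> float.
    Returns the updated node features after one round of aggregation."""
    new_feats = {}
    for v, neighbours in adj.items():
        agg = sum(feats[u] for u in neighbours)  # order-independent aggregation
        new_feats[v] = w_self * feats[v] + w_neigh * agg
    return new_feats

# A 4-cycle 0-1-2-3-0 with scalar node features.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
feats = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
out = message_passing_step(adj, feats)

# Shuffling a neighbour list leaves the output unchanged (permutation invariance):
adj_shuffled = {0: [3, 1], 1: [2, 0], 2: [1, 3], 3: [0, 2]}
assert message_passing_step(adj_shuffled, feats) == out
```

Note that repeated application of such a step is exactly the mechanism by which standard GNNs propagate structural information, which is why their discriminative power is tied to the WL test discussed next.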
However, since message-passing GNNs are at most as powerful as the Weisfeiler-Lehman (WL) test (Xu et al., 2019; Morris et al., 2019), they are limited in their ability to adequately exploit the graph structure, e.g. by counting substructures (Arvind et al., 2019;

