BOOSTING THE CYCLE COUNTING POWER OF GRAPH NEURAL NETWORKS WITH I²-GNNS

Abstract

Message Passing Neural Networks (MPNNs) are a widely used class of Graph Neural Networks (GNNs). The limited representational power of MPNNs inspires the study of provably powerful GNN architectures. However, knowing that one model is more powerful than another gives little insight into what functions they can or cannot express. It is still unclear whether these models are able to approximate specific functions such as counting certain graph substructures, which is essential for applications in biology, chemistry and social network analysis. Motivated by this, we propose to study the counting power of Subgraph MPNNs, a recent and popular class of powerful GNN models that extract a rooted subgraph for each node, assign the root node a unique identifier and encode the root node's representation within its rooted subgraph. Specifically, we prove that Subgraph MPNNs fail to count more-than-4-cycles at node level, implying that node representations cannot correctly encode surrounding substructures such as ring systems with more than four atoms. To overcome this limitation, we propose I²-GNNs, which extend Subgraph MPNNs by assigning different identifiers to the root node and its neighbors in each subgraph. I²-GNNs' discriminative power is shown to be strictly stronger than that of Subgraph MPNNs and partially stronger than the 3-WL test. More importantly, I²-GNNs are proven capable of counting all 3-, 4-, 5- and 6-cycles, covering common substructures like benzene rings in organic chemistry, while still keeping linear complexity. To the best of our knowledge, this is the first linear-time GNN model that can count 6-cycles with theoretical guarantees. We validate its counting power in cycle counting tasks and demonstrate its competitive performance on molecular prediction benchmarks.

1. INTRODUCTION

Relational and structured data are usually represented by graphs. Representation learning over graphs with Graph Neural Networks (GNNs) has achieved remarkable results in drug discovery, computational chemistry, combinatorial optimization and social network analysis (Bronstein et al., 2017; Duvenaud et al., 2015; Khalil et al., 2017; Kipf & Welling, 2016; Stokes et al., 2020; You et al., 2018; Zhang & Chen, 2018). Among various GNNs, Message Passing Neural Networks (MPNNs) are one of the most commonly used classes (Zhou et al., 2020; Veličković et al., 2017; Scarselli et al., 2008). However, the representational power of MPNNs is known to be bounded by the Weisfeiler-Lehman (WL) test (Xu et al., 2018; Morris et al., 2019), a classical algorithm for graph isomorphism testing. MPNNs cannot detect even simple substructures such as cycles (Chen et al., 2020). This limitation has drawn increasing attention to studying the representational power of different GNNs and to designing more powerful GNN models. The representational power of a GNN model can be evaluated from two perspectives. One is the ability to distinguish pairs of non-isomorphic graphs, i.e., discriminative power. Chen et al. (2019) show the equivalence between distinguishing all pairs of non-isomorphic graphs and approximating all permutation-invariant functions (universal approximation). Though discriminative power provides a way to compare different models, for most GNN models without universal approximation
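To make the WL bound concrete, the following minimal sketch (not from the paper; graph names and the `wl_colors` helper are our own illustration) runs 1-WL color refinement, which upper-bounds MPNN expressiveness, on the classic counterexample: a single 6-cycle versus two disjoint triangles. Both graphs are 2-regular, so refinement produces identical color histograms even though one graph contains 3-cycles and the other does not.

```python
from collections import Counter

def wl_colors(adj, rounds=3):
    """1-WL color refinement: repeatedly hash each node's color
    together with the sorted multiset of its neighbors' colors.
    Returns the final histogram of node colors."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        colors = {v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
                  for v in adj}
    return Counter(colors.values())

# C6: one 6-cycle.  2*C3: two disjoint triangles.  Non-isomorphic,
# but both 2-regular, so 1-WL (hence any MPNN) cannot separate them.
c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_c3 = {0: [1, 2], 1: [0, 2], 2: [0, 1],
          3: [4, 5], 4: [3, 5], 5: [3, 4]}

print(wl_colors(c6) == wl_colors(two_c3))  # → True
```

Since every node in both graphs has degree 2, all nodes keep the same color at every round, and the two histograms coincide; a model that could count 3-cycles at node level would immediately tell the graphs apart.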

