BREAKING THE EXPRESSIVE BOTTLENECKS OF GRAPH NEURAL NETWORKS
Anonymous authors
Paper under double-blind review

Abstract

Recently, the Weisfeiler-Lehman (WL) graph isomorphism test has been used to measure the expressiveness of graph neural networks (GNNs), showing that neighborhood-aggregation GNNs are at most as powerful as the 1-WL test in distinguishing graph structures. Improvements have also been proposed in analogy to the k-WL test (k > 1). However, the aggregators in these GNNs are far from injective as required by the WL test and suffer from weak distinguishing strength, making them the expressive bottlenecks. In this paper, we improve the expressiveness by exploring powerful aggregators. We reformulate aggregation with the corresponding aggregation coefficient matrix, and then systematically analyze the requirements on this matrix for building more powerful, and even injective, aggregators. This analysis can also be viewed as a strategy for preserving the rank of hidden features, and it implies that basic aggregators correspond to a special case of low-rank transformations. We also show the necessity of applying nonlinear units ahead of aggregation, which differs from most aggregation-based GNNs. Based on our theoretical analysis, we develop two GNN layers, ExpandingConv and CombConv. Experimental results show that our models significantly boost performance, especially on large and densely connected graphs.

1. INTRODUCTION

Graphs are ubiquitous in the real world. Social networks, traffic networks, knowledge graphs, and molecular structures are typical graph-structured data. Graph Neural Networks (GNNs) (Scarselli et al., 2008; Gori et al., 2005), which bring the power of neural networks to graph-structured data, have developed rapidly in recent years (Kipf & Welling, 2016; Hamilton et al., 2017; Bronstein et al., 2017; Gilmer et al., 2017; Duvenaud et al., 2015). The expressive power of a GNN measures its ability to represent different graph structures (Sato, 2020). It determines the performance of GNNs wherever awareness of graph structure is required, especially on large graphs with complex topologies. The neighborhood aggregation scheme (or message passing) follows the same pattern as the Weisfeiler-Lehman (WL) graph isomorphism test (Weisfeiler & Leman, 1968) to encode graph structures: node representations are computed iteratively by aggregating the transformed representations of their neighbors, with structural information learned implicitly. The WL test is therefore used to measure the expressiveness of GNNs. Unfortunately, general GNNs are at most as powerful as the 1-WL test (Morris et al., 2019; Xu et al., 2019). There is also work aiming to improve expressiveness beyond the 1-WL test (Maron et al., 2019; Morris et al., 2019; Chen et al., 2019; Li et al., 2020b; Vignac et al., 2020). However, the weak distinguishing strength of aggregators remains the fundamental limitation: the expressiveness analysis measured by the WL test assumes that aggregators are injective, which is rarely achievable in practice. This motivates us to investigate the following questions: what are the key factors limiting the expressiveness of GNNs, and how can these limitations be broken?

Aggregators are permutation-invariant functions that operate on (multi)sets.
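To make the 1-WL ceiling concrete, the sketch below (a minimal illustration, not the paper's method; `wl_refine` is a hypothetical helper) runs one common form of 1-WL color refinement on two non-isomorphic 2-regular graphs, a 6-cycle versus two disjoint triangles, which the test famously cannot distinguish:

```python
from collections import Counter

def wl_refine(adj, colors, iters=3):
    """1-WL color refinement: each node's new color is a hash of its
    old color together with the sorted multiset of neighbor colors."""
    for _ in range(iters):
        colors = {
            v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
            for v in adj
        }
    return colors

# A 6-cycle and two disjoint triangles: both 2-regular on 6 nodes.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}

c1 = wl_refine(cycle6, {v: 0 for v in cycle6})
c2 = wl_refine(two_triangles, {v: 0 for v in two_triangles})

# Both graphs keep a single uniform color class at every iteration,
# so 1-WL (and any GNN bounded by it) declares them indistinguishable.
print(Counter(c1.values()) == Counter(c2.values()))  # True
```

Since every node in both graphs has degree 2 and starts with the same color, refinement never splits the color classes, mirroring how a degree-based aggregator collapses these two structures.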
Zaheer et al. (2017) first studied permutation-invariant functions theoretically and provided a family of functions to which any permutation-invariant function must belong. Xu et al. (2019) extended this result to multisets, but only over countable spaces, and Corso et al. (2020) further extended it to uncountable spaces. Murphy et al. (2018; 2019) expressed a permutation-invariant function by tractably approximating an average over permutation-sensitive functions.
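The family identified by Zaheer et al. (2017) is the sum decomposition f(X) = ρ(Σ_{x∈X} φ(x)). A toy instance under assumed choices of φ and ρ (computing the variance of a set, which is clearly permutation invariant) might look like:

```python
import numpy as np

def phi(x):
    # Per-element embedding: the moments needed to recover the variance.
    return np.array([x, x * x, 1.0])

def rho(z):
    # Readout on the summed embeddings: Var[x] = E[x^2] - E[x]^2.
    s, sq, n = z
    return sq / n - (s / n) ** 2

def f(xs):
    # Sum decomposition f(X) = rho(sum(phi(x))); the sum makes it
    # invariant to any reordering of the input set.
    return rho(sum(phi(x) for x in xs))

print(f([1.0, 2.0, 3.0]))  # variance of {1, 2, 3}
print(f([3.0, 1.0, 2.0]))  # identical: order does not matter
```

The same skeleton underlies sum-aggregation GNN layers, where φ and ρ are learned networks rather than fixed moment maps.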

