TOWARDS POWERFUL GRAPH NEURAL NETWORKS: DIVERSITY MATTERS

Abstract

Graph neural networks (GNNs) offer an effective framework for graph representation learning via layer-wise neighborhood aggregation. Their success is attributed to their expressive power at learning representations of nodes and graphs. To achieve GNNs with high expressive power, existing methods mainly resort to complex neighborhood aggregation functions, e.g., designing injective aggregation functions or using multiple aggregation functions. Consequently, their expressive power is limited by the capability of the aggregation function, which is tricky to determine in practice. To combat this problem, we propose a novel framework, namely diverse sampling, to improve the expressive power of GNNs. For a target node, diverse sampling offers diverse neighborhoods, i.e., rooted sub-graphs, and the representation of the target node is obtained by aggregating the representations of these diverse neighborhoods, each computed with any GNN model. High expressive power is guaranteed by the diversity of the different neighborhoods. We use classical GNNs (i.e., GCN and GAT) as base models to evaluate the effectiveness of the proposed framework. Experiments are conducted on the multi-class node classification task on three benchmark datasets and on the multi-label node classification task on a dataset collected in this paper. Extensive experiments demonstrate that the proposed method consistently improves the performance of the base GNN models. The proposed framework is applicable to any GNN model and is thus a general approach to improving the expressive power of GNNs.

1. INTRODUCTION

Graph neural networks (GNNs) have been shown to be effective at graph representation learning and at many predictive tasks on graph-structured data, e.g., node classification and graph classification (Kipf & Welling, 2016; Xu et al., 2018a). GNNs follow a neighborhood aggregation scheme, where the representation of a node is obtained by recursively aggregating and transforming the representations of its neighboring nodes (Gilmer et al., 2017). The success of GNNs is believed to be attributed to their high expressive power at learning representations of nodes and graphs (Xu et al., 2018a). It is therefore an important research problem to analyze and improve the expressive power of existing GNN models and to design new GNNs with high expressive power. Several recent works focus on the expressive power of GNNs. Xu et al. pointed out that the expressive power of GNNs depends on the neighborhood aggregation function (Xu et al., 2018a). They developed a simple architecture, i.e., a multi-layer perceptron (MLP) followed by sum pooling as a universal approximator over multi-sets, to achieve an injective neighborhood aggregation function. With an injective aggregation function in each layer, the proposed graph isomorphism network (GIN) has expressive power as high as the Weisfeiler-Lehman (WL) graph isomorphism test (Weisfeiler & Lehman, 1968). Similarly, Sato et al. implemented a powerful GNN via consistent port numbering, i.e., mapping edges to port numbers and ordering neighbors by port number (Sato et al., 2019). However, the port ordering of CPNGNNs is not unique, and not all orderings can distinguish the same set of graphs (Garg et al., 2020). Principal neighborhood aggregation (PNA) defines multiple aggregation functions to improve the expressive power of GNNs (Corso et al., 2020).
However, the number of aggregation functions required to discriminate multi-sets depends on the size of the multi-set, which is prohibitive for real-world networks with skewed degree distributions. In sum, existing methods focus on designing an injective, often complex, aggregation function in each layer to achieve GNNs with high expressive power. However, injective functions are difficult to obtain and tricky to determine in practice. Indeed, a layer-wise injective function is not always required; what is needed is an injective function defined over rooted sub-graphs or graphs as a whole. In this paper, we propose a novel framework, namely diverse sampling, to improve the expressive power of GNNs. For a target node, diverse sampling offers diverse neighborhoods, i.e., rooted sub-graphs, and the representation of the target node is obtained by aggregating the representations of these diverse neighborhoods, each computed with any GNN model. High expressive power is guaranteed by the diversity of the different neighborhoods. For convenience, we denote by DS-GNN a GNN implemented under the proposed diverse sampling framework. Extensive experiments demonstrate that the proposed method consistently improves the performance of the base GNN models. The proposed framework is applicable to any GNN model and is thus a general approach to improving the expressive power of GNNs.
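As a toy illustration of these injectivity issues (our own example, not taken from the cited papers), mean aggregation, and even a fixed set of aggregators such as mean/min/max, can map different neighbor-feature multi-sets to the same output, whereas GIN-style sum pooling separates the first pair:

```python
# Toy multi-sets of scalar neighbor features (illustrative only).
a, b = [1, 3], [1, 1, 3, 3]      # b duplicates every element of a
c, d = [1, 2, 3], [1, 2, 2, 3]   # differ only in the multiplicity of 2

mean = lambda s: sum(s) / len(s)

# Mean aggregation collapses a and b ...
print(mean(a), mean(b))          # 2.0 2.0
# ... while GIN-style sum pooling separates them.
print(sum(a), sum(b))            # 4 8

# Even mean, min and max together cannot separate c and d, echoing PNA's
# observation that the number of aggregators needed grows with multi-set size.
stats = lambda s: (mean(s), min(s), max(s))
print(stats(c) == stats(d))      # True: (2.0, 1, 3) for both
```

The same phenomenon holds for vector features applied coordinate-wise, which is why a single fixed aggregator set cannot be injective for all node degrees.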

2. NOTATIONS AND PRELIMINARIES

We first introduce the general framework of GNNs.
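For concreteness, the layer-wise neighborhood aggregation scheme shared by the GNNs above can be written in the standard message-passing form (following, e.g., Gilmer et al., 2017 and Xu et al., 2018a; the notation here is ours):

```latex
h_v^{(k)} = \mathrm{COMBINE}^{(k)}\Big( h_v^{(k-1)},\;
            \mathrm{AGGREGATE}^{(k)}\big( \{\!\!\{\, h_u^{(k-1)} : u \in \mathcal{N}(v) \,\}\!\!\} \big) \Big),
```

where $h_v^{(k)}$ is the representation of node $v$ after layer $k$, $\mathcal{N}(v)$ is the set of neighbors of $v$, and $\{\!\!\{\cdot\}\!\!\}$ denotes a multi-set. Different GNNs instantiate AGGREGATE and COMBINE differently, e.g., normalized mean in GCN, attention-weighted sum in GAT, and sum pooling followed by an MLP in GIN.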



Figure 1: The motivation of DS-GNN: constructing multiple sampled graphs rather than complex layer-wise aggregation functions (the node with the red circle is the central node and "FC" denotes a fully connected layer).

Fig. 1 illustrates the main idea of the proposed DS-GNN and compares it with two representative methods, i.e., GIN and PNA. Fig. 1(a) depicts an injective layer, implemented via an MLP or multiple aggregation functions, that aggregates first-order neighboring nodes to obtain the representation of the central node. Injective layers are stacked to achieve an overall injective function defined on rooted sub-graphs. In contrast, DS-GNN does not follow the line of designing complicated aggregation functions in each layer. Instead, DS-GNN improves the expressive power of GNNs by obtaining diverse rooted sub-graphs for each node. Specifically, we sample nodes multiple times from the entire input graph via diverse sampling and obtain multiple sampled sub-graphs for each node. After diverse sampling, we apply a shared GNN model to obtain the representation of the central node, incorporating its high-order neighbors. In this way, each node is represented by a multi-set consisting of the representations obtained from the different sampled rooted sub-graphs. The final representation of the central node is obtained by aggregating the representations of these diverse neighborhoods. Finally, we use classical GNNs (i.e., GCN (Kipf & Welling, 2016) and GAT (Veličković et al., 2017)) as base models to evaluate the effectiveness of the proposed framework. Experiments are conducted on the multi-class node classification task on three benchmark datasets and on the multi-label node classification task on a dataset collected in this paper. Extensive experiments demonstrate that the proposed method consistently improves the performance of the base GNN models. The proposed framework is applicable to any GNN model and is thus a general approach to improving the expressive power of GNNs.
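The pipeline just described, i.e., sample several rooted sub-graphs per node, embed each with a shared base GNN, and aggregate the resulting multi-set, can be sketched as follows. This is a minimal illustration under our own simplifications: the helper names (`sample_subgraph`, `shared_gnn`, `ds_gnn`) are ours, scalar features stand in for feature vectors, and a feature mean stands in for the base GNN.

```python
import random

# Toy star graph: node 0 is the central node with four neighbors.
graph = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}   # adjacency lists
features = {0: 0.5, 1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}         # toy scalar features

def sample_subgraph(node, keep=2, rng=random):
    """Diverse sampling step: the root plus a random subset of its neighbors."""
    return [node] + rng.sample(graph[node], min(keep, len(graph[node])))

def shared_gnn(nodes):
    """Stand-in for any shared base GNN (GCN, GAT, ...): mean of sub-graph features."""
    return sum(features[n] for n in nodes) / len(nodes)

def ds_gnn(node, num_samples=8, rng=random):
    """Embed each sampled sub-graph with the shared model, then aggregate
    the multi-set of embeddings (here by mean) into the final representation."""
    embs = [shared_gnn(sample_subgraph(node, rng=rng)) for _ in range(num_samples)]
    return sum(embs) / len(embs)

random.seed(0)
print(ds_gnn(0))  # final representation of the central node
```

In the actual framework the per-sample aggregation would be a learned function (e.g., the fully connected layer in Fig. 1) rather than a plain mean, and the shared GNN would be trained end-to-end.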

