MEGRAPH: GRAPH REPRESENTATION LEARNING ON CONNECTED MULTI-SCALE GRAPHS

Abstract

We present MeGraph, a novel network architecture for graph-structured data. Given an input graph, we first build multi-scale graphs using graph pooling, and then connect them into a mega graph by adding inter-graph edges according to the pooling results. Instead of uniformly stacking graph convolutions over the mega graph, we apply standard graph convolutions over intra-graph edges, while the convolutions over inter-graph edges follow a bidirectional pathway that delivers information along the hierarchy for one round. Graph convolution and graph pooling are the two core operations of MeGraph. In our implementation, we adopt the graph full network (GFuN) and propose stridden edge contraction pooling (S-EdgePool) with an adjustable pooling ratio, which extend conventional graph convolution and edge contraction pooling, respectively. The MeGraph model repeatedly exchanges information across the multi-scale graphs, enabling a deeper understanding of long-range correlations in graphs. This distinguishes MeGraph from many recent hierarchical graph neural networks such as Graph U-Nets. We conduct comprehensive empirical studies on tens of public datasets and observe consistent performance gains over the baselines. In particular, we establish five new graph-theory benchmark tasks that require long-range inference and deduction to solve, on which MeGraph demonstrates dominant performance compared with popular graph neural networks.

1. INTRODUCTION

In real-world applications, many types of data are naturally organized as graphs, such as social networks, traffic networks, and biological data. Recent advances in graph neural networks (GNNs) have carried the success of convolutional neural networks (CNNs) over from images to graph-structured data. Popular methods include GCN (Kipf & Welling, 2016), GIN (Xu et al., 2018), GAT (Veličković et al., 2018), and Graph U-Nets (Gao & Ji, 2019). The development of CNNs and GNNs has largely co-evolved, and most effective techniques identified for CNNs are also helpful for GNNs. For example, we have witnessed coupled architectures for image and graph data, such as CNN vs. GCN, attentional CNNs vs. GAT, and U-Net (Ronneberger et al., 2015) vs. Graph U-Net (Gao & Ji, 2019). Instead of directly transferring advances in CNNs to GNNs, we investigate inherent characteristics of graphs and design a new architecture accordingly.

We use the following example to motivate our design. Consider the problem of identifying the shortest path in a chain graph. Using standard graph convolutions, we have to stack multiple layers to enlarge the receptive field until it covers both the source and the destination nodes. If the architecture could instead reason over a larger scope, e.g., by constructing multi-scale graphs in a hierarchy, the shortest path could be estimated more easily by aggregating and delivering information across multiple levels. Moreover, a single turn of information aggregation or delivery over the hierarchical structure might not be sufficient, because estimates must be refined and propagated repeatedly before a reliable conclusion is reached. That is, the architecture has to repeat the information exchange across the hierarchy multiple times to identify the shortest path with certainty. This example is investigated in our experiments in Section 4.
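The receptive-field argument above can be checked with a small sketch. The code below (illustrative only, not the paper's implementation; `layers_to_reach` is a hypothetical helper) counts how many rounds of one-hop neighborhood aggregation are needed on a chain graph before information from the source node reaches the destination:

```python
# Hypothetical sketch: on a chain (path) graph of n nodes, one round of
# neighborhood aggregation spreads information by exactly one hop, so a
# flat GNN needs as many stacked layers as the source-destination distance.

def layers_to_reach(n, src, dst):
    """Count aggregation rounds until dst receives src's signal."""
    seen = [False] * n
    seen[src] = True
    rounds = 0
    while not seen[dst]:
        nxt = seen[:]
        for v in range(n):
            # neighbors of v on a chain graph are v-1 and v+1
            if (v > 0 and seen[v - 1]) or (v < n - 1 and seen[v + 1]):
                nxt[v] = True
        seen = nxt
        rounds += 1
    return rounds

print(layers_to_reach(64, 0, 63))  # 63 rounds: depth grows linearly with distance
```

With a hierarchy of coarsened graphs (e.g., halving the node count at each scale), the same signal can travel through the coarse levels in logarithmically many hops, which is the intuition behind the mega-graph construction.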
In fact, several recent GNNs operate on a hierarchical graph structure. Graph U-Nets (Gao & Ji, 2019) form a hierarchy by downsampling the graph with iterative convolutions and top-k pooling, and then upsampling the pooled graph with iterative convolutions and unpooling operators. However, the U-shaped network only propagates information for a single turn. GraphFPN (Zhao et al., 2021) builds mappings between the image and graph feature pyramids according to the superpixel hierarchy, and applies GNN layers on the hierarchical graph to exchange
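For concreteness, the downsampling step used by Graph U-Nets can be sketched as follows. This is a minimal NumPy illustration of top-k (gPool-style) pooling under our own simplified assumptions; the function name and shapes are ours, not the original code:

```python
# Minimal sketch of top-k graph pooling in the style of Graph U-Nets (gPool).
# Assumptions: dense adjacency, a learnable projection vector p, and a
# tanh gate on the kept features, as described in Gao & Ji (2019).
import numpy as np

def top_k_pool(X, A, p, ratio=0.5):
    """X: (n, d) node features, A: (n, n) adjacency, p: (d,) projection."""
    scores = X @ p / np.linalg.norm(p)               # scalar score per node
    k = max(1, int(ratio * X.shape[0]))              # number of nodes to keep
    idx = np.argsort(scores)[-k:]                    # indices of top-k nodes
    X_pool = X[idx] * np.tanh(scores[idx])[:, None]  # gate kept features
    A_pool = A[np.ix_(idx, idx)]                     # induced coarse subgraph
    return X_pool, A_pool, idx

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
A = (rng.random((8, 8)) > 0.5).astype(float)
X_pool, A_pool, idx = top_k_pool(X, A, rng.normal(size=4))
print(X_pool.shape, A_pool.shape)  # (4, 4) (4, 4)
```

The returned indices `idx` are what the corresponding unpooling operator uses to scatter coarse features back onto the finer graph; MeGraph instead keeps all scales alive simultaneously and connects them with inter-graph edges.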

