GRAPH COARSENING WITH NEURAL NETWORKS

Abstract

As large-scale graphs become increasingly prevalent, they pose significant computational challenges for processing, extracting, and analyzing graph data. Graph coarsening is a popular technique for reducing the size of a graph while maintaining its essential properties. Despite a rich graph coarsening literature, data-driven methods have seen only limited exploration in the field. In this work, we leverage recent progress in deep learning on graphs for graph coarsening. We first propose a framework for measuring the quality of a coarsening algorithm and show that, depending on the goal, we need to carefully choose the Laplace operator on the coarse graph and the associated projection/lift operators. Motivated by the observation that the current choice of edge weights for the coarse graph may be suboptimal, we parametrize the weight assignment map with graph neural networks and train it to improve the coarsening quality in an unsupervised way. Through extensive experiments on both synthetic and real networks, we demonstrate that our method significantly improves common graph coarsening methods under various metrics, reduction ratios, graph sizes, and graph types. It generalizes to graphs of larger size (25× the size of the training graphs), adapts to different losses (differentiable and non-differentiable), and scales to much larger graphs than previous work.

1. INTRODUCTION

Many complex structures can be modeled by graphs, such as social networks, molecular graphs, biological protein-protein interaction networks, knowledge graphs, and recommender systems. As large-scale graphs become increasingly ubiquitous in various applications, they pose significant computational challenges for processing, extracting, and analyzing information. It is therefore natural to look for ways to simplify the graph while preserving the properties of interest.

There are two major ways to simplify graphs. First, one may reduce the number of edges, known as graph edge sparsification. It is known that pairwise distances (spanner), graph cuts (cut sparsifier), and eigenvalues (spectral sparsifier) can be approximately maintained while removing edges. A key result (Spielman & Teng, 2004) in spectral sparsification is that any dense graph of size N can be sparsified to O(N log^c N / ε^2) edges in nearly linear time using a simple randomized algorithm based on the effective resistance. Alternatively, one could reduce the number of nodes to a subset of the original node set. The first challenge here is how to choose the topology (edge set) of the smaller graph spanned by the sparsified node set. At the extreme, one can take the complete graph spanned by the sampled nodes. However, its dense structure prohibits easy interpretation and incurs computational overhead for setting the Θ(n^2) edge weights.

This paper focuses on graph coarsening, which reduces the number of nodes by contracting disjoint sets of connected vertices. The original idea dates back to the algebraic multigrid literature (Ruge & Stüben, 1987) and has found various applications in graph partitioning (Hendrickson & Leland, 1995; Karypis & Kumar, 1998; Kushnir et al., 2006), visualization (Harel & Koren, 2000; Hu, 2005; Walshaw, 2000), and machine learning (Lafon & Lee, 2006; Gavish et al., 2010; Shuman et al., 2015). However, most existing graph coarsening algorithms come with two restrictions.
First, they are prespecified and adapted to neither specific data nor different goals. Second, most coarsening algorithms set the edge weights of the coarse graph equal to the sum of the weights of crossing edges in the original graph. This means the weights of the coarse graph are determined by the coarsening algorithm (of the vertex set), leaving no room for adjustment. With the two observations above, we aim to develop a data-driven approach to better assign weights for the coarse graph depending on the specific goals at hand. We leverage recent progress in deep learning on graphs to develop a framework that learns to assign edge weights in an unsupervised manner from a collection of input (small) graphs. This learned weight-assignment map can then be applied to new graphs (of potentially much larger sizes). In particular, our contributions are threefold.

• First, depending on the quantity of interest F (such as the quadratic form w.r.t. the Laplace operator), one has to carefully choose the projection/lift operators to relate quantities defined on graphs of different sizes. We formulate this as the invariance of F under the lift map, and provide three cases of projection/lift maps as well as the corresponding operators on the coarse graph. Interestingly, those operators can all be seen as special cases of doubly-weighted Laplace operators on coarse graphs (Horak & Jost, 2013).

• Second, we are the first to propose and develop a framework to learn the edge weights of the coarse graphs via graph neural networks (GNN) in an unsupervised manner. We show convincing results, both theoretically and empirically, that changing the weights is crucial to improving the quality of coarse graphs.

et al., 2015; Xie & Grossman, 2018) and physics simulation (Sanchez-Gonzalez et al., 2020).

Deep generative model for graphs.
To generate realistic graphs such as molecules and parse trees, various approaches have been taken to model complex distributions over structures and attributes, such as variational autoencoders (Simonovsky & Komodakis, 2018; Ma et al., 2018), generative adversarial networks (GAN) (De Cao & Kipf, 2018; Zhou et al., 2019), deep autoregressive models (Liao et al., 2019; You et al., 2018b; Li et al., 2018a), and reinforcement-learning-type approaches (You et al., 2018a). Zhou et al. (2019) propose a GAN-based framework to preserve the hierarchical community structure via the algebraic multigrid method during the generation process. However, unlike in our approach, the coarse graphs in Zhou et al. (2019) are not learned.



The algorithm runs in O(M polylog N) time, where M and N are the numbers of edges and vertices, respectively.



• Third, through extensive experiments on both synthetic graphs and real networks, we demonstrate that our method GOREN significantly improves common graph coarsening methods under different evaluation metrics, reduction ratios, graph sizes, and graph types. It generalizes to graphs of larger size (than the training graphs), adapts to different losses (so as to preserve different properties of the original graphs), and scales to much larger graphs than previous work can handle. Even for losses that are not differentiable w.r.t. the weights of the coarse graph, we show that training networks with a differentiable auxiliary loss still improves the results.
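
To make the conventional weight assignment discussed in the introduction concrete, the following is a minimal sketch, not the paper's implementation. It contracts a fixed vertex partition, sets each coarse edge weight to the sum of the crossing edge weights, and compares the combinatorial-Laplacian quadratic form on the coarse graph with the same form evaluated on a lifted signal. The function names (`coarsen`, `quadratic_form`, `lift`), the toy graph, and the choice of a piecewise-constant lift are all assumptions made for illustration.

```python
from collections import defaultdict


def coarsen(edges, partition):
    """Contract each block of `partition` into a single supernode.

    edges: dict mapping (u, v) -> weight, one entry per undirected edge.
    partition: dict mapping original node -> supernode id.
    Each coarse edge weight is the sum of the weights of crossing edges.
    """
    coarse = defaultdict(float)
    for (u, v), w in edges.items():
        a, b = partition[u], partition[v]
        if a != b:  # edges inside a supernode vanish after contraction
            coarse[(min(a, b), max(a, b))] += w
    return dict(coarse)


def quadratic_form(edges, x):
    """Evaluate x^T L x as the sum over edges of w_uv * (x_u - x_v)^2."""
    return sum(w * (x[u] - x[v]) ** 2 for (u, v), w in edges.items())


def lift(x_coarse, partition):
    """Piecewise-constant lift: copy each supernode value to its members."""
    return {u: x_coarse[a] for u, a in partition.items()}


# Hypothetical toy data: a 6-cycle with unit weights plus one chord of weight 2.
edges = {
    (0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0,
    (3, 4): 1.0, (4, 5): 1.0, (0, 5): 1.0,
    (0, 3): 2.0,
}
# Each block {0,1}, {2,3}, {4,5} is a connected vertex set.
partition = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

coarse_edges = coarsen(edges, partition)
x_coarse = {0: 1.0, 1: -0.5, 2: 2.0}

q_coarse = quadratic_form(coarse_edges, x_coarse)
q_lifted = quadratic_form(edges, lift(x_coarse, partition))
print(coarse_edges, q_coarse, q_lifted)
```

With this particular lift, the two quadratic forms agree, because only crossing edges contribute once the signal is constant on each supernode. For signals on the original graph that are not constant within supernodes, the values generally differ, and that is the setting in which adjusting the coarse edge weights can matter.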

