A UNIFIED FRAMEWORK FOR CONVOLUTION-BASED GRAPH NEURAL NETWORKS

Anonymous authors
Paper under double-blind review

Abstract

Graph Convolutional Networks (GCNs) have attracted considerable research interest in the machine learning community in recent years. Although many variants have been proposed, we still lack a systematic view of different GCN models and a deep understanding of the relations among them. In this paper, we take a step toward establishing a unified framework for convolution-based graph neural networks by formulating the basic graph convolution operation as an optimization problem in the graph Fourier space. Under this framework, a variety of popular GCN models, including vanilla GCNs, attention-based GCNs and topology-based GCNs, can be interpreted as the same optimization problem with different carefully designed regularizers. This novel perspective enables a better understanding of the similarities and differences among many widely used GCNs, and may inspire new approaches for designing better models. As a showcase, we also present a novel regularization technique under the proposed framework to tackle the oversmoothing problem in graph convolution. The effectiveness of the newly designed model is validated empirically.

1. INTRODUCTION

Recent years have witnessed rapid development in graph processing through the generalization of the convolution operation to graph-structured data, known as Graph Convolutional Networks (GCNs) (Kipf & Welling, 2017). Owing to their great success, numerous variants of GCNs have been developed and extensively adopted in social network analysis (Hamilton et al., 2017; Wu et al., 2019a; Veličković et al., 2018), biology (Zitnik et al., 2018), transportation forecasting (Li et al., 2017) and natural language processing (Wu et al., 2019b; Yao et al., 2019). Inspired by GCN, a wide variety of convolution-based graph learning approaches have been proposed to enhance the generalization performance of graph neural networks. Several works aim to achieve higher expressiveness by exploring higher-order information or by introducing additional learning mechanisms such as attention modules. Although proposed from different perspectives, these approaches are connected. For example, attention-based GCNs such as GAT (Veličković et al., 2018) and AGNN (Thekumparampil et al., 2018) share a similar intention: adjusting the adjacency matrix with a function of edge and node features. Similarly, TAGCN (Du et al., 2017) and MixHop (Kapoor et al., 2019) can be viewed as particular instances of PPNP (Klicpera et al., 2018) under certain approximations. However, the relations among these graph learning models are rarely studied, and comparisons remain largely limited to generalization performance on public datasets. As a consequence, we still lack a systematic view of different GCN models and a deep understanding of the relations among them. In this paper, we resort to techniques from graph signal processing and attempt to understand GCN-based approaches from a general perspective. Specifically, we present a unified graph convolution framework by associating graph convolution operations with optimization problems in the graph Fourier domain.
We consider a Laplacian-regularized least-squares optimization problem and show that most convolution-based approaches can be interpreted in this framework by adding carefully designed regularizers. Besides vanilla GCNs, we also extend our framework to formulate non-convolutional operations (Xu et al., 2018a; Hamilton et al., 2017), attention-based GCNs (Veličković et al., 2018; Thekumparampil et al., 2018) and topology-based GCNs (Klicpera et al., 2018; Kapoor et al., 2019), which together cover a large fraction of state-of-the-art graph learning approaches. This novel perspective provides a re-interpretation of graph convolution operations, enables a better understanding of the similarities and differences among many widely used GCNs, and may inspire new approaches for designing better models. We summarize our contributions as follows:

1. We introduce a unified framework for convolution-based graph neural networks and interpret various convolution filters as carefully designed regularizers in the graph Fourier domain, which provides a general methodology for evaluating and relating different graph learning modules.

2. Based on the proposed framework, we provide new insights into the limitations of GCNs and point out new directions for tackling common problems and improving the generalization performance of current graph neural networks in the graph Fourier domain. Additionally, the unified framework can serve as a once-for-all platform for expert-designed modules in convolution-based approaches: a newly designed module can be implemented on other networks as a plug-in with trivial adaptation. We believe that our framework facilitates the design of new graph learning modules and the search for better combinations.

3. As a showcase, we present a novel regularization technique under the proposed framework to alleviate the oversmoothing problem in graph representation learning.
As shown in Section 4, the newly designed regularizer can be implemented on several convolution-based networks and effectively improve the generalization performance of graph learning models.
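The oversmoothing problem targeted by the showcase regularizer is easy to reproduce numerically: repeatedly applying the augmented, symmetrically normalized adjacency matrix drives node signals toward a degree-scaled constant vector, washing out the differences between node features. Below is a minimal NumPy illustration on a toy path graph (our own hypothetical example, not one of the paper's experiments):

```python
import numpy as np

# Path graph on 5 nodes; symmetrically normalized propagation with self-loops.
N = 5
A = np.zeros((N, N))
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
A_tilde = A + np.eye(N)
d = A_tilde.sum(axis=1)                                  # augmented degrees
P = np.diag(d ** -0.5) @ A_tilde @ np.diag(d ** -0.5)    # \tilde{A}_sym

x = np.array([0., 1., 2., 3., 4.])   # an initially informative node signal

# Stack many propagation steps: x_k collapses onto the dominant
# eigenvector direction of P, which is proportional to sqrt(d).
x_k = np.linalg.matrix_power(P, 200) @ x

# Up to the degree scaling sqrt(d), the propagated signal is nearly
# constant across nodes -- the feature differences have been smoothed away.
ratios = x_k / np.sqrt(d)
```

This collapse onto the dominant eigenvector of the propagation matrix is precisely what motivates regularizers that penalize excessive smoothing.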

2. PRELIMINARIES

We start with an overview of the basic concepts of graph signal processing. Let $G = (V, A)$ denote a graph with node feature vectors, where $V = \{v_1, v_2, \ldots, v_N\}$ is the vertex set and $A = (a_{ij}) \in \mathbb{R}^{N \times N}$ is the adjacency matrix encoding the connectivity between nodes. Let $D = \mathrm{diag}(d(1), \ldots, d(N)) \in \mathbb{R}^{N \times N}$ be the degree matrix of $A$, where $d(i) = \sum_{j \in V} a_{ij}$ is the degree of vertex $i$. Then $L = D - A$ is the combinatorial Laplacian and $\mathcal{L} = I - D^{-1/2} A D^{-1/2}$ is the normalized Laplacian of $G$. Additionally, we let $\tilde{A} = A + I$ and $\tilde{D} = D + I$ denote the augmented adjacency and degree matrices with added self-loops. Then $\tilde{L}_{sym} = I - \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ and $\tilde{L}_{rw} = I - \tilde{D}^{-1} \tilde{A}$ are the augmented symmetric normalized and random-walk normalized Laplacians of $G$, with corresponding augmented adjacency matrices $\tilde{A}_{sym} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ and $\tilde{A}_{rw} = \tilde{D}^{-1} \tilde{A}$, respectively.

Let $x \in \mathbb{R}^N$ be a signal on the vertices of the graph. The spectral convolution is defined as a function of a filter $g_\theta$ parameterized in the Fourier domain (Kipf & Welling, 2017): $g_\theta \star x = U g_\theta(\Lambda) U^T x$, where $U$ and $\Lambda$ are the eigenvectors and eigenvalues of the normalized Laplacian $\mathcal{L}$. Following Hoang & Maehara (2019), we define the variation $\Delta$ and the $\tilde{D}$-inner product as $\Delta(x) = \frac{1}{2} \sum_{i,j \in V} a_{ij} (x(i) - x(j))^2 = x^T L x$ and $(x, y)_{\tilde{D}} = \sum_{i \in V} (d(i) + 1) x(i) y(i) = x^T \tilde{D} y$, which measure the smoothness and the importance of the signal, respectively.
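The quantities above can be checked with a short NumPy sketch (the path graph, the signal, and the low-pass filter $g_\theta(\lambda) = 1/(1+\lambda)$ are our own illustrative choices, not from the paper):

```python
import numpy as np

# Toy undirected graph: a path on 4 nodes (illustrative example).
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
N = A.shape[0]
I = np.eye(N)

D = np.diag(A.sum(axis=1))                       # degree matrix
L = D - A                                        # combinatorial Laplacian
D_inv_sqrt = np.diag(np.diag(D) ** -0.5)
L_norm = I - D_inv_sqrt @ A @ D_inv_sqrt         # normalized Laplacian

# Augmented (self-loop) versions used by GCN-style propagation.
A_tilde, D_tilde = A + I, D + I
Dt_inv_sqrt = np.diag(np.diag(D_tilde) ** -0.5)
A_sym = Dt_inv_sqrt @ A_tilde @ Dt_inv_sqrt      # \tilde{A}_sym
L_sym = I - A_sym                                # \tilde{L}_sym
A_rw = np.diag(np.diag(D_tilde) ** -1.0) @ A_tilde   # \tilde{A}_rw (rows sum to 1)

# Spectral convolution: g_theta * x = U g_theta(Lambda) U^T x.
evals, U = np.linalg.eigh(L_norm)
x = np.array([1., 2., 3., 4.])
g = lambda lam: 1.0 / (1.0 + lam)                # an example low-pass filter
filtered = U @ np.diag(g(evals)) @ U.T @ x

# Variation (smoothness): x^T L x equals the sum of a_ij (x_i - x_j)^2
# over unordered node pairs.
var_quad = x @ L @ x
var_sum = sum(A[i, j] * (x[i] - x[j]) ** 2
              for i in range(N) for j in range(i + 1, N))

# D-inner product with the augmented degree matrix: sum_i (d(i)+1) x_i y_i.
d_inner = x @ D_tilde @ x
```

Note that `evals` lies in $[0, 2]$, the standard spectral range of the normalized Laplacian, and that `A_rw` is row-stochastic.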

3. UNIFIED GRAPH CONVOLUTION FRAMEWORK

With the success of GCNs, a wide variety of convolution-based approaches have been proposed, progressively enhancing the expressive power and generalization performance of graph neural networks. Despite the effectiveness of GCN and its derivatives on specific tasks, there is still no comprehensive understanding of the relations and differences among various graph learning modules. Graph signal processing is a powerful technique that has been adopted in several graph learning studies (Kipf & Welling, 2017; Hoang & Maehara, 2019; Zhao & Akoglu, 2019). However, existing work mainly focuses on analyzing the properties of GCNs while ignoring the connections between different graph learning modules. In this work, we instead interpret convolution-based approaches from a general perspective using graph signal processing techniques.
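Before formalizing the framework, the flavor of the underlying optimization can be sketched in a few lines. The objective below is an illustrative Laplacian-regularized least-squares instance; the fitting term, the weight $\lambda$, and the random graph are our own assumptions for demonstration, not the exact objective derived in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6

# Random undirected graph and its normalized Laplacian (toy setup).
A = (rng.random((N, N)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T
d = A.sum(axis=1)
d[d == 0] = 1.0                                  # guard against isolated nodes
L = np.eye(N) - np.diag(d ** -0.5) @ A @ np.diag(d ** -0.5)

X = rng.standard_normal((N, 3))                  # node signals / features
lam = 0.1                                        # assumed regularization weight

def objective(Z):
    # Fitting term + Laplacian smoothness regularizer.
    return np.sum((Z - X) ** 2) + lam * np.trace(Z.T @ L @ Z)

# Closed-form minimizer: Z* = (I + lam * L)^{-1} X.
Z_star = np.linalg.solve(np.eye(N) + lam * L, X)

# First-order approximation (I + lam*L)^{-1} ~ I - lam*L: a single
# low-pass propagation step, structurally similar to a GCN layer.
Z_approx = (np.eye(N) - lam * L) @ X
```

The first-order expansion illustrates why a single smoothing step has the shape of graph-convolution propagation; under the framework, different choices of regularizer recover different GCN variants.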

