BIGCN: A BI-DIRECTIONAL LOW-PASS FILTERING GRAPH NEURAL NETWORK

Abstract

Graph convolutional networks have achieved great success on graph-structured data. Many graph convolutional networks can be regarded as low-pass filters for graph signals. In this paper, we propose a new model, BiGCN, which represents a graph neural network as a bi-directional low-pass filter. Specifically, we consider not only the original graph structure information but also the latent correlation between features, so BiGCN can filter signals along both the original graph and a latent feature-connection graph. Our model outperforms previous graph neural networks in the tasks of node classification and link prediction on most of the benchmark datasets, especially when we add noise to the node features.

1. INTRODUCTION

Graphs are important research objects in the field of machine learning, as they are good carriers for structural data such as social networks and citation networks. Recently, graph neural networks (GNNs) have received extensive attention due to their great performance in graph representation learning. A graph neural network takes node features and graph structure (e.g., the adjacency matrix) as input and embeds the graph into a lower-dimensional space. With the success of GNNs (Kipf & Welling, 2017; Veličković et al., 2017; Hamilton et al., 2017; Chen et al., 2018) in various domains, more and more efforts have focused on why GNNs are so powerful (Xu et al., 2019). Li et al. (2018) re-examined graph convolutional networks (GCNs) and connected them with Laplacian smoothing. NT & Maehara (2019) revisited GCNs in terms of graph signal processing and explained that many graph convolutions (e.g., Kipf & Welling, 2017; Wu et al., 2019) can be considered low-pass filters, which capture low-frequency components and remove some feature noise by making connected nodes more similar. In fact, these findings are not new: since their first appearance in Bruna et al. (2014), spectral GCNs have been closely related to graph signal processing and denoising. The spectral graph convolutional operation is derived from the Graph Fourier Transform, and the filter can be formulated as a function of the graph Laplacian matrix, denoted g(L). In general spectral GCNs, the forward function is H^(l+1) = σ(g(L) H^(l)). Kipf and Welling (2017) approximated g(L) using first-order Chebyshev polynomials, which simplifies to multiplying the feature matrix by the augmented normalized adjacency matrix. Despite its efficiency, this first-order graph filter has been found to be sensitive to changes in the graph signals and the underlying graph structure (Isufi et al., 2016; Bianchi et al., 2019).
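As a concrete illustration, the first-order filter of Kipf and Welling (2017) reduces one propagation step to multiplying the feature matrix by the augmented normalized adjacency matrix D^{-1/2}(A + I)D^{-1/2}. The NumPy sketch below shows such a layer; the function and variable names are ours, not from any cited implementation:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One first-order spectral GCN layer (Kipf & Welling, 2017):
    H' = ReLU( D^{-1/2} (A + I) D^{-1/2} H W )."""
    n = A.shape[0]
    A_hat = A + np.eye(n)                      # add self-loops
    d = A_hat.sum(axis=1)                      # augmented degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))     # D^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # augmented normalized adjacency
    return np.maximum(A_norm @ H @ W, 0.0)     # ReLU activation

# Tiny example: a 3-node path graph with 2-dimensional features.
np.random.seed(0)
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.random.randn(3, 2)
W = np.random.randn(2, 2)
H_next = gcn_layer(A, H, W)
print(H_next.shape)  # (3, 2)
```

Because A_norm averages each node's features with those of its neighbors, repeated application smooths the signal, which is exactly the low-pass behavior discussed above.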
For instance, on isolated nodes or small connected components of the graph, the denoising effect is quite limited due to the lack of reliable neighbors. Potentially incorrect structure information also constrains the power of GCNs and causes more negative impact as layers deepen. As noisy or incorrect information is inevitable in real-world graph data, more powerful and robust GCNs are needed. In this work, we propose a new graph neural network with a stronger denoising effect, from the perspective of graph signal processing, and higher fault tolerance to the graph structure. Unlike image data, graph data usually has high-dimensional features, and there may be latent connections/correlations between the feature dimensions. We take this connection information into account to offset the effects of unreliable structure information, and we remove extra noise by applying a smoothness assumption on such a "feature graph". Derived from an additional Laplacian smoothing regularization on this feature graph, we obtain a novel variant of spectral GCNs, named BiGCN, which contains low-pass graph filters for both the original graph and a latent feature-connection graph in each convolution layer. Our model extracts low-frequency components from both graphs, so it is more expressive than the original spectral GCN; and it removes noise from two directions, so it is also more robust. We evaluate our model on two tasks: node classification and link prediction. In addition to the original graph data, to demonstrate the effectiveness of our model with respect to graph signal denoising and fault tolerance, we design three noisy/corrupted settings: randomly adding Gaussian noise with different variances to a certain percentage of nodes; adding different levels of Gaussian noise to the features of the whole graph; and changing a certain percentage of connections.
The remarkable performance of our model in these experiments verifies its power and robustness on both clean and noisy data. The main contributions of this work are summarized below.
• We propose a new framework for representation learning on graphs with node features. Instead of only considering signals on the original graph, we take feature correlations into account, making the model more robust.
• We formulate our graph neural network based on Laplacian smoothing and derive a bi-directional low-pass graph filter using the Alternating Direction Method of Multipliers (ADMM) algorithm.
• We design three settings to demonstrate the strong denoising capacity and high fault tolerance of our model on the tasks of node classification and link prediction.
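The bi-directional idea can be sketched conceptually: smooth the feature matrix from the left along the node graph and from the right along a feature-correlation graph. The sketch below is only illustrative; it builds the feature graph by thresholding absolute feature correlations (the threshold `tau` and this construction are our assumptions for illustration), whereas BiGCN derives its filters from Laplacian smoothing via ADMM:

```python
import numpy as np

def normalized_adj(A):
    """Symmetrically normalized adjacency with self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * np.outer(d_inv_sqrt, d_inv_sqrt)

def bidirectional_smooth(A, X, k=2, tau=0.9):
    """Low-pass filter X along the node graph (left multiplication) and
    along a latent feature-correlation graph (right multiplication).
    The feature graph is a thresholded absolute-correlation matrix --
    an illustrative choice, not the paper's ADMM-derived filter."""
    C = np.abs(np.corrcoef(X, rowvar=False))   # feature-feature correlations
    A_feat = (C > tau).astype(float)
    np.fill_diagonal(A_feat, 0.0)              # no self-correlation edges
    P_node = normalized_adj(A)
    P_feat = normalized_adj(A_feat)
    for _ in range(k):
        X = P_node @ X @ P_feat                # smooth in both directions
    return X

# Usage: a random symmetric 5-node graph with 4-dimensional features.
np.random.seed(0)
A = np.triu((np.random.rand(5, 5) > 0.5).astype(float), 1)
A = A + A.T
X = np.random.randn(5, 4)
X_smooth = bidirectional_smooth(A, X)
```

Left multiplication averages over neighboring nodes; right multiplication averages over correlated feature dimensions, so noise on a node with unreliable neighbors can still be suppressed through its correlated features.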

2. RELATED WORK

We summarize related work in the field of graph signal processing and denoising, as well as recent work on spectral graph convolutional networks, as follows.

2.1. GRAPH SIGNAL PROCESSING AND DENOISING

Graph-structured data is ubiquitous in the world. Graph signal processing (GSP) (Ortega et al., 2018) is intended for analyzing and processing graph signals, whose values are defined on the set of graph vertices. It can be seen as a bridge between classical signal processing and spectral graph theory. One line of research in this area is the generalization of the Fourier transform to the graph domain and the development of powerful graph filters (Zhu & Rabbat, 2012; Isufi et al., 2016), which can be applied to various tasks, such as representation learning and denoising (Chen et al., 2014). More recently, the tools of GSP have been successfully used to define spectral graph neural networks, making a strong connection between GSP and deep learning. In this work, we start from the concepts of graph signal processing and define a new smoothing model for deep graph learning and graph denoising. It is worth mentioning that the concept of denoising/robustness in GSP is different from defense/robustness against adversarial attacks (e.g., Zügner & Günnemann, 2019), so we do not compare with those models.
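To make the low-pass filtering notion concrete, an ideal low-pass graph filter keeps only the components of a signal associated with the smallest Laplacian eigenvalues (the low graph frequencies) and discards the rest. A minimal NumPy sketch, where the `keep` cutoff is an illustrative parameter of ours rather than part of any cited filter:

```python
import numpy as np

def low_pass_filter(A, x, keep=2):
    """Ideal low-pass graph filter: project the signal x onto the `keep`
    Laplacian eigenvectors with smallest eigenvalues and reconstruct."""
    d = A.sum(axis=1)
    L = np.diag(d) - A                  # combinatorial graph Laplacian
    eigvals, U = np.linalg.eigh(L)      # eigenvalues in ascending order
    x_hat = U.T @ x                     # Graph Fourier Transform
    x_hat[keep:] = 0.0                  # zero out high-frequency components
    return U @ x_hat                    # inverse Graph Fourier Transform

# A smooth (constant) signal plus noise on a 4-cycle; keeping only the
# lowest frequency recovers the smooth part.
np.random.seed(1)
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
x = np.ones(4) + 0.1 * np.random.randn(4)
x_denoised = low_pass_filter(A, x, keep=1)
```

With `keep=1` on a connected graph, the output is the projection onto the constant eigenvector, i.e., every node receives the mean of the signal: high-frequency noise is removed entirely.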

2.2. SPECTRAL GRAPH CONVOLUTIONAL NETWORKS

Inspired by the success of convolutional neural networks on images and other Euclidean domains, researchers also started to extend the power of deep learning to graphs. One of the earliest approaches to defining a convolutional operation on graphs used the Graph Fourier Transform, defining the convolution in the spectral domain instead of the original spatial domain (Bruna et al., 2014). Defferrard et al. (2016) proposed ChebyNet, which defines a filter as a Chebyshev polynomial of the diagonal matrix of eigenvalues and can be exactly localized in the k-hop neighborhood. Later, Kipf and Welling (2017) simplified the Chebyshev filters to a first-order polynomial filter, which led to the well-known graph convolutional network. Recently, many new spectral graph filters have been developed. For example, rational auto-regressive moving average (ARMA) graph filters (Isufi et al., 2016; Bianchi et al., 2019) were proposed to enhance the modeling capacity of GNNs. Compared to polynomial filters, ARMA filters are more robust and provide a more flexible graph frequency response. Feedback-looped filters (Wijesinghe & Wang, 2019) further

