RED-GCN: REVISIT THE DEPTH OF GRAPH CONVOLUTIONAL NETWORK

Abstract

Finding the proper depth d of a GNN that provides strong representation power has drawn significant attention, yet it largely remains an open problem for the graph learning community. Although noteworthy progress has been made, the depth, i.e., the number of layers of a GCN, is realized by a series of graph convolution operations, which naturally makes d a positive integer (d ∈ N+). An interesting question is whether breaking the constraint of N+ by making d a real number (d ∈ R) can bring new insights into graph learning mechanisms. In this work, by redefining a GCN's depth d as a trainable parameter continuously adjustable within (-∞, +∞), we open a new door to controlling its expressiveness in graph signal processing to model graph homophily/heterophily (nodes with similar/dissimilar labels/attributes tend to inter-connect). A simple and powerful GCN model, RED-GCN, is proposed that retains the simplicity of GCN while automatically searching for the optimal d without prior knowledge of whether the input graph is homophilic or heterophilic. A negative-valued d intrinsically enables high-pass frequency filtering for graph heterophily. Variants extending the model's flexibility/scalability are also developed. The theoretical feasibility of a real-valued depth with explainable physical meaning is ensured via eigen-decomposition of the graph Laplacian and a properly designed transformation function, from the perspective of functional calculus. Extensive experiments demonstrate the superiority of RED-GCN on node classification tasks across a variety of graphs. Furthermore, by introducing the concept of the eigengraph, a novel graph augmentation method is obtained: the optimal d effectively generates a new topology through a properly weighted combination of eigengraphs, which dramatically boosts performance even for a vanilla GCN.

1. INTRODUCTION

Graph convolutional network (GCN) (Kipf & Welling, 2016; Veličković et al., 2017; Hamilton et al., 2017) has exhibited great power in a variety of graph learning tasks, such as node classification (Kipf & Welling, 2016; Luan et al., 2019; 2022a), link prediction (Zhang & Chen, 2018), community detection (Chen et al., 2020), and many more. Since the representation power of a GCN is largely determined by its depth, i.e., the number of graph convolution layers, tremendous research effort has been devoted to finding the optimal depth that strengthens the model's ability on downstream tasks. Upon increasing the depth, the over-smoothing issue arises: a GCN's performance deteriorates once its depth exceeds a certain threshold (Kipf & Welling, 2016). It is unveiled in (Li et al., 2018) that a graph convolution operation is a special form of Laplacian smoothing (Taubin, 1995). Thus, the similarity between graph node embeddings grows with the depth until these embeddings eventually become indistinguishable. Various techniques have been developed to alleviate this issue, e.g., applying pairwise normalization can keep distant nodes dissimilar (Zhao & Akoglu, 2019), and dropping sampled edges during training slows the growth of embedding smoothness as depth increases (Rong et al., 2019). Beyond the over-smoothing issue caused by large GCN depth, another fundamental phenomenon widely present in real-world graphs is homophily and heterophily. In a homophilic graph, nodes with similar labels or attributes tend to inter-connect, while in a heterophilic graph, connected nodes usually have distinct labels or dissimilar attributes. Most graph neural networks (GNNs) are developed based on the homophily assumption (Yang et al., 2016), while models able to perform well on heterophilic graphs often require special treatment and complex designs (Bianchi et al., 2021; Zhu et al., 2020).
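The over-smoothing effect described above can be reproduced in a few lines. The following toy sketch (not from the paper; the ring graph and spread metric are illustrative choices) repeatedly applies the symmetrically normalized adjacency S = D̃^(-1/2) Ã D̃^(-1/2) to random features and measures how far node embeddings stay from their mean:

```python
import numpy as np

# Toy illustration of over-smoothing: repeatedly applying the
# symmetrically normalized adjacency S = D~^{-1/2} A~ D~^{-1/2}
# smooths node features until rows become nearly indistinguishable.
rng = np.random.default_rng(0)

# A small ring graph of 6 nodes (hypothetical example).
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

A_tilde = A + np.eye(n)                      # A~ = A + I (self-loops)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(1))   # diagonal of D~^{-1/2}
S = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

X = rng.standard_normal((n, 4))              # random node features

def node_spread(H):
    """Mean distance of node embeddings from their average row."""
    return np.linalg.norm(H - H.mean(0), axis=1).mean()

spreads = [node_spread(X)]
H = X
for _ in range(20):                          # 20 propagation steps, no weights
    H = S @ H
    spreads.append(node_spread(H))

# The spread collapses towards zero: the embeddings over-smooth.
print(spreads[0], spreads[-1])
```

On this regular ring the dominant eigenvector of S is constant, so all non-constant components decay geometrically with depth, which is exactly the indistinguishability effect discussed above.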
Despite the achievements made by these methodologies, little correlation has been established between the adopted GNN model's depth and its capability to characterize graph heterophily. For most GNNs, if not all, the depth must be manually set as a hyper-parameter before training, and finding the proper depth usually requires a considerable number of trials or good prior knowledge of the graph dataset. Since the depth represents the number of graph convolution operations and naturally takes only positive integer values, little attention has been paid to the question of whether a non-integer depth is realizable, and if so, whether it is practically meaningful and whether it can bring unique advantages to current graph learning mechanisms. This work revisits the GCN depth from spectral and spatial perspectives and explains the interdependencies between the following key ingredients in graph learning: the depth of a GCN, the spectrum of the graph signal, and the homophily/heterophily of the underlying graph. Firstly, through eigen-decomposition of the symmetrically normalized graph Laplacian, we present the correlation between graph homophily/heterophily and the eigenvector frequencies. Secondly, by introducing the concept of the eigengraph, we show that the graph topology is equivalent to a weighted linear combination of eigengraphs, and that the weight values determine the GCN's capability to capture homophilic/heterophilic graph signals. Thirdly, we reveal that the eigengraph weights can be controlled by the GCN's depth, so an automatically tunable depth parameter is needed to adjust the eigengraph weights into a distribution matching the underlying graph homophily/heterophily. To realize the adaptive GCN depth, we extend its definition from a positive integer to an arbitrary real number, with theoretical feasibility guaranteed by functional calculus (Shah & Okutmuştur, 2020).
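The eigengraph view above can be sketched numerically. The snippet below (an illustration under assumed notation, not the paper's exact transformation function) decomposes the normalized propagation matrix into rank-one eigengraphs, verifies that an integer depth merely reweights them by powers of the eigenvalues, and shows that a real-valued, even negative, depth is obtainable by powering a shifted spectrum:

```python
import numpy as np

# Sketch: the propagation matrix S = D~^{-1/2} A~ D~^{-1/2} is symmetric,
# so S = U diag(lam) U^T = sum_i lam_i * u_i u_i^T, where each rank-one
# term u_i u_i^T is an "eigengraph". Raising S to the power d reweights
# eigengraph i by lam_i**d, which is what a real-valued depth exploits.
rng = np.random.default_rng(1)

n = 8
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T                                  # random undirected graph
A_tilde = A + np.eye(n)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(1))
S = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

lam, U = np.linalg.eigh(S)                   # S is symmetric

# S equals the lam-weighted sum of its eigengraphs u_i u_i^T.
S_rebuilt = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(n))

# An integer depth d = 3 reweights eigengraphs by lam_i**3 ...
S3 = U @ np.diag(lam**3) @ U.T

# ... and nothing prevents a real (here negative) d: small-|lam|
# high-frequency eigengraphs are amplified, a high-pass behaviour.
# A shift keeps eigenvalues positive before the fractional power;
# the paper's transformation function handles this more carefully.
lam_shifted = lam - lam.min() + 1e-3         # strictly positive spectrum
S_real_depth = U @ np.diag(lam_shifted ** -0.5) @ U.T
```

Under this view, choosing d amounts to choosing one spectral reweighting of the eigengraphs, which is why a single trainable scalar can act as a low-pass or high-pass filter.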
With a trainable depth parameter, we propose a simple and powerful model, Redefined Depth-GCN (RED-GCN), with two variants. Extensive experiments demonstrate its ability to automatically search for the optimal depth, and we find that a negative-valued depth plays the key role in handling heterophilic graphs. A systematic investigation of the optimal depth is conducted in both the spectral and spatial domains, which in turn inspires the development of a novel graph augmentation methodology. With clear geometric explainability, the augmented graph structure offers significant advantages over the raw input topology, especially for graphs with heterophily. The main contributions of this paper are summarized as follows:
• The interdependence between negative GCN depth and graph heterophily is discovered; in-depth geometric and spectral explanations are presented.
• A novel problem of automatic GCN depth tuning for graph homophily/heterophily detection is formulated. To the best of our knowledge, this work presents the first attempt to make a GCN's depth trainable by redefining it on the real number domain.
• A simple and powerful model, RED-GCN, with two variants (RED-GCN-S and RED-GCN-D) is proposed, and a novel graph augmentation method is discussed.
• Our model achieves superior performance on semi-supervised node classification tasks on 11 graph datasets.

2. PRELIMINARIES

Notations. We use bold uppercase letters for matrices (e.g., A), bold lowercase letters for column vectors (e.g., u), and lowercase letters for scalars (e.g., α). The superscript ⊤ denotes the transpose of matrices and vectors (e.g., A⊤ and u⊤). An attributed undirected graph G = {A, X} contains an adjacency matrix A ∈ R^(n×n) and an attribute matrix X ∈ R^(n×q), where n is the number of nodes and q the dimension of node attributes. D denotes the diagonal degree matrix of A. The adjacency matrix with self-loops is given by Ã = A + I (I is the identity matrix), and all variables derived from Ã are decorated with the symbol ˜, e.g., D̃ represents the diagonal degree matrix of Ã. M^d stands for the d-th power of matrix M, while the parameter and node embedding matrices in the d-th layer of a GCN are denoted by W^(d) and H^(d).

Graph convolutional network (GCN) and simplified graph convolutional network (SGC). The layer-wise message passing and aggregation of GCN (Kipf & Welling, 2016) is given by H^(d+1) = σ(D̃^(-1/2) Ã D̃^(-1/2) H^(d) W^(d)),
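The layer update above can be written as a minimal numpy sketch (an illustration, not the reference implementation; ReLU is assumed as σ and the inputs are arbitrary):

```python
import numpy as np

# One GCN layer, H^(d+1) = sigma(D~^{-1/2} A~ D~^{-1/2} H^(d) W^(d)),
# following the notation above, with ReLU as the nonlinearity sigma.
def gcn_layer(A, H, W):
    A_tilde = A + np.eye(A.shape[0])             # A~ = A + I
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(1))   # diagonal of D~^{-1/2}
    S = A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(S @ H @ W, 0.0)            # ReLU(S H W)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)           # 3-node path graph
H0 = rng.standard_normal((3, 5))                 # initial features X
W0 = rng.standard_normal((5, 2))                 # layer-0 weights
H1 = gcn_layer(A, H0, W0)
print(H1.shape)                                  # (3, 2)
```

Stacking d such layers applies the normalized propagation matrix d times, which is exactly why the standard formulation forces d ∈ N+.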

