SIMPLE SPECTRAL GRAPH CONVOLUTION FROM AN OPTIMIZATION PERSPECTIVE

Abstract

Recent studies on SGC, PageRank, and S²GC have demonstrated that several graph diffusion techniques are straightforward, fast, and effective for tasks in the graph domain such as node classification. Even though these techniques do not require labels, they can nevertheless produce more discriminative features than raw attributes for downstream tasks with different classifiers. These methods are data-independent and thus rely primarily on empirical parameters of polynomial bases (e.g., monomial and Chebyshev), which ignore the homophily of graphs and the attribute distribution. Because of their low-pass filtering, they perform poorly on heterophilous graphs. Although many approaches focus on GNNs for heterophilous graphs, these approaches depend on label information to learn model parameters. In this paper, we study the question: are labels a necessity for GNNs on heterophilous graphs? Motivated by this question, we propose a framework of self-representation on graphs formulated as a least squares problem. Specifically, we use the Generalized Minimal RESidual (GMRES) method, which finds the least squares solution over Krylov subspaces. Our theoretical analysis shows that graph convolution yields better features even without label information. Like previous data-independent methods, the proposed method is not a deep model and is therefore fast, scalable, and simple. We also establish performance guarantees for models on real and synthetic data. Empirically, on a benchmark of real-world datasets, our method is competitive with existing deep models for node classification.

1. INTRODUCTION

With the development of deep learning, CNNs have been widely used in many applications. A convolutional neural network (CNN) exploits the shift-invariance, local connectivity, and compositionality of image data. As a result, CNNs extract meaningful local features for various image-related problems. Although CNNs effectively capture hidden patterns on Euclidean grids, a growing number of applications represent data in non-Euclidean form, e.g., in the graph domain. GNNs redefine convolution on graphs in two different ways: spatial and spectral. Spatial-based methods decompose the convolution operation into an aggregation function and a transformation function. The aggregation function aggregates neighbourhood node information, e.g., by a mean function, which is somewhat similar to the box filter in traditional image processing. Representative methods in this category are Message Passing Neural Networks (MPNN) (Gilmer et al., 2017), GraphSAGE (Hamilton et al., 2017), GAT (Veličković et al., 2017), etc. Spectral methods are based on the Graph Fourier Transform (GFT). They learn a filtering function on the eigenvalues (or a graph kernel, heat kernel, etc.). These methods usually use approximations to reduce computation; e.g., Chebyshev and monomial polynomials are used by ChebNet (Defferrard et al., 2016), GDC (Klicpera et al., 2019), SGC (Wu et al., 2019), and S²GC (Zhu & Koniusz, 2021). Although spatial and spectral methods effectively extend the convolution operator to the graph domain, they usually suffer from oversmoothing on heterophilous graphs because they follow the homophily assumption, which severely affects the node classification task, as shown in Figure 1. However, graphs are not always homophilic: they can show the opposite property in some connected node groups.
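To make the contrast with learned spectral filters concrete, the parameter-free propagation behind SGC-style methods can be sketched as below. This is an illustrative implementation, not the paper's code: the function name `sgc_features` and the dense-matrix setup are our own choices; SGC simply applies K hops of the symmetrically normalized adjacency (with self-loops) to the raw attributes, with no labels and no learned weights.

```python
import numpy as np

def sgc_features(adj, feats, k=2):
    """SGC-style low-pass filtering: propagate raw features through
    k hops of S = D^{-1/2} (A + I) D^{-1/2}, with no learned parameters."""
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    s = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # normalize
    out = feats
    for _ in range(k):
        out = s @ out                                       # k-hop smoothing
    return out
```

Because every hop averages over neighbourhoods, repeated application acts as a low-pass filter: useful when neighbours share labels (homophily), harmful when they do not.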
This makes it harder for existing homophilic GNNs to learn from general graph-structured data, leading to a significant drop in performance on heterophilous graphs. Many GNNs have been designed for graphs with heterophily. Their motivation mainly focuses on improving feature propagation and feature transformation. Non-local neighbor extension is usually used to incorporate high-order neighbor information (Abu-El-Haija et al., 2019; Zhu et al., 2020; Jin et al., 2021) or to discover potential neighbours (Liu et al., 2021; Yang et al., 2021; Zheng et al., 2022). Adaptive message aggregation is a good way to reduce the influence of heterophilous edges (Veličković et al., 2017; Suresh et al., 2021). Inter-layer combination provides a more flexible way to learn graph convolution (Xu et al., 2018; Zhu et al., 2020; Chien et al., 2021). However, all of these approaches are designed for semi-supervised node classification, which is usually transductive (labels are available for training). In this paper, we first review the connection between GNNs and Label Propagation (LP) with Laplacian regularization (Zhou et al., 2003). The closed-form solution depends only on a parameter balancing smoothing and fitting error. This results in low-pass filtering methods for homophilous graphs, such as PageRank and S²GC, which cannot work well on heterophilous graphs. Based on the Taylor expansion of the closed-form solution, we reformulate label propagation with Laplacian regularization as residual minimization in a Krylov subspace. We further generalize this residual minimization into a more general polynomial approximation, and then discuss other possible bases such as Chebyshev polynomials. In our theoretical analysis, we explore whether high-order (second-order in this paper) or multi-scale graph convolutions can improve performance given raw attributes without labels. In experiments with synthetic data, we show performance in line with our theoretical expectations.
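The reformulation above can be illustrated with off-the-shelf tools. The LP closed form is F = (I - αS)^{-1} X, whose Taylor (Neumann) expansion Σ_k (αS)^k X is exactly the truncated-diffusion view of PageRank-style filters; a Krylov-subspace solver such as GMRES instead minimizes the residual of the same linear system directly. The sketch below is ours, under assumed names (`lp_closed_form_gmres`, parameter `alpha`), and uses SciPy's generic GMRES rather than the paper's specific formulation:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def lp_closed_form_gmres(s, feats, alpha=0.9, restart=20):
    """Solve F = (I - alpha*S)^{-1} X column by column with GMRES,
    i.e. minimize the residual over a Krylov subspace instead of
    truncating the Neumann series sum_k (alpha*S)^k X."""
    n = s.shape[0]
    # matrix-free operator v -> (I - alpha*S) v; S may be sparse
    op = LinearOperator((n, n), matvec=lambda v: v - alpha * (s @ v))
    cols = []
    for j in range(feats.shape[1]):
        x, info = gmres(op, feats[:, j], restart=restart, atol=1e-8)
        assert info == 0, "GMRES did not converge"
        cols.append(x)
    return np.stack(cols, axis=1)
```

The matrix-free `LinearOperator` keeps the cost at one sparse matrix-vector product per Krylov iteration, which is what makes this style of method fast and scalable.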
On real-world benchmarks, our method is competitive with other graph convolution techniques on homophilous graphs and outperforms them (including some GNN methods with transductive learning) on heterophilous graphs. Our contributions are: 1.) We reveal that labels are not necessary for graph neural networks on heterophilous graphs. Linear graph convolution is powerful on both heterophilous and homophilous graphs, and outperforms GNNs designed for heterophilous graphs on semi-supervised node classification. 2.) We propose a framework of feature (or label) propagation by parameterizing spectral graph convolution as residual minimization in a Krylov subspace. We further reformulate the residual minimization problem as polynomial approximation, which can yield Chebyshev and Bernstein bases to overcome the Runge phenomenon. 3.) In theory, we prove that second-order graph convolution is better than first-order graph convolution on heterophilous graphs, and that multi-scale (first- and second-order) convolution can provide better results for some combinations of parameters. 4.) Compared with label-dependent GNNs under heterophily, our method is competitive on real-world benchmarks, and it outperforms other low-pass graph convolutions without learning.
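Contribution 2 relies on expanding a spectral filter in a better-behaved basis. As a hedged illustration (our own sketch, not the paper's implementation), the Chebyshev basis terms on the rescaled Laplacian can be generated with the standard three-term recurrence T_0(x)=1, T_1(x)=x, T_k(x)=2x·T_{k-1}(x)-T_{k-2}(x); their bounded oscillation on [-1, 1] is what sidesteps the Runge phenomenon of the monomial basis:

```python
import numpy as np

def chebyshev_features(lap, feats, k=3):
    """Generate Chebyshev basis terms T_j(L_tilde) @ feats for j = 0..k,
    where L_tilde = 2L/lmax - I rescales the Laplacian spectrum to [-1, 1]."""
    n = lap.shape[0]
    lmax = np.linalg.eigvalsh(lap).max()        # largest eigenvalue of L
    l_tilde = 2.0 * lap / lmax - np.eye(n)
    t_prev, t_curr = feats, l_tilde @ feats     # T_0 X, T_1 X
    terms = [t_prev, t_curr]
    for _ in range(2, k + 1):
        # three-term recurrence: T_k = 2 * L_tilde * T_{k-1} - T_{k-2}
        t_prev, t_curr = t_curr, 2.0 * (l_tilde @ t_curr) - t_prev
        terms.append(t_curr)
    return terms  # filter = weighted sum of these terms; weights are task-dependent
```

Any polynomial filter of degree k is then a weighted combination of these terms, so swapping the basis changes the conditioning of the approximation problem without changing its expressive power.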



Figure 1: Results on the contextual SBM using SGC, S²GC, and PR (PageRank) with number of hops K = 2, 8. 'Raw' shows the error when no filtering method is applied. All methods only work well on homophilous networks.

