ADAPTIVE STACKED GRAPH FILTER

Abstract

We study Graph Convolutional Networks (GCNs) from the graph signal processing viewpoint by addressing a difference between learning graph filters with fully-connected weights versus trainable polynomial coefficients. We find that by stacking graph filters with learnable polynomial parameters, we can build a highly adaptive and robust vertex classification model. Our treatment relaxes the low-frequency (or, equivalently, high-homophily) assumption made by existing vertex classification models, resulting in a solution that applies across the spectral properties of graphs. Empirically, using only one hyper-parameter setting, our model achieves strong results on most benchmark datasets across the frequency spectrum.

1. INTRODUCTION

The semi-supervised vertex classification problem (Weston et al., 2012; Yang et al., 2016) in attributed graphs has become one of the most fundamental machine learning problems in recent years. This problem is often associated with its most popular recent solution, Graph Convolutional Networks (Kipf & Welling, 2017). Since the GCN proposal, there has been a vast amount of research on improving its scalability (Hamilton et al., 2017; Chen et al., 2018; Wu et al., 2019) as well as its performance (Liao et al., 2019; Li et al., 2019; Pei et al., 2020).

Existing vertex classification models often (implicitly) assume that the graph has high vertex homophily (Pei et al., 2020), or equivalently, a low-frequency property (Li et al., 2019; Wu et al., 2019); see Section 2.1 for graph frequency. However, this assumption does not hold in general. For instance, consider the Wisconsin dataset (Table 1), which captures a network of students, faculty, staff, courses, and projects. These categories naturally exhibit different frequency patterns¹: connections between people are often low-frequency, while connections between topics and projects are often midrange. The problem becomes apparent in the low accuracies that GCN-like models obtain on this dataset; see, for example, (Pei et al., 2020; Chen et al., 2020b; Liu et al., 2020).

This paper aims at establishing a GCN model for the vertex classification problem (Definition 1) that does not rely on any frequency assumption. Such a model can be applied to a wide range of datasets without hyper-parameter tuning for the graph structure.

Contributions. By observing the relation between label frequency and the performance of existing GCN-like models, we propose to learn the graph filter coefficients directly rather than learning the MLP part of a GCN-like layer. We use filter stacking to implement a trainable graph filter that is capable of learning any filter function.
Our stacked filter construction with novel learnable filter parameters is easy to implement, sufficiently expressive, and less sensitive to the polynomial degree of the filters. Using only one hyper-parameter setting, we show that our model is more adaptive than existing work on a wide range of benchmark datasets.

The rest of our paper is organized as follows. Section 2 introduces notation and analytical tools. Section 3 provides insights into the vertex classification problem and the motivation for our model's design. Section 4 presents an implementation of our model. Section 5 summarizes related literature with a focus on graph filters and state-of-the-art models. Section 6 compares our model with other existing methods empirically. We also provide additional experimental results in Appendix A.
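To make the filter-stacking idea concrete, the following is a minimal NumPy sketch, not the paper's implementation (which appears in Section 4). It applies a low-order polynomial filter h(S) = Σ_k θ_k S^k of a normalized graph operator S to a feature matrix X, and composes several such filters; all function names and the choice of S are illustrative assumptions.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalized adjacency with self-loops:
    S = D^{-1/2} (A + I) D^{-1/2} (a common choice of graph operator)."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def polynomial_filter(S, X, theta):
    """Compute h(S) X = sum_k theta[k] * S^k X by repeated propagation,
    avoiding explicit matrix powers."""
    out = theta[0] * X
    prop = X
    for coef in theta[1:]:
        prop = S @ prop          # one more propagation step: S^k X
        out = out + coef * prop
    return out

def stacked_filter(S, X, thetas):
    """Compose several low-order polynomial filters; in a trainable model
    each coefficient vector in `thetas` would be a learnable parameter."""
    H = X
    for theta in thetas:
        H = polynomial_filter(S, H, theta)
    return H
```

Stacking keeps each factor low-degree while the composition can realize a high-degree polynomial response, which is one way a stacked construction can remain expressive yet less sensitive to any single filter's degree.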



¹ "Frequency" is an equivalent concept to "homophily" and is explained in Section 2.1.

