DISTRIBUTIONAL SIGNALS FOR NODE CLASSIFICATION IN GRAPH NEURAL NETWORKS

Abstract

In graph neural networks (GNNs), both node features and labels are examples of graph signals, a key notion in graph signal processing (GSP). While it is common in GSP to impose signal smoothness constraints in learning and estimation tasks, it is unclear how this can be done for discrete node labels. We bridge this gap by introducing the concept of distributional graph signals. In our framework, we work with the distributions of node labels instead of their values, and we propose notions of smoothness and non-uniformity for such distributional graph signals. We then propose a general regularization method for GNNs that encodes distributional smoothness and non-uniformity of the model output in semi-supervised node classification tasks. Numerical experiments demonstrate that our method can significantly improve the performance of most base GNN models in different problem settings.

1. INTRODUCTION

We consider the semi-supervised node classification problem (Kipf & Welling, 2017), in which class labels of nodes in a graph are to be determined given sample observations and possibly node features. Numerous graph neural network (GNN) models have been proposed to tackle this problem. One of the first is the graph convolutional network (GCN) (Kipf & Welling, 2017). Interpreted geometrically, a GCN aggregates information, such as node features, from the neighborhood of each node of the graph. Algebraically, this process is equivalent to applying a graph convolution filter to the node feature vectors. Many GNN models with different design considerations have since been introduced. Popular examples include the graph attention network (GAT) (Veličković et al., 2018), which learns weights between pairs of nodes during aggregation, and the hyperbolic graph convolutional neural network (HGCN) (Chami et al., 2019), which embeds the nodes of a graph in a hyperbolic rather than a Euclidean space. For inductive learning, GraphSAGE (Hamilton et al., 2017) generates low-dimensional vector representations of nodes, which is useful for graphs with rich node attribute information.

While newer models draw inspiration from GCN, GCN itself is built upon the foundation of graph signal processing (GSP), a signal processing framework for graph-structured data (Shuman et al., 2013; Ortega et al., 2018; Ji & Tay, 2019). A graph signal is a vector with each component corresponding to a node of a graph; examples include node features and node labels. Moreover, the convolutions used in models such as GCN are special cases of convolution filters in GSP (Shuman et al., 2013). These facts highlight the close connection between GSP theory and GNNs. In GSP, signal smoothness (over the graph) is widely used to regularize inference tasks. Intuitively, a signal is smooth if its values are similar at each pair of nodes connected by an edge.
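The neighborhood-aggregation view of a GCN layer described above can be sketched as follows. This is a minimal NumPy illustration of the standard propagation rule ReLU(D^{-1/2}(A+I)D^{-1/2} X W); the toy graph, features, and weights are our own choices for illustration, not from the cited papers.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN-style propagation step: symmetrically normalize the
    adjacency matrix (with self-loops), then aggregate neighbor features."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # D^{-1/2}(A+I)D^{-1/2}
    return np.maximum(A_norm @ X @ W, 0.0)    # ReLU activation

# toy graph: a path on 3 nodes
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.eye(3)          # one-hot node features
W = np.ones((3, 2))    # toy weight matrix
H = gcn_layer(A, X, W)
print(H.shape)  # (3, 2): one 2-dimensional representation per node
```

Each output row mixes a node's own features with those of its neighbors, which is exactly the geometric "aggregation" picture; applying `gcn_layer` repeatedly widens the receptive field by one hop per layer.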
One popular way to formally define signal smoothness is via the Laplacian quadratic form. Numerous GSP tools leverage a smoothness prior on graph signals. For example, Laplacian (Tikhonov) regularization is used for noise removal in Shuman et al. (2013) and for signal interpolation in Narang et al. (2013). In Chen et al. (2015), it is used for graph signal inpainting and anomaly detection, and in Kalofolias (2016), the same technique is applied to graph topology inference.
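The Laplacian quadratic form and Tikhonov-regularized denoising mentioned above can be illustrated concretely. The sketch below uses the combinatorial Laplacian L = D − A, for which x^T L x equals the sum of (x_i − x_j)^2 over edges, and the closed-form denoiser x = (I + γL)^{-1} y that minimizes ||x − y||^2 + γ x^T L x; the triangle graph and signal values are illustrative only.

```python
import numpy as np

def laplacian(A):
    """Combinatorial graph Laplacian L = D - A."""
    return np.diag(A.sum(axis=1)) - A

def quadratic_form(A, x):
    """Smoothness measure x^T L x = sum over edges (i,j) of (x_i - x_j)^2."""
    return x @ laplacian(A) @ x

def tikhonov_denoise(A, y, gamma):
    """Laplacian (Tikhonov) regularization:
    argmin_x ||x - y||^2 + gamma * x^T L x  has the closed form
    x = (I + gamma * L)^{-1} y."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) + gamma * laplacian(A), y)

A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])      # triangle graph
y = np.array([1.0, 0.0, -1.0])   # observed (noisy) graph signal
x = tikhonov_denoise(A, y, gamma=10.0)
# the regularized estimate is smoother over the graph than the observation
print(quadratic_form(A, x) < quadratic_form(A, y))  # True
```

Larger γ pulls the estimate toward a constant signal (the null space of L), which is the precise sense in which the quadratic form penalizes disagreement across edges. For discrete node labels no such quadratic penalty applies directly, which is the gap the distributional framework of this paper addresses.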

However, for GNNs, Yang et al. (2021, Section 4.1.2) remark that "graph Laplacian regularization can hardly provide extra information that existing GNNs cannot capture". They therefore propose a regularization scheme based on feature propagation, which is demonstrated to be effective in comparison with other methods such as Feng et al. (2021) and Deng & Zhu (2019) based on

