EFFECTS OF GRAPH CONVOLUTIONS IN MULTI-LAYER NETWORKS

Abstract

Graph Convolutional Networks (GCNs) are one of the most popular architectures that are used to solve classification problems accompanied by graphical information. We present a rigorous theoretical understanding of the effects of graph convolutions in multi-layer networks. We study these effects through the node classification problem of a non-linearly separable Gaussian mixture model coupled with a stochastic block model. First, we show that a single graph convolution expands the regime of the distance between the means where multi-layer networks can classify the data by a factor of at least 1/deg^{1/4}, where deg denotes the expected degree of a node. Second, we show that with a slightly stronger graph density, two graph convolutions improve this factor to at least 1/n^{1/4}, where n is the number of nodes in the graph. Finally, we provide both theoretical and empirical insights into the performance of graph convolutions placed in different combinations among the layers of a neural network, concluding that performance is similar across all placements. We present extensive experiments on both synthetic and real-world data that illustrate our results.

1. INTRODUCTION

A large amount of interesting data and the practical challenges associated with them are defined in the setting where entities have attributes as well as information about mutual relationships. Traditional classification models have been extended to capture such relational information through graphs (Hamilton, 2020), where each node has individual attributes and the edges of the graph capture the relationships among the nodes. A variety of applications characterized by this type of graph-structured data include works in the areas of social analysis (Backstrom & Leskovec, 2011), recommendation systems (Ying et al., 2018), computer vision (Monti et al., 2017), study of the properties of chemical compounds (Gilmer et al., 2017; Scarselli et al., 2009), statistical physics (Bapst et al., 2020; Battaglia et al., 2016), and financial forensics (Zhang et al., 2017; Weber et al., 2019). The most popular learning models for relational data use graph convolutions (Kipf & Welling, 2017), where the idea is to aggregate the attributes of the neighbours of a node instead of utilizing only its own attributes. Despite several empirical studies of various GCN-type models (Chen et al., 2019; Ma et al., 2022) that demonstrate an improvement over traditional classification methods such as MLPs, there has been limited progress in the theoretical understanding of how graph convolutions in multi-layer networks benefit node classification tasks.

Related work. The capacity of a graph convolution in one-layer networks is studied in Baranwal et al. (2021), along with its out-of-distribution (OoD) generalization potential. A more recent work (Wu et al., 2022) formulates the node-level OoD problem and develops a learning method that enables GNNs to leverage invariance principles for prediction. In Gasteiger et al. (2019), the authors utilize a propagation scheme based on personalized PageRank to construct a model that outperforms several GCN-like methods for semi-supervised classification. Through their algorithm, APPNP, they show that placing power iterations at the last layer of an MLP achieves state-of-the-art performance. Our results align with this observation.
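The aggregation step underlying a graph convolution, as described above, can be sketched as follows. This is a minimal illustration using mean aggregation with self-loops; the function and variable names are ours, and the Kipf & Welling (2017) formulation additionally applies the symmetric normalization D^{-1/2}(A+I)D^{-1/2} followed by a learned linear map.

```python
import numpy as np

def graph_convolution(A, X):
    """One graph convolution step: replace each node's features with the
    degree-normalized average over itself and its neighbours,
    i.e. D^{-1}(A + I)X with mean aggregation (a simplified sketch)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # degrees including self-loop
    return (A_hat @ X) / deg                # averaged neighbourhood features

# Tiny example: 3 nodes on a path graph, one scalar feature per node.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[0.0], [3.0], [6.0]])
H = graph_convolution(A, X)
# H mixes each node's feature with its neighbours',
# pulling node features toward their local neighbourhood means.
```

Stacking such steps (or placing them at different layers of an MLP, as studied in this paper) corresponds to averaging over progressively larger neighbourhoods.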

