FOSR: FIRST-ORDER SPECTRAL REWIRING FOR ADDRESSING OVERSQUASHING IN GNNS

Abstract

Graph neural networks (GNNs) are able to leverage the structure of graph data by passing messages along the edges of the graph. While this allows GNNs to learn features depending on the graph structure, for certain graph topologies it leads to inefficient information propagation and a problem known as oversquashing. This has recently been linked with the curvature and spectral gap of the graph. On the other hand, adding edges to the message-passing graph can lead to increasingly similar node representations and a problem known as oversmoothing. We propose a computationally efficient algorithm that prevents oversquashing by systematically adding edges to the graph based on spectral expansion. We combine this with a relational architecture, which lets the GNN preserve the original graph structure and provably prevents oversmoothing. We find experimentally that our algorithm outperforms existing graph rewiring methods in several graph classification tasks.

1. INTRODUCTION

Graph neural networks (GNNs) (Gori et al., 2005; Scarselli et al., 2008) are a broad class of models which process graph-structured data by passing messages between nodes of the graph. Due to the versatility of graphs, GNNs have been applied to a variety of domains, such as chemistry, social networks, knowledge graphs, and recommendation systems (Zhou et al., 2020; Wu et al., 2020). GNNs broadly follow a message-passing framework, meaning that each layer of the GNN aggregates the representations of a node and its neighbors, and transforms these features into a new representation for that node. The aggregation function used by the GNN layer is taken to be locally permutation-invariant, since the ordering of the neighbors of a node is arbitrary, and its specific form is a key component of the GNN architecture; varying it gives rise to several common GNN variants (Kipf and Welling, 2017; Veličković et al., 2018; Li et al., 2015; Hamilton et al., 2017; Xu et al., 2019). The output of a GNN can be used for tasks such as graph classification or node classification.

Although GNNs are successful in computing dependencies between nodes of a graph, they have been found to suffer from a limited capacity to capture long-range interactions. For a fixed graph, this is caused by a variety of problems depending on the number of layers in the GNN. Since graph convolutions are local operations, a GNN with a small number of layers can only provide a node with information from nodes close to itself. For a GNN with l layers, the receptive field of a node (the set of nodes it receives messages from) is exactly the ball of radius l about the node. For small values of l, this results in "underreaching", which directly limits the functions the GNN can represent. On a related note, the functions representable by GNNs with l layers are limited to those computable by l steps of the Weisfeiler-Lehman (WL) graph isomorphism test (Morris et al., 2019; Xu et al., 2019; Barceló et al., 2020).
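To make the l-hop receptive field concrete, the following self-contained Python sketch (the function and variable names are ours, not from the paper) computes the ball of radius l around a node with a depth-limited breadth-first search:

```python
from collections import deque

def receptive_field(adj, source, num_layers):
    """Nodes reachable from `source` within `num_layers` hops.

    After l rounds of message passing, a node's representation can only
    depend on features of nodes in this ball of radius l. `adj` maps each
    node to a list of its neighbors (undirected, unweighted graph).
    """
    depth = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if depth[u] == num_layers:
            continue  # do not expand past the hop limit
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return set(depth)

# Path graph 0-1-2-3-4: with 2 layers, node 0 receives no information
# from nodes 3 and 4, no matter how expressive each layer is.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(receptive_field(path, 0, 2))  # {0, 1, 2}
```

On this path graph, nodes 0 and 4 cannot interact until the number of layers reaches their distance of 4, which is exactly the "underreaching" limitation described above.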
On the other hand, increasing the number of layers leads to its own set of problems. In contrast to other architectures that benefit from the expressivity of deeper networks, GNNs experience a decrease in accuracy as the number of layers increases (Li et al., 2018; Chen et al., 2020). This phenomenon has partly been attributed to "oversmoothing", where repeated graph convolutions eventually render node features indistinguishable (Li et al., 2018; Oono and Suzuki, 2020; Cai and Wang, 2020; Zhao and Akoglu, 2020; Rong et al., 2020; Di Giovanni et al., 2022).

Separate from oversmoothing is the problem of "oversquashing", first pointed out by Alon and Yahav (2021). As the number of layers of a GNN increases, information from (potentially) exponentially growing receptive fields needs to be concurrently propagated at each message-passing step. This leads

