AFFINITY-AWARE GRAPH NETWORKS

Abstract

Graph Neural Networks (GNNs) have emerged as a powerful technique for learning on relational data. Owing to the relatively limited number of message-passing steps they perform, and hence their smaller receptive field, there has been significant interest in improving their expressivity by incorporating structural aspects of the underlying graph. In this paper, we explore the use of affinity measures as features in graph neural networks, in particular measures arising from random walks, including effective resistance, hitting times, and commute times. We propose message passing networks based on these features and evaluate their performance on a variety of node and graph property prediction tasks. Our architecture has low computational complexity, and our features are invariant to permutations of the underlying graph. The measures we compute allow the network to exploit the connectivity properties of the graph, enabling us to outperform relevant benchmarks on a wide variety of tasks, often with significantly fewer message-passing steps. On one of the largest publicly available graph regression datasets, OGB-LSC-PCQM4Mv1, we obtain the best known single-model validation MAE at the time of writing.



Despite the predictive power of GNNs, it is known that the expressive power of standard GNNs is limited by the 1-Weisfeiler-Lehman (1-WL) test Xu et al. (2018). Intuitively, GNNs possess at most the same power in terms of distinguishing between non-isomorphic (sub-)graphs, while having the added benefit of adapting to the given data distribution. For some architectures, two nodes with different local structures have the same computational graph, thus thwarting distinguishability in a standard GNN. Even though some attempts have been made to address this limitation with higher-order GNNs Morris et al. (2019), most traditional GNN architectures fail to distinguish between such nodes. A common approach to improving the expressive power of GNNs involves encoding richer structural/positional properties. For example, distance-based approaches form the basis for works such as Position-aware Graph Neural Networks You et al. (2019), which capture positions/locations of nodes with respect to a set of anchor nodes, as well as Distance Encoding Networks Li et al. (2020), which use the first few powers of the normalized adjacency matrix as node features associated with a set of target nodes. Here, we take an approach that is inspired by this line of work but departs from it in some crucial ways: we seek to capture both distance and connectivity information using general-purpose node and edge features, without the need to specify any anchor or target nodes.
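A minimal illustration of this limitation (the helper functions below are our own sketch, not part of any architecture discussed in the paper): 1-WL colour refinement cannot distinguish a 6-cycle from two disjoint triangles, since both graphs are 2-regular, whereas effective resistance, one of the affinity measures considered later, separates them immediately.

```python
import numpy as np

def to_adj(edges, n):
    """Adjacency lists from an undirected edge list."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj

def wl_colors(adj, rounds=3):
    """1-WL colour refinement: repeatedly recolour each node by its
    current colour plus the multiset of its neighbours' colours."""
    n = len(adj)
    colors = [0] * n
    for _ in range(rounds):
        sigs = [(colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in range(n)]
        palette = {s: i for i, s in enumerate(sorted(set(sigs)))}
        colors = [palette[s] for s in sigs]
    return sorted(colors)  # colour histogram, up to relabelling

def effective_resistance(edges, n, u, v):
    """R_eff(u, v) = (e_u - e_v)^T L^+ (e_u - e_v), where L^+ is the
    Moore-Penrose pseudoinverse of the graph Laplacian."""
    L = np.zeros((n, n))
    for a, b in edges:
        L[a, a] += 1; L[b, b] += 1
        L[a, b] -= 1; L[b, a] -= 1
    d = np.zeros(n)
    d[u], d[v] = 1.0, -1.0
    return float(d @ np.linalg.pinv(L) @ d)

# C6 (one 6-cycle) vs. 2 x C3 (two triangles): both 2-regular on 6 nodes.
c6_edges = [(i, (i + 1) % 6) for i in range(6)]
tri_edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]

# 1-WL produces identical colour histograms -> indistinguishable.
assert wl_colors(to_adj(c6_edges, 6)) == wl_colors(to_adj(tri_edges, 6))

# Effective resistance across an edge differs: 5/6 on C6, 2/3 on a triangle.
r_cycle = effective_resistance(c6_edges, 6, 0, 1)
r_tri = effective_resistance(tri_edges, 6, 0, 1)
```

The resistance values follow from series-parallel reduction: an edge of a 6-cycle is a unit resistor in parallel with a path of five, giving 5/6, while a triangle edge is a unit resistor in parallel with a path of two, giving 2/3.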

Contributions:

We propose the use of affinity metrics as features in a graph neural network. Specifically, we consider statistics that arise from random walks in graphs, such as hitting time and commute time between pairs of vertices (see Sections 3.1 and 3.2). We present a means of incorporating these statistics as scalar edge features in a message passing neural network (MPNN) Gilmer et al. (2017) (see Section 3.4). In addition to these scalar features, we present richer vector-valued resistive embeddings (see Section 3.3), which can be incorporated as node or edge feature vectors in the network. Resistive embeddings are a natural way of embedding each node into Euclidean space such that the squared L2 distance between nodes recovers the commute time. We show that such embeddings can be efficiently approximated, even for larger graphs, using sketching and dimensionality reduction techniques (see Section 4). Moreover, we evaluate our networks on a number of benchmark datasets of diverse scales (see Section 5). First, we show that our networks outperform other baselines on the PNA dataset Corso et al. (2020), which includes 6 node and graph algorithmic tasks, demonstrating the ability of affinity measures to exploit structural properties of graphs. We also evaluate performance on a number of graph and node tasks for datasets in the Open Graph Benchmark (OGB) collection Hu et al. (2020), including molecular and citation graphs. In particular, our networks with scalar effective resistance edge features achieve the state of the art on the OGB-LSC PCQM4Mv1 dataset, which was featured in a KDD Cup 2021 competition for large-scale graph representation learning. Finally, we provide intuition for why affinity-based measures are fundamentally different from the aforementioned distance-based approaches (see Section 3.5) and bolster this intuition with detailed theoretical and empirical analysis (see Appendix D) showing favorable results for affinity-based measures.
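The quantities above can be sketched in a few lines of linear algebra. The following is a minimal dense-matrix illustration (names and tolerances are ours, and the eigendecomposition route is just one way to realize the embedding): an embedding X with X Xᵀ = L⁺ satisfies ‖X[u] − X[v]‖² = R_eff(u, v), the commute time is C(u, v) = 2m · R_eff(u, v) for a graph with m edges, and a random projection in the spirit of Johnson-Lindenstrauss reduces the embedding dimension while approximately preserving these distances.

```python
import numpy as np

def laplacian(edges, n):
    """Unweighted graph Laplacian L = D - A."""
    L = np.zeros((n, n))
    for a, b in edges:
        L[a, a] += 1; L[b, b] += 1
        L[a, b] -= 1; L[b, a] -= 1
    return L

def resistive_embedding(edges, n, tol=1e-9):
    """Rows of X embed nodes so that ||X[u] - X[v]||^2 = R_eff(u, v).
    Construction: X = V diag(lambda^{-1/2}) over the nonzero Laplacian
    spectrum, so that X X^T equals the pseudoinverse L^+."""
    w, V = np.linalg.eigh(laplacian(edges, n))
    inv_sqrt = np.array([1.0 / np.sqrt(x) if x > tol else 0.0 for x in w])
    return V * inv_sqrt  # scales column j of V by inv_sqrt[j]

# Path graph 0 - 1 - 2: unit resistors in series.
edges = [(0, 1), (1, 2)]
X = resistive_embedding(edges, 3)

R02 = float(np.sum((X[0] - X[2]) ** 2))  # effective resistance: 2.0
commute_02 = 2 * len(edges) * R02        # commute time C(u,v) = 2m * R_eff

# Dimensionality reduction: a random Gaussian sketch approximately
# preserves pairwise distances, hence resistances and commute times.
rng = np.random.default_rng(0)
k = 2
Y = X @ (rng.standard_normal((X.shape[1], k)) / np.sqrt(k))
```

For real graphs the exact pseudoinverse is too expensive; the paper's Section 4 instead approximates the embedding with sketching techniques, of which the Gaussian projection above is the simplest caricature.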

2. RELATED WORK

Our work builds upon a wealth of graph-theoretic and graph representation learning works, while we focus on a supervised, inductive setting. Even though GNN architectures were originally classified as spectral or spatial, we abstain from this division, as recent research has demonstrated some equivalence of the graph convolution process regardless of the choice of convolution kernels (Balcilar et al., 2021; Bronstein et al., 2021). Spectrally-motivated methods are often theoretically founded on the eigendecomposition of the graph Laplacian matrix (or an approximation thereof); hence, the corresponding convolutions capture different frequencies of the graph signal. Early works in this space include ChebNet Defferrard et al. (2016) and its more efficient 1-hop version by Kipf & Welling (2017), which applies a linear function of the graph Laplacian spectrum. Levie et al. (2018) proposed CayleyNets, an alternative rational filter. Message passing neural networks (MPNNs) Gilmer et al. (2017) perform a transformation of node and edge representations before and after an arbitrary aggregator (e.g. sum). Graph attention networks (GATs) Veličković et al. (2018) aim to augment the computations of GNNs by allowing graph nodes to "attend" differently to different edges, inspired by the success of transformers in NLP tasks. One of the most relevant works is the directional graph networks (DGN) of Beaini et al. (2021). DGN uses the gradients of the low-frequency eigenvectors of the graph Laplacian, which are known to capture key information about the global structure of the graph, and proves that the aggregators constructed from these gradients lead to more discriminative models than standard GNNs according to the 1-WL test. Morris et al. (2019) used higher-order (k-dimensional) GNNs, based on k-WL, together with a hierarchical variant, and proved theoretically and experimentally their improved expressivity in comparison to other models. Other notable works include Graph Isomorphism Networks (GINs) Xu et al. (2018), which represent a simple, maximally-powerful GNN over discrete-featured inputs; this work also brought to light the expressivity limitations of GNNs. Hamilton et al. (2017) proposed a method to construct node representations by sampling a fixed-size neighborhood of each node and then applying a specific aggregator over it, which led to impressive performance on large-scale inductive benchmarks. Bouritsas et al. (2020) use topologically-aware message passing to detect and count graph substructures, while Bodnar et al. (2021) propose a message-passing procedure on cell complexes, motivated by a novel colour refinement algorithm for testing their isomorphism, which proves powerful on molecular benchmarks.



Graph Neural Networks (GNNs) constitute a powerful tool for learning meaningful representations in non-Euclidean domains. GNN models have achieved significant successes in a wide variety of node prediction Hamilton et al. (2017); Luan et al. (2019), link prediction Zhang & Chen (2018); You et al. (2019), and graph prediction Duvenaud et al. (2015); Ying et al. (2019) tasks. These tasks naturally emerge in a wide range of applications, including autonomous driving Chen et al. (2019), neuroimaging Parisot et al. (2018), combinatorial optimization Gasse et al. (2019); Nair et al. (2020), and recommender systems Ying et al. (2018), while they have enabled significant scientific advances in the fields of biomedicine Wang et al. (2021a), structural biology Jumper et al. (2021), molecular chemistry Stokes et al. (2020), and physics Bapst et al. (2020).

