LATENT GRAPH INFERENCE USING PRODUCT MANIFOLDS

Abstract

Graph Neural Networks usually rely on the assumption that the graph topology is both available to the network and optimal for the downstream task. Latent graph inference allows models to dynamically learn the intrinsic graph structure of problems in which the connectivity patterns of the data may not be directly accessible. In this work, we generalize the discrete Differentiable Graph Module (dDGM) for latent graph learning. The original dDGM architecture used the Euclidean plane to encode the latent features from which the latent graphs were generated. By incorporating Riemannian geometry into the model and generating more complex embedding spaces, we can improve the performance of the latent graph inference system. In particular, we propose a computationally tractable approach to produce product manifolds of constant-curvature model spaces that can encode latent features of varying structure. The latent representations mapped onto the inferred product manifold are used to compute richer similarity measures, which the latent graph learning model leverages to obtain optimized latent graphs. Moreover, the curvature of the product manifold is learned during training, alongside the rest of the network parameters and based on the downstream task, rather than being a static embedding space. Our novel approach is tested on a wide range of datasets and outperforms the original dDGM model.

1. INTRODUCTION

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in a number of applications, from travel-time prediction (Derrow-Pinion et al. (2021)) to antibiotic discovery (Stokes et al. (2020)). They leverage the connectivity structure of graph data, which improves their performance in many applications compared to traditional neural networks (Bronstein et al. (2017)). Most current GNN architectures assume that the topology of the graph is given and fixed during training. Hence, they update the input node features, and sometimes edge features, but preserve the input graph topology. A substantial amount of research has focused on improving diffusion using different types of GNN layers; however, discovering an optimal graph topology that aids diffusion has only recently gained attention (Topping et al. (2021); Cosmo et al. (2020); Kazi et al. (2022)). In many real-world applications, data can have some underlying but unknown graph structure, which we call a latent graph. That is, we may only be able to access a pointcloud of data. Nevertheless, this does not necessarily mean the data is not intrinsically related, or that its connectivity cannot be leveraged to make more accurate predictions. The vast majority of Geometric Deep Learning research so far has relied on human annotators or simplistic pre-processing algorithms to generate the graph structure passed to GNNs. Furthermore, in practice, even in settings where the correct graph is provided, it may be suboptimal for the task at hand, and the GNN may benefit from rewiring (Topping et al. (2021)). In this work, we drop the assumption that the graph adjacency matrix is given and study how to learn the latent graph in a fully-differentiable manner, using product manifolds, alongside the GNN diffusion layers. More specifically, we incorporate Riemannian geometry into the discrete Differentiable Graph Module (dDGM) proposed by Kazi et al. (2022).
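To make the discrete latent graph learning concrete, the following is a minimal NumPy sketch (not the authors' implementation; the function name, temperature parameter, and defaults are illustrative assumptions): negative pairwise squared distances between latent node embeddings serve as edge logits, and perturbing them with Gumbel noise before taking each node's top-k neighbours yields a stochastic sparse latent graph, in the spirit of the Gumbel top-k sampling used by the dDGM.

```python
import numpy as np

def gumbel_top_k_edges(z, k, temperature=1.0, rng=None):
    """Sample a sparse latent graph from node embeddings z of shape (n, d).

    Illustrative sketch of dDGM-style discrete edge sampling; names and
    defaults are assumptions for exposition, not the authors' code.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = z.shape[0]
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)  # (n, n) squared distances
    logits = -d2 / temperature                           # closer nodes -> higher logit
    np.fill_diagonal(logits, -np.inf)                    # forbid self-loops
    perturbed = logits + rng.gumbel(size=(n, n))         # Gumbel noise -> stochastic choice
    nbrs = np.argsort(-perturbed, axis=1)[:, :k]         # top-k neighbours per node
    rows = np.repeat(np.arange(n), k)
    return np.stack([rows, nbrs.ravel()])                # edge_index of shape (2, n*k)

edges = gumbel_top_k_edges(np.random.default_rng(0).standard_normal((10, 4)), k=3)
print(edges.shape)  # (2, 30)
```

The sampled edge_index can then be handed to any downstream GNN diffusion layer, which is what decouples graph inference from message passing.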
We show that it is possible and beneficial to encode latent features into more complex embedding spaces beyond the Euclidean plane used in the original work. In particular, we leverage the convenient mathematical properties of product manifolds to learn the curvature of the embedding space in a fully-differentiable manner.

Contributions:

1) We explain how to use model spaces of constant curvature for the embedding space. To do so, we outline a principled procedure to map Euclidean GNN output features to constant-curvature model space manifolds with non-zero curvature: we use the hypersphere for spherical space and the hyperboloid for hyperbolic space. We also outline how to calculate distances between points in these spaces, which are then used by the dDGM sparse graph generation procedure to infer the edges of the latent graph. Unlike the original dDGM model, which explored using the Poincaré ball with fixed curvature to model hyperbolic space, in this work we use hyperboloids of arbitrary negative curvature.

2) We show how to construct more complex embedding spaces that can encode latent data of varying structure using product manifolds of model spaces. The curvature of each model space composing the product manifold is learned in a fully-differentiable manner alongside the rest of the model parameters, based on the downstream task performance.

3) We test our approach on 15 datasets, which include standard homophilic graph datasets, heterophilic graphs, large-scale graphs, molecular datasets, and datasets for other real-world applications such as brain imaging and aerospace engineering.

4) It has been shown that traditional GNN models, such as Graph Convolutional Networks (GCNs) (Kipf & Welling (2017)) and Graph Attention Networks (GATs) (Veličković et al. (2018)), struggle to achieve good performance on heterophilic datasets (Zhu et al. (2020)), since homophily is in fact used as an inductive bias by these models. Amongst other models, Sheaf Neural Networks (SNNs) (Hansen & Gebhart (2020); Bodnar et al. (2022); Barbero et al. (2022b;a)) have been proposed to tackle this issue. We show that latent graph inference enables traditional GNN models to perform well on heterophilic datasets without having to resort to sophisticated diffusion layers or model architectures such as SNNs.

5) To make this work accessible to the wider machine learning community, we have created a new PyTorch Geometric layer.
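The constructions in contribution 1) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: a Euclidean feature vector is lifted to the tangent space at a base point and pushed onto the hypersphere (curvature K > 0) or hyperboloid (K < 0) via the exponential map, geodesic distances are computed in closed form, and component distances are combined into a product-manifold distance (here the standard l2 combination; the exact aggregation used by the model is an assumption of this sketch).

```python
import numpy as np

def expmap_sphere(v, K):
    """Map Euclidean feature v onto the sphere of curvature K > 0 (radius
    R = 1/sqrt(K)) via the exponential map at the base point mu0 = (R, 0, ..., 0)."""
    R = 1.0 / np.sqrt(K)
    mu0 = np.zeros(v.size + 1); mu0[0] = R
    vt = np.concatenate([[0.0], v])          # v lifted to the tangent space at mu0
    n = np.linalg.norm(vt)
    if n == 0.0: return mu0
    return np.cos(n / R) * mu0 + np.sin(n / R) * R * vt / n

def dist_sphere(x, y, K):
    """Geodesic (great-circle) distance on the sphere of curvature K > 0."""
    R = 1.0 / np.sqrt(K)
    return R * np.arccos(np.clip(np.dot(x, y) / R**2, -1.0, 1.0))

def expmap_hyperboloid(v, K):
    """Same construction on the hyperboloid of curvature K < 0."""
    R = 1.0 / np.sqrt(-K)
    mu0 = np.zeros(v.size + 1); mu0[0] = R
    vt = np.concatenate([[0.0], v])
    n = np.linalg.norm(vt)
    if n == 0.0: return mu0
    return np.cosh(n / R) * mu0 + np.sinh(n / R) * R * vt / n

def dist_hyperboloid(x, y, K):
    """Geodesic distance on the hyperboloid of curvature K < 0."""
    R = 1.0 / np.sqrt(-K)
    inner = -x[0] * y[0] + np.dot(x[1:], y[1:])           # Lorentzian inner product
    return R * np.arccosh(np.clip(-inner / R**2, 1.0, None))

def product_dist(component_dists):
    """Combine per-manifold distances into a product-manifold distance (l2)."""
    return np.sqrt(np.sum(np.square(component_dists)))
```

In the full model, each model space's curvature K would be a trainable parameter updated by backpropagation along with the GNN weights, and the product distance would feed the edge logits of the graph-generation step.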

2. BACKGROUND

In this section we discuss relevant background for this work. We first provide a literature review of recent advances in latent graph inference using GNNs, as well as related work on manifold learning and graph embedding. Next, we give an overview of the original Differentiable Graph Module (DGM) formulation; we recommend referring to Kazi et al. (2022) for further details.

2.1 RELATED WORK

Latent graph and topology inference is a standing problem in Geometric Deep Learning. In contrast to algorithms that operate on sets and apply a shared pointwise function, such as PointNet (Qi et al. (2017)), in latent graph inference we want to learn how to optimally share information between nodes in the pointcloud. Some contributions in the literature have focused on applying pre-processing steps to enhance diffusion based on an initial input graph (Topping et al. (2021); Gasteiger et al. (2019); Alon & Yahav (2021); Wu et al. (2019)). Note, however, that this area of research focuses on improving an already existing graph, which may be suboptimal for the downstream task. This paper is more directly related to work that addresses how to learn the graph topology dynamically, instead of assuming a fixed graph at the start of training. When the underlying connectivity structure is unknown, architectures such as Transformers (Vaswani et al. (2017)) and attentional multi-agent predictive models (Hoshen (2017)) simply assume the graph to be fully connected, but this can become hard to scale to large graphs. Generating sparse graphs can result in more computationally tractable solutions (Fetaya et al. (2018)) and avoid over-smoothing (Chen et al. (2020a)). To this end, a series of models have been proposed, from Dynamic Graph Convolutional Neural Networks (DGCNNs) (Wang et al. (2019)) to solutions that decouple graph inference and information diffusion, such as the Differentiable Graph Modules (DGMs) of Cosmo et al. (2020) and Kazi et al. (2022). Note that latent graph inference may also be referred to as graph structure learning in the literature. A survey of similar methods can be found in Zhu et al. (2021), and some additional

