

Abstract

Graph convolutional networks (GCNs) enable end-to-end learning on graph-structured data. However, many works begin by assuming a given graph structure. As the ideal graph structure is often unknown, this limits their applicability. To address this, we present a novel end-to-end differentiable graph generator which builds the graph topology on the fly. Our module can be readily integrated into existing pipelines involving graph convolution operations, replacing the predetermined or existing adjacency matrix with one that is learned and optimised as part of the general objective. As such, it is applicable to any GCN. We show that integrating our module into both node classification and trajectory prediction pipelines improves accuracy across a range of datasets and backbones.

1. Introduction

The success of Graph Neural Networks (GNNs) (Duvenaud et al., 2015; Bronstein et al., 2017; Monti et al., 2017) has led to a surge in the use of graph-based representation learning. GNNs provide an efficient framework for learning from graph-structured data, making them widely applicable in any domain where data can be represented as a relational or interaction system. They have been successfully applied in a wide range of tasks including particle physics (Choma et al., 2018), protein science (Gainza et al., 2020) and many others (Monti et al., 2019; Stokes et al., 2020). In a GNN, each node iteratively updates its state by interacting with its neighbors, typically through message passing.

However, a fundamental limitation of such architectures is the assumption that the underlying graph is provided. While node or edge features may be updated during message passing, the graph topology remains fixed, and its choice may be suboptimal for various reasons. For instance, when classifying nodes in a citation network, an edge connecting nodes of different classes can diminish classification accuracy: such edges degrade performance by propagating irrelevant information across the graph. When no graph is explicitly provided, one common practice is to generate a k-nearest neighbor (k-NN) graph, where k is a hyperparameter tuned for the best-performing model. For many applications, fixing k is overly restrictive, as the optimal choice of k may vary for each node in the graph. While approaches that learn the graph structure for use in downstream GNNs have emerged (Zheng et al., 2020; Kazi et al., 2020; Kipf et al., 2018), all of them treat the node degree k as a fixed hyperparameter. We propose a general differentiable graph-generator (DGG) module for learning graph topology with or without an initial edge structure.
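The fixed-k baseline described above can be sketched in a few lines: build a k-NN adjacency matrix from node features, then run a single message-passing step over it. This is a minimal numpy illustration; the function names and the mean-aggregation layer are our simplifications, not the paper's method.

```python
import numpy as np

def knn_adjacency(x, k):
    """Build a k-NN adjacency matrix from node features x (n, d).
    Every node receives exactly k neighbours -- the fixed-degree
    assumption the text argues is overly restrictive."""
    n = x.shape[0]
    # pairwise squared Euclidean distances
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-distances
    nbrs = np.argsort(d2, axis=1)[:, :k]      # k closest nodes per row
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return adj

def gcn_layer(adj, x, w):
    """One mean-aggregation message-passing step: each node averages
    its neighbours' (and its own) features, then applies a linear map
    followed by a ReLU."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    a_hat = a_hat / a_hat.sum(1, keepdims=True)   # row-normalise
    return np.maximum(a_hat @ x @ w, 0.0)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))        # 6 nodes, 4 features each
adj = knn_adjacency(x, k=2)        # fixed degree k for every node
h = gcn_layer(adj, x, rng.normal(size=(4, 3)))
```

Note that `k` is applied uniformly: every row of `adj` has exactly k ones, regardless of how many neighbours would actually benefit each node.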
This module can be placed within any graph convolutional network and jointly optimized with the rest of the network's parameters, learning topologies that favor the downstream task without hyperparameter selection or, indeed, any additional training signal. The primary contributions of this paper are as follows:

1. We propose a novel, differentiable graph-generator (DGG) module which jointly optimizes both the neighbourhood size and the edges that belong to each neighbourhood. Note that existing approaches (Zheng et al., 2020; Kipf et al., 2018; Kazi et al., 2020) do not allow for learnable neighbourhood sizes.

2. Our DGG module is directly integrable into any pipeline involving graph convolutions, where the given adjacency matrix is either noisy or not explicitly provided and must be determined heuristically. In both cases, our DGG generates the adjacency matrix as part of GNN training and can be trained end-to-end to optimize performance on the downstream task. Should a good graph structure be known, the generated adjacency matrix can be encouraged to remain close to it while optimizing performance.
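As a rough illustration of what learning the adjacency matrix as part of training can look like, here is a toy in which edge probabilities are a differentiable function of node features, so gradients from the downstream loss can flow back into the graph structure. The bilinear scoring, the sigmoid relaxation, and the `soft_adjacency` name are all our assumptions for illustration, not the paper's actual DGG, and this sketch omits the DGG's key feature of learnable per-node neighbourhood sizes.

```python
import numpy as np

def soft_adjacency(x, theta, tau=1.0):
    """Illustrative differentiable graph generation: edge scores are a
    learned bilinear function of node features, squashed through a
    sigmoid so every entry of the adjacency matrix is differentiable
    with respect to the parameters theta."""
    s = x @ theta @ x.T                    # pairwise edge scores (n, n)
    adj = 1.0 / (1.0 + np.exp(-s / tau))   # soft edges in (0, 1)
    np.fill_diagonal(adj, 0.0)             # no self-loops
    return adj

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))        # 5 nodes, 3 features each
theta = rng.normal(size=(3, 3))    # learnable edge-scoring parameters
adj = soft_adjacency(x, theta)     # would feed a downstream GCN layer
```

Because `adj` is a smooth function of `theta`, a downstream task loss computed through a GCN layer can update `theta` by ordinary backpropagation, which is the sense in which the graph topology becomes a learned quantity rather than a fixed input.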

