STRUCTURAL LANDMARKING AND INTERACTION MODELLING: ON RESOLUTION DILEMMAS IN GRAPH CLASSIFICATION

Abstract

Graph neural networks are a promising architecture for learning and inference with graph-structured data. However, generating informative graph-level features has long been a challenge. The current practice of graph pooling typically summarizes a graph by squeezing it into a single vector. Yet, from a complex-systems point of view, the properties of a complex system are believed to arise largely from the interactions among its components. In this paper, we analyze the intrinsic difficulty of graph classification under the unified concept of "resolution dilemmas" and propose "SLIM", an inductive neural network model for Structural Landmarking and Interaction Modelling, to remedy the information loss in graph pooling. We show that, by projecting graphs onto end-to-end optimizable, well-aligned substructure landmarks (representatives), the resolution dilemmas can be resolved effectively, so that the explicit interaction relations between the component parts of a graph can be leveraged directly in explaining its complexity and predicting its properties. Empirical evaluations, in comparison with the state of the art, demonstrate promising results of our approach on a number of benchmark datasets for graph classification.

1. INTRODUCTION

Complex systems are ubiquitous in natural and scientific disciplines, and how the relations between component parts give rise to the global behaviour of a system is a central research topic in many areas such as systems biology (Camacho et al., 2018), neuroscience (Kriegeskorte, 2015), and drug and material discovery (Stokes et al., 2020; Schmidt et al., 2019). Recently, graph neural networks have provided a promising architecture for representation learning on graphs, the structural abstraction of a complex system. State-of-the-art performance has been observed in various graph mining tasks (Bronstein et al., 2017; Defferrard et al., 2016; Hamilton et al., 2017; Xu et al., 2019; Velickovic et al., 2017; Morris et al., 2019; Wu et al., 2020; Zhou et al., 2018; Zhang et al., 2020). However, due to their non-Euclidean nature, important challenges remain in graph classification. For example, in order to generate a fixed-dimensional representation for a graph of arbitrary size, graph pooling is typically adopted to summarize the information from each node. In the pooled form, the whole graph is squeezed into a "super-node", in which the identities of the constituent sub-graphs and their interconnections are mixed together. Is this the best way to generate graph-level features? From a complex system's view, mixing all parts together might make it difficult to interpret the prediction results, because the properties of a complex system arise largely from the interactions among its components (Hartwell et al., 1999; Debarsy et al., 2017; Cilliers, 1998). The choice of this "collapsing"-style graph pooling is rooted deeply in the lack of natural alignment among graphs that are not isomorphic; pooling therefore sacrifices structural details for feature (dimension) compatibility.

In recent years, substructure patterns¹ have drawn considerable attention in graph mining, such as motifs (Milo et al., 2002; Alon, 2007; Wernicke, 2006; Austin R. Benson, 2016) and graphlets (Shervashidze et al., 2009). They provide an intermediate scale for structure comparison or counting, and have been applied to node embedding (Lee et al., 2019; Ahmed et al., 2018), deep graph kernels (Yanardag & Vishwanathan, 2015) and graph convolution (Yang et al., 2018). However, due to their combinatorial nature, only substructures of very small size (4 or 5 nodes) can be considered (Yanardag & Vishwanathan, 2015; Wernicke, 2006), greatly limiting the coverage of structural variations; moreover, handling substructures as discrete objects makes it difficult to account for their similarities, so the risk of overfitting may rise in supervised learning scenarios (Yanardag & Vishwanathan, 2015).

We view these intrinsic difficulties as related to resolution dilemmas in graph-structured data processing. Resolution is the scale at which measurements are made and/or information processing algorithms are conducted, and here we discuss two types of resolution and their related dilemmas: the spatial resolution (dilemma) and the structural resolution (dilemma). Spatial resolution relates to the geometric scale of the "component" that can be identified from the final representation of a graph (based on which the prediction is performed). In GNNs, since graph pooling compresses the whole graph into a single vector, node and edge identities are mixed together and the spatial resolution drops to its lowest; we call this the vanishing spatial resolution (dilemma). Structural resolution is the level of fineness in differentiating between two substructures. The current practice of exact matching makes it computationally intractable to handle the exponentially many sub-graph instances, and the risk of overfitting may also rise, as observed in deep graph kernels (Yanardag & Vishwanathan, 2015) and dictionary learning (Marsousi et al., 2014). We call this over-delicate substructure profiling the exploding structural resolution (dilemma).
In fact, these two resolution dilemmas are not isolated: they have a causal relation, and their common origin is the way we perform identification and comparison of discrete substructures (more in Section 2.3).

Our contribution. Inspired by the well-studied science of complex systems, and in particular the importance of the interaction relations between the component parts of a system, we propose a simple neural architecture called "Structural Landmarking and Interaction Modelling", or SLIM. It allows graphs to be projected onto a set of end-to-end optimizable, well-aligned structural landmarks, so that the identities of graph substructures and their interactions can be captured explicitly to explain the complexity and improve graph classification. We show that, by resolving the two resolution dilemmas, and subsequently respecting the structural organization of complex systems, SLIM is empirically very promising and offers new possibilities in graph representation learning. In the rest of the paper, we first define the resolution dilemmas of graph classification in Section 2, together with a discussion of related work. We then cover in Sections 3, 4 and 5 the design, analysis, and performance of SLIM, respectively. Finally, the last section concludes the paper.

2. RESOLUTION DILEMMAS IN GRAPH CLASSIFICATION

A complex system is often composed of many parts interacting with each other in non-trivial ways. Since graphs are the structural abstraction of complex systems, accurate graph classification depends on how the global properties of a system relate to its structure. It is believed that the properties of a complex system arise from the interactions among its components (Debarsy et al., 2017; Cilliers, 1998). Consequently, accurate interaction modelling should benefit prediction. However, this is non-trivial due to the resolution dilemmas described in the following subsections.

2.1. SPATIAL RESOLUTION DIMINISHES IN GRAPH POOLING

Graph neural networks (GNNs) for graph classification typically involve two key blocks, graph convolution and graph pooling (Kipf & Welling, 2017; Hamilton et al., 2017; Xu et al., 2019), which operate at significantly different spatial resolutions. The goal of convolution is to pass information among neighboring nodes in the general form of

h_v = AGGREGATE({h_u, u ∈ N_v}),

where N_v is the set of neighbors of node v (Hamilton et al., 2017; Xu et al., 2019). Here, the spatial resolution is controlled by the number of convolution layers: more layers capture larger substructures/sub-trees and can lead to improved discriminative power (Xu et al., 2019). In other words, the spatial resolution in the convolution stage can be controlled easily, and multiple resolutions may even be combined via a CONCATENATE function (Hamilton et al., 2017; Xu et al., 2019) for improved modelling.

The goal of graph pooling is to generate compact, graph- or subgraph-level representations that are compatible across graphs. Due to the lack of natural alignment between non-isomorphic graphs, graph pooling typically "squeezes" a graph G into a single vector (or "super-node") in the form of

h_G = READOUT({f(h_v), ∀v ∈ V}).
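The AGGREGATE/READOUT pipeline above can be sketched in a few lines. This is a minimal NumPy illustration under our own simplifying assumptions (mean aggregation, sum readout, no learned weights), not the paper's SLIM model: it shows how one convolution step keeps per-node resolution while the readout collapses all node identities into a single graph vector.

```python
# Minimal sketch of one AGGREGATE step plus a sum-pooling READOUT.
# Assumptions (ours): mean aggregation including the node itself,
# sum readout, no trainable parameters.
import numpy as np

def aggregate(h, neighbors):
    """One AGGREGATE step: h_v <- mean over {v} ∪ N_v of node features."""
    return np.stack([
        np.mean(h[[v] + neighbors[v]], axis=0)
        for v in range(len(h))
    ])

def readout(h):
    """Sum-pooling READOUT: squeeze all node vectors into one graph vector."""
    return h.sum(axis=0)

# Toy graph: a 3-node path 0-1-2 with 2-d node features.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

h1 = aggregate(h, neighbors)   # still one vector per node (1-hop subtrees)
h_G = readout(h1)              # spatial resolution collapses to a "super-node"
print(h_G.shape)               # one fixed-size vector; node identities mixed
```

Note that stacking more `aggregate` calls enlarges the captured neighborhood (the controllable spatial resolution of the convolution stage), whereas `readout` is the single collapsing step that produces the vanishing spatial resolution discussed above.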



¹ Informally, a substructure in this paper means a connected subgraph, and the two terms will be used interchangeably.

