TOPOTER: UNSUPERVISED LEARNING OF TOPOLOGY TRANSFORMATION EQUIVARIANT REPRESENTATIONS

Abstract

We present Topology Transformation Equivariant Representation (TopoTER) learning, a general paradigm for the unsupervised learning of node representations of graph data that is widely applicable to Graph Convolutional Neural Networks (GCNNs). We formalize the TopoTER from an information-theoretic perspective, maximizing the mutual information between topology transformations and node representations before and after the transformations. We derive that maximizing such mutual information can be relaxed to minimizing the cross entropy between the applied topology transformation and its estimation from node representations. In particular, we sample a subset of node pairs from the original graph and flip the edge connectivity between each pair to transform the graph topology. Then, we self-train a representation encoder to learn node representations by reconstructing the topology transformations from the feature representations of the original and transformed graphs. In experiments, we apply the TopoTER to downstream node and graph classification tasks, and results show that the TopoTER outperforms state-of-the-art unsupervised approaches.

1. INTRODUCTION

Graphs provide a natural and efficient representation for non-Euclidean data, such as brain networks, social networks, citation networks, and 3D point clouds. Graph Convolutional Neural Networks (GCNNs) (Bronstein et al., 2017) have been proposed to generalize CNNs to learn representations from non-Euclidean data, and have made significant advances in various applications such as node classification (Kipf & Welling, 2017; Veličković et al., 2018; Xu et al., 2019a) and graph classification (Xu et al., 2019b). However, most existing GCNNs are trained in a supervised fashion, requiring a large amount of labeled data for network training. This limits the applications of GCNNs, since it is often costly to collect adequately labeled data, especially on large-scale graphs. This motivates the proposed research to learn graph feature representations in an unsupervised fashion, which enables the discovery of intrinsic graph structures and thus adapts to various downstream tasks.

Auto-Encoders (AEs) and Generative Adversarial Networks (GANs) are the two most representative unsupervised learning methods. Building on AEs and GANs, many approaches have sought to learn transformation equivariant representations (TERs) to further improve the quality of unsupervised representation learning. The underlying assumption is that representations equivarying to transformations are able to encode the intrinsic structures of data, such that the transformations can be reconstructed from the representations before and after transformations (Qi et al., 2019b). Learning TERs traces back to Hinton's seminal work on learning transformation capsules (Hinton et al., 2011), and has since been embodied in a variety of methods developed for Euclidean data (Kivinen & Williams, 2011; Sohn & Lee, 2012; Schmidt & Roth, 2012; Skibbe, 2013; Lenc & Vedaldi, 2015; Gens & Domingos, 2014; Dieleman et al., 2015; 2016; Zhang et al., 2019; Qi et al., 2019a). Further, Gao et al.
(2020) extend transformation equivariant representation learning to the non-Euclidean domain, formalizing Graph Transformation Equivariant Representation (GraphTER) learning by auto-encoding node-wise transformations in an unsupervised fashion. Nevertheless, only transformations on node features are explored, while the underlying graph may vary implicitly. The graph topology has not been fully exploited, although it is crucial in unsupervised graph representation learning.

To this end, we propose Topology Transformation Equivariant Representation (TopoTER) learning to infer unsupervised graph feature representations by estimating topology transformations. Instead of transforming node features as in the GraphTER, the proposed TopoTER studies transformation equivariant representation learning by transforming the graph topology, i.e., adding or removing edges to perturb the graph structure. The same input signals are then attached to the resultant graph topologies, resulting in different graph representations. This provides an insight into how the same input signals associated with different graph topologies lead to equivariant representations, enabling the fusion of node features and graph topology in GCNNs. Formally, we propose the TopoTER from an information-theoretic perspective, aiming to maximize the mutual information between topology transformations and feature representations with respect to the original and transformed graphs. We derive that maximizing such mutual information can be relaxed to minimizing the cross entropy between the applied topology transformations and their estimation from the learned representations of the original and transformed graphs. Specifically, given an input graph and its associated node features, we first sample a subset of node pairs from the graph and flip the edge connectivity between each pair at a given perturbation rate, leading to a transformed graph with attached node features.
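The sampling-and-flipping step described above can be sketched as follows. This is a minimal illustration, assuming a dense symmetric 0/1 adjacency matrix without self-loops; the function name `flip_edges` and the uniform sampling over all node pairs are illustrative choices, not necessarily the paper's exact implementation.

```python
import numpy as np

def flip_edges(adj, rate, rng=None):
    """Sample a fraction `rate` of node pairs and flip their connectivity.

    `adj` is a symmetric 0/1 adjacency matrix with no self-loops; the
    returned matrix is the transformed topology, with each sampled pair's
    edge added (0 -> 1) or removed (1 -> 0) symmetrically.
    """
    rng = np.random.default_rng(rng)
    n = adj.shape[0]
    # Enumerate upper-triangular pairs (i < j): each undirected slot once.
    iu, ju = np.triu_indices(n, k=1)
    num_pairs = len(iu)
    k = int(rate * num_pairs)
    picked = rng.choice(num_pairs, size=k, replace=False)
    transformed = adj.copy()
    i, j = iu[picked], ju[picked]
    transformed[i, j] = 1 - transformed[i, j]  # flip: add or remove edge
    transformed[j, i] = transformed[i, j]      # keep the matrix symmetric
    return transformed
```

With `rate = 0.5` on a 4-node graph (6 candidate pairs), exactly 3 pairs are flipped while symmetry and the empty diagonal are preserved.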
Then, we design a graph-convolutional auto-encoder architecture, where the encoder learns node-wise representations over the original and transformed graphs respectively, and the decoder predicts the topology transformations of edge connectivity from both representations by minimizing the cross entropy between the applied and estimated transformations. Experimental results demonstrate that the proposed TopoTER model outperforms the state-of-the-art unsupervised models, and at times even achieves comparable results to (semi-)supervised approaches in node classification and graph classification tasks. Our main contributions are summarized as follows.

• We propose the Topology Transformation Equivariant Representation (TopoTER) learning to infer expressive node feature representations in an unsupervised fashion, which characterizes the intrinsic structures of graphs and the associated features by exploring graph transformations of the connectivity topology.

• We formulate the TopoTER from an information-theoretic perspective, maximizing the mutual information between feature representations and topology transformations, which can be relaxed to minimizing the cross entropy between the applied transformations and their prediction in an end-to-end graph-convolutional auto-encoder architecture.

• Experiments demonstrate that the proposed TopoTER model outperforms state-of-the-art unsupervised methods in both node classification and graph classification.
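To make the decoder objective concrete, the cross entropy between applied and estimated transformations can be sketched as a per-pair classification loss over the sampled node pairs. The pair-feature construction in `edge_pair_features` below (combining each node's representations before and after the transformation) is a hypothetical choice for illustration, not the paper's exact decoder architecture; in the full model these features would feed a learned classifier rather than raw logits.

```python
import numpy as np

def edge_pair_features(z_orig, z_trans, pairs):
    """Per-pair features from node representations before/after transformation.

    `z_orig` and `z_trans` are (num_nodes, dim) representation matrices from
    the shared encoder; `pairs` is a (num_pairs, 2) array of sampled node
    pairs. A simple symmetric combination is used here for illustration.
    """
    i, j = pairs[:, 0], pairs[:, 1]
    return np.abs(z_orig[i] - z_trans[i]) + np.abs(z_orig[j] - z_trans[j])

def transformation_cross_entropy(logits, flipped):
    """Binary cross entropy between predicted flip logits and applied flips
    (flipped[k] = 1 if the k-th sampled pair's edge was flipped, else 0)."""
    p = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-12  # guard against log(0)
    return -np.mean(flipped * np.log(p + eps)
                    + (1 - flipped) * np.log(1 - p + eps))
```

Confident, correct logits drive the loss toward zero, while confidently wrong logits incur a large penalty, which is what pushes the encoder to make the transformation recoverable from its representations.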

2. RELATED WORK

Graph Auto-Encoders. Graph Auto-Encoders (GAEs) are among the most representative unsupervised methods for graphs. GAEs encode graph data into a feature space via an encoder and reconstruct the input graph data from the encoded feature representations via a decoder. GAEs are often used to learn network embeddings and graph generative distributions (Wu et al., 2020). For network embedding learning, GAEs learn the feature representation of each node by reconstructing graph structural information, such as the graph adjacency matrix (Kipf & Welling, 2016) and the positive pointwise mutual information (PPMI) matrix (Cao et al., 2016; Wang et al., 2016). For graph generation, some methods generate the nodes and edges of a graph alternately (You et al., 2018), while others output an entire graph at once (Simonovsky & Komodakis, 2018; Ma et al., 2018; De Cao & Kipf, 2018).
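The adjacency-reconstruction decoder of Kipf & Welling (2016) is simple enough to state directly: edge probabilities are recovered from node embeddings via an inner product. This sketch omits the GCN encoder that produces the embeddings `z` and shows only the decoder side.

```python
import numpy as np

def gae_reconstruct(z):
    """Inner-product decoder of a Graph Auto-Encoder (Kipf & Welling, 2016).

    Given node embeddings Z of shape (num_nodes, dim) from a GCN encoder
    (not shown), the reconstructed edge probabilities are
    A_hat = sigmoid(Z @ Z.T): similar embeddings yield high edge probability.
    """
    return 1.0 / (1.0 + np.exp(-(z @ z.T)))
```

Training then minimizes a (typically class-weighted) binary cross entropy between `A_hat` and the observed adjacency matrix, so that embedding geometry encodes graph structure.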

Graph Contrastive Learning. An important paradigm called contrastive learning aims to train an encoder to be contrastive between the representations of positive samples and negative samples. Recent contrastive learning frameworks can be divided into two categories (Liu et al., 2020): context-instance contrast and context-context contrast. Context-instance contrast focuses on modeling the relationships between the local features of a sample and its global context representation. Deep InfoMax (DIM) (Hjelm et al., 2018) first proposes to maximize the mutual information between a local patch and its global context through a contrastive learning task. Deep Graph InfoMax (DGI) (Velickovic et al., 2019) learns node-level feature representations by extending DIM to graph-structured data, while InfoGraph (Sun et al., 2020a) uses mutual information maximization for unsupervised representation learning on entire graphs. Peng et al. (2020) propose a Graphical Mutual Information (GMI) approach to maximize the mutual information of both features and edges between inputs and outputs. In contrast to context-instance methods, context-context contrast studies the relationships between the global representations of different samples. M3S (Sun et al., 2020b) adopts a self-supervised pre-training paradigm as in DeepCluster (Caron et al., 2018) for better semi-supervised prediction in GCNNs. Graph Contrastive Coding (GCC)

