DEEP GRAPH-LEVEL CLUSTERING USING PSEUDO-LABEL-GUIDED MUTUAL INFORMATION MAXIMIZATION NETWORK

Abstract

In this work, we study the problem of partitioning a set of graphs into different groups such that graphs in the same group are similar while graphs in different groups are dissimilar. This problem has rarely been studied, although there has been much work on node clustering and graph classification. The problem is challenging because it is difficult to measure the similarity or distance between graphs. One feasible approach is to use graph kernels to compute a similarity matrix for the graphs and then perform spectral clustering, but the effectiveness of existing graph kernels in measuring graph similarity is very limited. To solve the problem, we propose a novel method called Deep Graph-Level Clustering (DGLC). DGLC utilizes a graph isomorphism network to learn graph-level representations by maximizing the mutual information between the representations of entire graphs and of their substructures, under the regularization of a clustering module that ensures discriminative representations via pseudo labels. DGLC performs graph-level representation learning and graph-level clustering in an end-to-end manner. Experimental results on six benchmark graph datasets show that DGLC achieves state-of-the-art performance compared with many baselines.

1. INTRODUCTION

Graph-structured data widely exist in real-world scenarios such as social networks (Newman, 2006) and molecular analysis (Gilmer et al., 2017). Compared to other data formats, graphs explicitly encode connections between data points through node and edge attributes, which provide rich structural information for many applications. In recent years, machine learning on graph-structured data has gained increasing attention, and many supervised and unsupervised learning methods have been proposed for various applications. Machine learning problems on graph-structured data can be organized into two categories: node-level learning and graph-level learning. In node-level learning, the samples are the nodes of a single graph; it mainly includes node classification (Li et al., 2017; Wu et al., 2021; Xu et al., 2021) and node clustering (Wang et al., 2017; Pan & Kang, 2021; Lin et al., 2021). Classical node classification methods are often based on graph embedding (Yan et al., 2006; Cai et al., 2018) and graph regularization (Subramanya & Bilmes, 2009; Bhagat et al., 2011), while recent advances are based on graph neural networks (GNNs) (Kipf & Welling, 2017; Xu et al., 2019; Wu et al., 2020). Owing to the success of GNNs in node classification, several researchers have proposed GNN-based methods for node clustering (Wang et al., 2019; Bo et al., 2020; Zhu & Koniusz, 2021). Different from node-level learning, in graph-level learning the samples are a set of graphs that can be organized into different groups. Classical methods for graph-level classification are often based on graph kernels (Vishwanathan et al., 2010; Yanardag & Vishwanathan, 2015), while recent advances are based on GNNs (Wu et al., 2020; Rong et al., 2020).
Researchers generally utilize various types of GNNs, e.g., graph convolutional networks (GCNs) (Kipf & Welling, 2017) and the graph isomorphism network (GIN) (Xu et al., 2019), to learn graph-level representations by aggregating inherent node information and structural neighborhood information in graphs, and then train a classifier on the learned graph-level representations (Zhang et al., 2018; Sun et al., 2020; Wang et al., 2021; Doshi & Chepuri, 2022). Nevertheless, collecting large amounts of labels for graph-level classification is costly in real-world applications, and clustering graph-level data is much more difficult than clustering nodes and still remains an open issue. This shows the importance of exploring graph-level clustering, namely partitioning a set of graphs into different groups such that graphs in the same group are similar while graphs in different groups are dissimilar. Previous research on graph-level clustering is very limited, mainly because it is difficult to represent graphs as feature vectors or to quantify the similarity between graphs in an unsupervised manner. An intuitive approach to graph-level clustering is to perform spectral clustering (Ng et al., 2001) over the similarity matrix produced by a graph kernel (Kondor & Pan, 2016; Du et al., 2019; Togninalli et al., 2019). Although a few graph kernels exist, such as the random walk kernel (Gärtner et al., 2003) and the Weisfeiler-Lehman kernel (Shervashidze et al., 2011), most of them rely on manual design, fail to provide desirable generalization capability for various types of graphs, and thus cannot produce satisfactory similarity matrices for spectral clustering, as will be demonstrated in Section 4.3. Another solution comes with the encouraging development of GNNs.
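To make the kernel-plus-spectral-clustering baseline concrete, the following minimal sketch (assuming NumPy is available) runs spectral clustering on a precomputed graph-kernel similarity matrix. The matrix `K` and the six "graphs" it scores are hypothetical toy placeholders, not outputs of any of the kernels cited above; in practice `K` would come from, e.g., a Weisfeiler-Lehman kernel.

```python
import numpy as np

def spectral_clustering(K, n_clusters, n_iters=100):
    """Spectral clustering on a precomputed graph-kernel similarity matrix K."""
    # Symmetrically normalized Laplacian: L = I - D^{-1/2} K D^{-1/2}.
    d = K.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(K)) - d_inv_sqrt[:, None] * K * d_inv_sqrt[None, :]
    # Embed each graph using eigenvectors of the n_clusters smallest eigenvalues.
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :n_clusters]
    # Row-normalize the embedding (Ng et al., 2001).
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    # Plain k-means with deterministic farthest-point initialization.
    centers = [U[0]]
    for _ in range(1, n_clusters):
        dists = np.min(((U[:, None] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(U[np.argmax(dists)])
    centers = np.array(centers)
    labels = np.zeros(len(U), dtype=int)
    for _ in range(n_iters):
        labels = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels

# Toy block-structured similarity matrix for six "graphs" in two groups.
K = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.2, 0.1],
    [0.9, 1.0, 0.9, 0.2, 0.1, 0.1],
    [0.8, 0.9, 1.0, 0.1, 0.1, 0.2],
    [0.1, 0.2, 0.1, 1.0, 0.9, 0.8],
    [0.2, 0.1, 0.1, 0.9, 1.0, 0.9],
    [0.1, 0.1, 0.2, 0.8, 0.9, 1.0],
])
labels = spectral_clustering(K, n_clusters=2)
```

On such a clearly block-structured matrix the two groups are recovered; with real graph kernels, as argued above, the similarity matrix is often far less clean, which is exactly where this pipeline breaks down.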
Some recent works have shown GCNs (Kipf & Welling, 2017) and GIN (Xu et al., 2019) to be effective in learning node-level and graph-level representations for various downstream tasks, e.g., node clustering (Wang et al., 2017; Bo et al., 2020; Liu et al., 2022) and graph classification (Sun et al., 2020; Sato et al., 2021; You et al., 2021), thanks to the powerful generalization and representation learning capability of deep neural networks. Therefore, it may be possible to achieve graph-level clustering by performing classical clustering algorithms such as k-means (Hartigan & Wong, 1979) or spectral clustering over the graph-level representations produced by unsupervised graph representation learning methods (Grover & Leskovec, 2016; Narayanan et al., 2017; Adhikari et al., 2018; Sun et al., 2020). Although these GNN-based unsupervised graph-level representation learning methods have shown promising performance on downstream tasks such as node clustering and graph classification, they are not guaranteed to generate effective features for clustering entire graphs. In contrast, graph-level clustering may benefit from an end-to-end framework that learns clustering-oriented features during graph-level representation learning. We summarize our motivation as follows: 1) Graph-level clustering is an important but rarely studied problem, even though there has been much work on graph-level classification and node-level clustering. 2) The performance of graph kernels followed by spectral clustering, as well as of two-stage methods (deep graph-level feature learning followed by k-means or spectral clustering), has not been well explored. 3) An end-to-end deep-learning-based graph-level clustering method is expected to outperform graph kernels and two-stage methods because its feature learning is clustering-oriented.
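The two-stage alternative described above can be sketched in the same toy setting. Here a degree-histogram embedding is a deliberately crude, hypothetical stand-in for a learned unsupervised graph representation (e.g., graph2vec or InfoGraph), followed by a plain k-means; NumPy is assumed available.

```python
import numpy as np

def degree_histogram_embedding(adj, max_degree=5):
    """Hypothetical stand-in for a learned graph-level representation:
    embed each graph as a normalized histogram of its node degrees."""
    degrees = adj.sum(axis=1).astype(int)
    hist = np.bincount(np.minimum(degrees, max_degree), minlength=max_degree + 1)
    return hist / max(len(degrees), 1)

def kmeans(X, n_clusters, n_iters=100):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, n_clusters):
        d = np.min(((X[:, None] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# Stage 1: embed two triangles (all degrees 2) and two 3-node paths (degrees 1,2,1).
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
X = np.stack([degree_histogram_embedding(a) for a in [triangle, triangle, path, path]])
# Stage 2: cluster the fixed embeddings.
labels = kmeans(X, n_clusters=2)
```

The key point of the sketch is structural: the embedding in stage 1 is frozen before stage 2 runs, so no signal from the clustering objective can flow back into representation learning, which is precisely the limitation that motivates the end-to-end design proposed in this paper.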
Therefore, in this paper we propose a novel graph clustering method called Deep Graph-Level Clustering (DGLC). The proposed method is a fully unsupervised framework and yields clustering-oriented graph-level representations by jointly optimizing two objectives: representation learning and clustering. The main contributions of this paper are summarized as follows.
• We investigate the effectiveness of various graph kernels as well as unsupervised graph representation learning methods for graph-level clustering.
• We propose an end-to-end graph-level clustering method in which the clustering objective guides the representation learning for entire graphs, which we show to be much more effective than two-stage models.
• We conduct extensive comparative experiments on six benchmark datasets, comparing our method with five graph kernel methods and four cutting-edge GNN representation learning methods under three quantitative metrics and one qualitative (visualization) metric. Our method achieves state-of-the-art performance.

2. PRELIMINARIES

The notations used in this paper are listed in Table 1. In the next two subsections, we briefly introduce graph kernels and GNN-based graph-level representation learning methods, and illustrate how to apply them to graph-level clustering.

