VECODER - VARIATIONAL EMBEDDINGS FOR COMMUNITY DETECTION AND NODE REPRESENTATION

Abstract

In this paper, we study how to simultaneously learn two highly correlated tasks of graph analysis, i.e., community detection and node representation learning. We propose an efficient generative model called VECODER for jointly learning Variational Embeddings for Community Detection and node Representation. VECODER assumes that every node can be a member of one or more communities. The node embeddings are learned in such a way that connected nodes are not only "closer" to each other but also share similar community assignments. This joint learning framework leverages community-aware node embeddings for better community detection. We demonstrate on several graph datasets that VECODER outperforms many competitive baselines on all three tasks, i.e., node classification, overlapping community detection, and non-overlapping community detection. We also show that VECODER is computationally efficient and that its performance is robust to varying hyperparameters.

1. INTRODUCTION

Graphs are flexible data structures that model complex relationships among entities, with data points as nodes and the relations between them as edges. One important task in graph analysis is community detection, where the objective is to cluster nodes into multiple groups (communities). Each community is a set of densely connected nodes. The communities can be overlapping or non-overlapping, depending on whether they share nodes. Several algorithmic (Ahn et al., 2010; Derényi et al., 2005) and probabilistic (Gopalan & Blei, 2013; Leskovec & Mcauley, 2012; Wang et al., 2017; Yang et al., 2013) approaches to community detection have been proposed. Another fundamental task in graph analysis is learning node embeddings. These embeddings can then be used for downstream tasks such as graph visualization (Tang et al., 2016; Wang et al., 2016; Gao et al., 2011; Wang et al., 2017) and classification (Cao et al., 2015; Tang et al., 2015). In the literature, these tasks are usually treated separately. Although standard graph embedding methods capture the basic connectivity, the node embeddings are learned independently of community detection. For instance, a simple approach is to obtain the node embeddings via DeepWalk (Perozzi et al., 2014) and then obtain community assignments for each node using k-means or a Gaussian mixture model. Conversely, methods like Bigclam (Yang & Leskovec, 2013), which focus on finding the community structure in the dataset, perform poorly on node representation tasks such as node classification. This motivates us to study approaches that jointly learn community-aware node embeddings. Recently, several approaches, such as CNRL (Tu et al., 2018), ComE (Cavallari et al., 2017), and vGraph (Sun et al., 2019), have been proposed to learn the node embeddings and detect communities simultaneously in a unified framework.
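The two-stage baseline mentioned above (embeddings first, clustering second) can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the `embeddings` array is synthetic and merely stands in for embeddings produced by a method such as DeepWalk, and the `kmeans` helper is a plain textbook implementation written for this example, not code from any of the cited works.

```python
import numpy as np

def kmeans(embeddings, k, num_iters=50, seed=0):
    """Plain k-means over node embeddings; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct nodes (fancy indexing returns a copy).
    centers = embeddings[rng.choice(len(embeddings), size=k, replace=False)]
    for _ in range(num_iters):
        # Distance of every node to every center, shape (num_nodes, k).
        dists = np.linalg.norm(embeddings[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = embeddings[labels == c]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels, centers

# Assumption: synthetic stand-in for pretrained node embeddings
# (two well-separated groups of 10 nodes each, 16 dimensions).
rng = np.random.default_rng(1)
embeddings = np.vstack([
    rng.normal(loc=-5.0, scale=0.5, size=(10, 16)),
    rng.normal(loc=+5.0, scale=0.5, size=(10, 16)),
])

# Stage two: community assignments are derived purely from the embeddings.
labels, _ = kmeans(embeddings, k=2)
```

Note that the clustering step never influences how the embeddings were learned, which is exactly the independence that joint methods aim to remove.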
Several studies have shown that community detection is improved by incorporating the node representation in the learning process (Cao et al., 2015; Kozdoba & Mannor, 2015). The intuition is that the global structure of graphs learned during community detection can provide useful context for node embeddings and vice versa. The joint learning methods (CNRL, ComE and vGraph) learn two embeddings for each node. One node embedding is used for the node representation task. The second is the "context" embedding of the node, which aids in community detection. As CNRL and ComE are based on Skip-Gram (Mikolov et al., 2013) and DeepWalk (Perozzi et al., 2014), they inherit the "context" embedding from these methods for learning the neighbourhood information of the node. vGraph also requires two node embeddings for parameterizing two different distributions. In contrast, we propose learning a single community-aware node representation which is directly used for both tasks. In this way, we not only get rid of an extraneous node embedding but also reduce the computational cost.

In this paper, we propose an efficient generative model called VECODER for jointly learning both community detection and node representation. The underlying intuition behind VECODER is that every node can be a member of one or more communities. However, the node embeddings should be learned in such a way that connected nodes are "closer" to each other than unconnected nodes. Moreover, connected nodes should have similar community assignments. Formally, we assume that for the i-th node, the node embedding z_i is generated from a prior distribution p(z). Given z_i, the community assignment c_i is sampled from p(c_i | z_i), which is parameterized by node and community embeddings. In order to generate an edge (i, j), we sample another node embedding z_j from p(z) and the respective community assignment c_j from p(c_j | z_j).
Afterwards, the node embeddings and the respective community assignments of node pairs are fed to a decoder. The decoder ensures that the embeddings of both the nodes and the communities of connected nodes share high similarity. This enables learning node embeddings that are useful for both community detection and node representation tasks.

We validate the effectiveness of our approach on several real-world graph datasets. In Sec. 4, we show empirically that VECODER outperforms the baseline methods, including its direct competitors, on all three tasks, i.e., node classification, overlapping community detection, and non-overlapping community detection. Furthermore, we compare the computational cost of training different algorithms. VECODER is up to 40x more time-efficient than its competitors. We also conduct a hyperparameter sensitivity analysis, which demonstrates the robustness of our approach. Our main contributions are summarized below:
• We propose an efficient generative model called VECODER for joint community detection and node representation learning.
• We adopt a novel approach and argue that a single node embedding is sufficient for learning both the representation of the node itself and its context.
• Training VECODER is extremely time-efficient in comparison to its competitors.
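The generative story outlined in this section (sample z_i and z_j from p(z), sample c_i and c_j from p(c | z), then decode an edge) can be made concrete with a small sketch. Several choices below are assumptions made only for illustration and are not specified in the text so far: a standard normal prior for p(z), a softmax over node-community dot products for p(c | z), and a sigmoid decoder over the sum of node and community similarities. The names `sample_node`, `edge_probability`, and `community_emb` are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_node(community_emb, rng):
    """Sample a node embedding z ~ p(z) and a community c ~ p(c | z)."""
    dim = community_emb.shape[1]
    z = rng.standard_normal(dim)        # assumption: p(z) = N(0, I)
    probs = softmax(community_emb @ z)  # assumption: softmax over similarities
    c = rng.choice(len(probs), p=probs)
    return z, c, probs

def edge_probability(z_i, c_i, z_j, c_j, community_emb):
    """Hypothetical decoder: the score is high when both the node embeddings
    and the embeddings of the assigned communities are similar."""
    node_score = z_i @ z_j
    comm_score = community_emb[c_i] @ community_emb[c_j]
    return sigmoid(node_score + comm_score)

rng = np.random.default_rng(0)
num_communities, dim = 4, 8
community_emb = rng.standard_normal((num_communities, dim))

# Generate one candidate edge (i, j).
z_i, c_i, p_i = sample_node(community_emb, rng)
z_j, c_j, p_j = sample_node(community_emb, rng)
p_edge = edge_probability(z_i, c_i, z_j, c_j, community_emb)
```

In this sketch a single embedding per node feeds both the community distribution and the decoder, mirroring the single-embedding design argued for above; the actual parameterization used by the model may differ.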

2. RELATED WORK

Community Detection. Early community detection algorithms are inspired by clustering algorithms (Xie et al., 2013). For instance, spectral clustering (Tang & Liu, 2011) is applied to the graph Laplacian matrix to extract the communities. Similarly, several matrix factorization based methods have been proposed to tackle the community detection problem. For example, Bigclam (Yang & Leskovec, 2013) treats the problem as a non-negative matrix factorization (NMF) task. It aims to recover the node-community affiliation matrix and learns the latent factors which represent community affiliations of nodes. Another method, CESNA (Yang et al., 2013), extends Bigclam by modelling the interaction between the network structure and the node attributes. The performance of matrix factorization methods is limited by the capacity of bi-linear models. Some generative models, such as vGraph (Sun et al., 2019) and Circles (Leskovec & Mcauley, 2012), have also been proposed to detect communities in a graph.

Node Representation Learning. Many successful algorithms which learn node representations in an unsupervised way are based on random walk objectives (Perozzi et al., 2014; Tang et al., 2015; Grover & Leskovec, 2016; Hamilton et al., 2017). Some known issues with random-walk based methods (e.g., DeepWalk and node2vec) are: (1) they sacrifice the structural information of the graph by over-emphasizing the proximity information (Ribeiro et al., 2017), and (2) their performance depends heavily on hyperparameters such as walk length and number of hops (Perozzi et al., 2014; Grover & Leskovec, 2016). Recently, Gilmer et al. (2017) showed that graph convolutional encoder models greatly reduce the need for random-walk based training objectives. This is because graph convolutions enforce that neighboring nodes have similar representations. Some interesting GCN based approaches include graph autoencoders, e.g., GAE and VGAE (Kipf & Welling, 2016b), and DGI (Velickovic et al., 2019).

Joint community detection and node representation learning. In the literature, several attempts have been made to tackle both these tasks in a single framework. Most of these methods propose

