SIGNED GRAPH DIFFUSION NETWORK

Abstract

Given a signed social graph, how can we learn appropriate node representations to infer the signs of missing edges? Signed social graphs have received considerable attention as models of trust relationships. Learning node representations is crucial for effectively analyzing graph data, and various techniques such as network embedding and graph convolutional networks (GCNs) have been proposed for learning on signed graphs. However, traditional network embedding methods are not trained end-to-end for a specific task such as link sign prediction, and GCN-based methods suffer from performance degradation as their depth increases. In this paper, we propose SIGNED GRAPH DIFFUSION NETWORK (SGDNET), a novel graph neural network that achieves end-to-end node representation learning for link sign prediction in signed social graphs. We propose a random walk technique specially designed for signed graphs so that SGDNET effectively diffuses hidden node features. Through extensive experiments, we demonstrate that SGDNET outperforms state-of-the-art models in terms of link sign prediction accuracy.

1. INTRODUCTION

Given a signed social graph, how can we learn appropriate node representations to infer the signs of missing edges? Signed social graphs model trust relationships between people with positive (trust) and negative (distrust) edges. Many online social services that allow users to express their opinions, such as Epinions (Guha et al., 2004) and Slashdot (Kunegis et al., 2009), are naturally represented as signed social graphs. Such graphs have attracted considerable attention for diverse applications including link sign prediction (Leskovec et al., 2010a; Kumar et al., 2016), node ranking (Jung et al., 2016; Li et al., 2019b), community analysis (Yang et al., 2007; Chu et al., 2016), graph generation (Derr et al., 2018a; Jung et al., 2020), and anomaly detection (Kumar et al., 2014).
Node representation learning is a fundamental building block for analyzing graph data, and many researchers have put tremendous effort into developing effective models for unsigned graphs. Graph convolutional networks (GCNs) and their variants (Kipf & Welling, 2017; Velickovic et al., 2018) have attracted great attention in the machine learning community, and recent works (Klicpera et al., 2019; Li et al., 2019a) have demonstrated substantial progress by addressing the performance degradation caused by over-smoothing (Li et al., 2018; Oono & Suzuki, 2020), i.e., node representations becoming indistinguishable as the number of propagation steps increases, or by the vanishing gradient problem (Li et al., 2019a) in the first generation of GCN models. However, all of these models have limited performance on node representation learning in signed graphs since they consider only unsigned edges under the homophily assumption (Kipf & Welling, 2017). Many recent studies have therefore considered signed edges, and they fall into two categories: network embedding and GCN-based models.
Network embedding methods (Kim et al., 2018; Xu et al., 2019b) learn node representations by optimizing an unsupervised loss that primarily aims to place two nodes' embeddings close together (or far apart) if they are positively (or negatively) connected. However, these methods are not trained jointly with a specific task in an end-to-end manner, i.e., the latent features and the task model are trained separately. Thus, their performance is limited unless each stage is tuned carefully. GCN-based models (Derr et al., 2018b; Li et al., 2020) extend graph convolutions to signed graphs using balance theory (Holland & Leinhardt, 1971) in order to properly propagate node features over signed edges. However, these models are direct extensions of existing GCNs and do not account for the over-smoothing problem, which degrades their performance. This problem prevents them from exploiting information from multi-hop neighbors when learning node features in signed graphs.
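To make the role of balance theory concrete, the following is a minimal sketch of the usual reading of structural balance: the sign implied by a multi-hop path is the product of its edge signs, and a triangle is balanced when that product is positive. The helper names `path_sign` and `is_balanced_triangle` are hypothetical and chosen only for illustration; this is not the propagation rule of any model cited above.

```python
# Illustrative sketch only: balance theory on edge signs (+1 = trust, -1 = distrust).
# `path_sign` and `is_balanced_triangle` are hypothetical helper names.

def path_sign(signs):
    """Return the sign implied by a path, i.e. the product of its edge signs."""
    result = 1
    for s in signs:
        if s not in (1, -1):
            raise ValueError("edge signs must be +1 or -1")
        result *= s
    return result


def is_balanced_triangle(s_ab, s_bc, s_ca):
    """A triangle is balanced when the product of its three edge signs is positive."""
    return path_sign([s_ab, s_bc, s_ca]) == 1


# Two-hop relationships under balance theory.
two_hop = {
    "friend of a friend": path_sign([1, 1]),
    "friend of an enemy": path_sign([1, -1]),
    "enemy of an enemy": path_sign([-1, -1]),
}

for relation, sign in two_hop.items():
    label = "positive" if sign == 1 else "negative"
    print(f"{relation}: {label}")
```

Under this reading, a signed GCN can decide whether a multi-hop neighbor should contribute to a node's "friend" or "enemy" representation by tracking the parity of negative edges along the path.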

