DUAL GRAPH COMPLEMENTARY NETWORK

Abstract

As a powerful representation learning method for graph data, graph neural networks (GNNs) have become popular for tackling graph analytic problems. Although many attempts have been made in the literature to find strategies for extracting better embeddings of target nodes, few consider this issue from a comprehensive perspective. Most current GNNs employ a single aggregation method that extracts one kind of feature well while ignoring other, equally important features. In this paper, we develop a novel dual graph complementary network (DGCN) to learn representations complementarily. We use two different branches whose inputs are the same, composed of structure and feature information, and we enforce a complementary relationship between the two branches. Extensive experiments show that DGCN outperforms state-of-the-art methods on five public benchmark datasets.

1. INTRODUCTION

Although many attempts have been made in the literature to learn better target node representations, the feature extraction capabilities of most methods remain far from optimal, especially when only a small amount of data is labeled. In fact, compared with the expensive and laborious acquisition of labeled data, unlabeled data is much easier to obtain. Therefore, how to learn more useful representations with limited label information is a key direction of representation learning research. Methods addressing this issue, commonly referred to as semi-supervised learning, essentially assume that similar points have similar outputs. They can thus exploit the consistency of the data to make full use of the rich information in unsupervised data.

In the real world, it is common to have data with a specific topological structure, usually called graph data. The graph structure is typically expressed as connections between nodes. By aggregating neighborhood features and performing appropriate linear transformations, graph neural networks (GNNs) can map graph data into a low-dimensional, compact, and continuous feature space. Nevertheless, most GNNs rely on a single aggregation strategy, which is counter-intuitive: in social networks, for example, relationships between people are very complex, yet most traditional GNNs consider only a single type of connection between nodes and ignore other implicit information.

In this paper, our work focuses on learning node representations with GNNs in a semi-supervised way. Although there are already many graph-based semi-supervised learning methods (Kipf & Welling, 2016; Yang et al., 2016; Khan & Blumenstock, 2019), most of them capture only a single relationship between nodes. As a result, some information in the unsupervised data is usually ignored.
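The neighborhood aggregation just described can be sketched as a single GCN-style propagation layer. This is a minimal numpy illustration of the general idea, not the specific aggregation used in DGCN; the toy graph, features, and weight matrix are invented for the example:

```python
import numpy as np

# Toy undirected graph: 4 nodes, edges (0-1), (0-2), (2-3).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.5]])                 # node features, shape (n, d)

# Add self-loops and symmetrically normalize: A_hat = D^{-1/2}(A + I)D^{-1/2}.
A_tilde = A + np.eye(4)
d = A_tilde.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt

W = np.array([[0.5, -0.5],
              [0.25, 1.0]])                # a (fixed, toy) weight matrix
H = np.maximum(A_hat @ X @ W, 0.0)         # aggregate + transform + ReLU
```

Each row of `H` is a node embedding that mixes the node's own features with those of its neighbors, which is the "aggregate then linearly transform" step referred to above.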
To overcome this problem, we develop a novel dual graph complementary network (DGCN) to extract information from both the feature and topology spaces. The intuition behind our method is to learn from disagreement: network performance depends largely on the quality of the graph, which usually emphasizes the relevance of one attribute of the instances. Since we do not know in advance which attributes are most important, we consider both in the model design. Compared with traditional GNN-based methods, we perform two different aggregation strategies that emphasize different attributes, one from the perspective of node features and the other from the topological structure. To further exploit implicit information, we employ two networks with different structures to extract embeddings from the input features, so that node information can be propagated in different ways. The supervised loss ℓ_sup and a diversity constraint ℓ_div then guide the training. The two branches extract common information from the topology and feature spaces, and by exploiting disagreements between them, the model can capture information that a single branch would ignore. To demonstrate the effectiveness of our method, we conduct experiments on five public benchmark datasets.

The contributions of our work are summarized as follows:
• We propose a novel dual graph complementary network (DGCN) to fuse complementary information, which uses different graphs to aggregate nodes that are similar in certain attributes in a complementary way.
• By comparing against algorithms that use non-single graphs, we show that our complementary architecture extracts richer information.
• Through extensive evaluation on multiple datasets, we demonstrate the effectiveness of DGCN over state-of-the-art baselines.
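The training signal described above, a supervised loss ℓ_sup on labeled nodes plus a diversity constraint ℓ_div between the two branches, can be sketched roughly as follows. The exact form of ℓ_div in DGCN is not given here, so the correlation-based penalty, the fusion by summation, the weighting factor, and all toy tensors below are illustrative assumptions only:

```python
import numpy as np

def cross_entropy(logits, labels):
    # Softmax cross-entropy, averaged over the labeled nodes only (l_sup).
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(labels)), labels].mean()

def diversity_penalty(h1, h2):
    # One *hypothetical* l_div: penalize correlation between the two
    # branches' embeddings so each branch keeps distinct information.
    h1c = h1 - h1.mean(axis=0)
    h2c = h2 - h2.mean(axis=0)
    corr = (h1c * h2c).sum() / (np.linalg.norm(h1c) * np.linalg.norm(h2c) + 1e-8)
    return corr ** 2                   # in [0, 1] by Cauchy-Schwarz

# Toy outputs from a feature-space branch and a topology-space branch.
rng = np.random.default_rng(0)
h_feat = rng.normal(size=(6, 3))
h_topo = rng.normal(size=(6, 3))
logits = h_feat + h_topo               # one simple fusion choice
labels = np.array([0, 1, 2])           # only the first 3 nodes are labeled

loss = cross_entropy(logits[:3], labels) + 0.1 * diversity_penalty(h_feat, h_topo)
```

The key point is only the structure of the objective: a supervised term restricted to labeled nodes, plus a term that couples the two branches.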

2. RELATED WORK
2.1. SEMI-SUPERVISED LEARNING

Semi-supervised learning targets the case of insufficient data labels. Let X ∈ R^{n×d} be the feature matrix of the input nodes and Y = [y_ij] ∈ R^{n×k} be the label matrix, where k is the number of classes and y_ij = 1 indicates that the i-th node belongs to the j-th class. The data points are split into labeled and unlabeled points; accordingly, x_L and x_U denote the features of a labeled and an unlabeled instance, respectively. Ground-truth labels are available only for the labeled nodes. The main objective of semi-supervised learning is to extract supervised information from the labeled dataset while adequately exploiting the data distribution information contained in X. Semi-supervised learning algorithms fall into four categories:
1. Self-training semi-supervised learning (Lee, 2013): it uses high-confidence pseudo labels to expand the label set. Ideally, it can continuously improve network performance, but it is usually limited by the quality of the pseudo labels.
2. Graph-based semi-supervised learning: it propagates information between instances along the edges of a graph. It is typically a transductive learning method whose performance mainly depends on the aggregation algorithm.
3. Low-density separation methods (Joachims, 1999): they assume that the decision hyperplane is consistent with the data distribution, so it should pass through sparse regions of the data.
4. Pre-training semi-supervised learning: methods such as autoencoders (Vincent et al., 2008; Rifai et al., 2011) train the model on a reconstruction error and then fine-tune it using labeled data.
However, semi-supervised learning tasks prefer to obtain information related to the data distribution rather than all information about the samples. In this paper, we mainly focus on graph-based semi-supervised learning.
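The notation above can be made concrete with a small sketch; the sizes, label assignments, and index choices below are hypothetical, chosen only to illustrate the labeled/unlabeled split:

```python
import numpy as np

n, d, k = 5, 4, 3                      # nodes, feature dim, classes
X = np.random.default_rng(1).normal(size=(n, d))   # X in R^{n x d}

# One-hot label matrix Y in R^{n x k}; rows of unlabeled nodes stay zero,
# since ground-truth labels are available only for labeled nodes.
Y = np.zeros((n, k))
labeled = np.array([0, 2])             # indices of the labeled nodes
Y[0, 1] = 1.0                          # y_{0,1} = 1: node 0 is in class 1
Y[2, 0] = 1.0                          # y_{2,0} = 1: node 2 is in class 0

mask = np.zeros(n, dtype=bool)
mask[labeled] = True
X_L, X_U = X[mask], X[~mask]           # labeled vs. unlabeled features
```

A semi-supervised learner computes its supervised loss from `(X_L, Y[mask])` while still using all of `X` (including `X_U`) to model the data distribution.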

2.2. GRAPH-BASED SEMI-SUPERVISED LEARNING

In addition to features, graph-based semi-supervised learning methods (Kipf & Welling, 2016) exploit the topological edge connections between different instances. For many datasets, the graph is given along with the features. If the dataset's features do not contain relationships between samples, a graph can also be constructed by measuring the similarity between instance features (Zhu et al., 2003). In essence, the graph is a measure of how closely the instances are related.
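Such a similarity-based graph can be sketched as a k-nearest-neighbour construction over cosine similarity. This is an illustrative choice only; Zhu et al. (2003) and other methods may use different similarity measures, kernels, or values of k, and the toy features below are invented:

```python
import numpy as np

def knn_graph(X, k=2):
    """Build a symmetric k-nearest-neighbour adjacency from cosine similarity."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-8)
    S = Xn @ Xn.T                      # pairwise cosine similarities
    np.fill_diagonal(S, -np.inf)       # a node is never its own neighbour
    A = np.zeros_like(S)
    for i, row in enumerate(S):
        nbrs = np.argsort(row)[-k:]    # indices of the k most similar nodes
        A[i, nbrs] = 1.0
    return np.maximum(A, A.T)          # symmetrize (undirected graph)

# Two clear feature clusters: nodes {0, 1} and nodes {2, 3}.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
A = knn_graph(X, k=1)
```

With `k=1` the construction recovers the two clusters, connecting node 0 with node 1 and node 2 with node 3, so the resulting adjacency can serve as the input graph for any of the aggregation schemes discussed above.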

