STARGRAPH: KNOWLEDGE REPRESENTATION LEARNING BASED ON INCOMPLETE TWO-HOP SUBGRAPH

Abstract

Conventional representation learning algorithms for knowledge graphs (KG) map each entity to a unique embedding vector, ignoring the rich information contained in its neighborhood. We propose StarGraph, a novel method that exploits neighborhood information to obtain entity representations for large-scale knowledge graphs. An incomplete two-hop neighborhood subgraph is first generated for each target node, then processed by a modified self-attention network to obtain the entity representation, which replaces the entity embedding used in conventional methods. We achieve SOTA performance on ogbl-wikikg2 and competitive results on fb15k-237. The experimental results show that StarGraph is parameter-efficient, and the improvement on ogbl-wikikg2 demonstrates its effectiveness for representation learning on large-scale knowledge graphs.

1. INTRODUCTION

A Knowledge Graph (KG) is a directed graph with real-world entities as nodes and relationships between entities as edges. Each directed edge, together with its head and tail entities, forms a triple (head entity, relation, tail entity), indicating that the head and tail entities are connected by a relation. Knowledge graph embedding (KGE), also known as knowledge representation learning (KRL), aims to embed entities and relations into low-dimensional continuous vector spaces that capture their latent semantic features. A scoring function is defined to measure the plausibility of triples in such spaces, and the embeddings of entities and relations are learned by maximizing the total plausibility of the observed triples. The learned embeddings can then be used for various tasks such as knowledge graph completion (Bordes et al., 2013; Wang et al., 2014), relationship extraction (Riedel et al., 2013), entity classification (Nickel et al., 2011), etc. The plausibility of each triple is calculated from the embeddings of the entities and relations in it, and these embeddings are retrieved directly from embedding tables. Such a shallow lookup makes these models inherently transductive. Moreover, the rich contextual information contained in the neighboring triples is not taken into account. Compared with shallow embedding models, methods that encode neighborhood information usually perform much better across various KG datasets (Zhang & Chen, 2018; Zhang et al., 2021; Wang et al., 2019). Any generic graph neural network could be employed as the encoder. However, applying these methods to large-scale knowledge graphs is problematic, because previous work (Nathani et al., 2019; Wang et al., 2020) takes the complete multi-hop subgraph of the node as input.
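The shallow-lookup paradigm described above can be sketched with a TransE-style scoring function (Bordes et al., 2013), where the plausibility of (h, r, t) is the negative distance between h + r and t. The table sizes and random initialization below are illustrative assumptions, not values from any particular dataset:

```python
import numpy as np

# Shallow-lookup KGE in the style of TransE: each entity and relation is a
# row in an embedding table, and scoring is a pure table lookup plus a
# distance computation -- no neighborhood information is involved.
rng = np.random.default_rng(0)
num_entities, num_relations, dim = 1000, 50, 64  # illustrative sizes
entity_emb = rng.normal(size=(num_entities, dim))
relation_emb = rng.normal(size=(num_relations, dim))

def transe_score(head_id: int, rel_id: int, tail_id: int) -> float:
    """Higher score = more plausible triple; score = -||h + r - t||."""
    h = entity_emb[head_id]    # shallow lookup by entity id
    r = relation_emb[rel_id]
    t = entity_emb[tail_id]
    return -float(np.linalg.norm(h + r - t))

score = transe_score(0, 3, 42)
```

Because an entity's vector is fetched purely by its id, an entity never seen during training has no row in the table, which is exactly why such models are transductive.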
Due to the large number of nodes and edges, multi-hop subgraphs in large-scale graphs can easily exceed size limitations, and both subgraph generation and network computation can be very time-consuming. The neighborhood certainly contains information about the target node and can therefore be used to learn its representation. To adopt neighborhood neural encoders on large-scale KGs, an intuitive idea is to utilize partial neighborhood information instead of the complete multi-hop subgraph.
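This intuition can be sketched as follows: rather than expanding the full two-hop neighborhood, cap the number of neighbors kept at each hop. The cap value, the bidirectional edge traversal, and the triple format below are illustrative assumptions, not the paper's exact sampling procedure:

```python
import random
from collections import defaultdict

def build_adjacency(triples):
    """Index (relation, neighbor) pairs by entity, traversing edges both ways."""
    adj = defaultdict(list)
    for h, r, t in triples:
        adj[h].append((r, t))
        adj[t].append((r, h))
    return adj

def sample_two_hop(adj, target, max_per_hop=3, seed=0):
    """Sample an incomplete two-hop neighborhood around `target`.

    At most `max_per_hop` neighbors are kept per node, so the subgraph size
    is bounded regardless of how dense the full graph is.
    """
    rng = random.Random(seed)
    first_hop = adj[target]
    first = rng.sample(first_hop, min(max_per_hop, len(first_hop)))
    subgraph = {target: first}
    for _, nbr in first:
        second_hop = adj[nbr]
        subgraph[nbr] = rng.sample(second_hop, min(max_per_hop, len(second_hop)))
    return subgraph

triples = [("a", "r1", "b"), ("a", "r2", "c"), ("b", "r1", "d"), ("c", "r3", "e")]
sub = sample_two_hop(build_adjacency(triples), "a")
```

The key property is that the subgraph size is bounded by O(max_per_hop^2) triples per target node, so both generation and the downstream encoder cost stay constant even on graphs with millions of edges.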

