STARGRAPH: KNOWLEDGE REPRESENTATION LEARNING BASED ON INCOMPLETE TWO-HOP SUBGRAPH

Abstract

Conventional representation learning algorithms for knowledge graphs (KG) map each entity to a unique embedding vector, ignoring the rich information contained in its neighborhood. We propose StarGraph, a method that provides a novel way to utilize neighborhood information in large-scale knowledge graphs to obtain entity representations. An incomplete two-hop neighborhood subgraph is first generated for each target node, then processed by a modified self-attention network to obtain the entity representation, which replaces the entity embedding of conventional methods. We achieve state-of-the-art performance on ogbl-wikikg2 and competitive results on fb15k-237. The experimental results show that StarGraph is parameter-efficient, and the improvement on ogbl-wikikg2 demonstrates its effectiveness for representation learning on large-scale knowledge graphs.

1. INTRODUCTION

A Knowledge Graph (KG) is a directed graph with real-world entities as nodes and relationships between entities as edges. In this graph, each directed edge together with its head and tail entities forms a triple (head entity, relation, tail entity), indicating that the head and tail entities are connected by a relation. Knowledge graph embedding (KGE), also known as knowledge representation learning (KRL), aims to embed entities and relations into low-dimensional continuous vector spaces to characterize their latent semantic features. A scoring function is defined to measure the plausibility of triples in such spaces, and the embeddings of entities and relations are then learned by maximizing the total plausibility of the observed triples. These learned embeddings can be used for various tasks such as knowledge graph completion (Bordes et al., 2013; Wang et al., 2014), relationship extraction (Riedel et al., 2013), entity classification (Nickel et al., 2011), etc. The plausibility of each triple is calculated from the embeddings of the entities and relations in it, and the embeddings are directly looked up from the embedding tables. Such a shallow lookup makes these models inherently transductive. Moreover, the rich contextual information contained in the neighboring triples is not taken into account. Compared with shallow embedding models, methods that are able to encode neighborhood information usually perform much better across various KG datasets (Zhang & Chen, 2018; Zhang et al., 2021; Wang et al., 2019). Any generic graph neural network could be employed as the encoder. However, adapting these methods to large-scale knowledge graphs is problematic, since previous work (Nathani et al., 2019; Wang et al., 2020) takes the complete multi-hop subgraph of a node as input.
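As a concrete illustration of the shallow-lookup paradigm described above, the following sketch shows a minimal TransE-style scorer; the table sizes, dimensions, and entity/relation ids are illustrative, not taken from any real dataset:

```python
import numpy as np

def transe_score(head, relation, tail):
    """TransE plausibility: negative L1 distance between (head + relation)
    and tail; higher (closer to zero) means a more plausible triple."""
    return -np.linalg.norm(head + relation - tail, ord=1)

# Shallow lookup: each entity/relation id simply indexes a row of a table,
# so entities unseen during training have no embedding (transductive).
rng = np.random.default_rng(0)
entity_table = rng.normal(size=(5, 8))    # 5 entities, dimension 8
relation_table = rng.normal(size=(2, 8))  # 2 relations, dimension 8

h, r, t = entity_table[0], relation_table[1], entity_table[3]
score = transe_score(h, r, t)
```

The lookup rows carry no information about a node's neighbors, which is exactly the limitation the neighborhood-encoding methods above address.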
Due to the large number of nodes and edges, multi-hop subgraphs in large-scale graphs can easily exceed the size limitation, and both subgraph generation and network computation can be very time-consuming. The neighborhood certainly contains information about the target node, and can therefore be used to learn its representation. In order to adopt neighborhood neural encoders in large-scale KGs, an intuitive idea is to utilize partial neighborhood information instead of the complete multi-hop subgraph. In this paper, we propose to learn the knowledge representation of each target node based on its incomplete 2-hop neighborhood subgraph. Neighbors that cannot be reached within two hops are, in our view, less closely related to the target node, and are therefore not taken into consideration. An example of a complete 2-hop subgraph is given in Figure 1 (a), where we can see that, even in such a small knowledge graph, the 2-hop subgraph comprises quite a few nodes and edges and appears to contain much redundant information. It is more efficient to construct a proper incomplete subgraph with a few nodes and edges. We draw inspiration from the anchor-based strategy (Galkin et al., 2022), which selects a small fraction of all graph nodes as anchors and learns embeddings only for the anchors instead of all nodes. In our work, we sample anchors from the 2-hop neighborhood, along with the edges used to reach each anchor, to construct the incomplete 2-hop subgraph, as illustrated in Figure 1 (b). The incomplete subgraph is not restricted to contain only anchors; we can also sample some general nodes into the subgraph as supplementary information. In order to reasonably model the subgraph structure and enable sufficient interaction of node embeddings, we adopt a self-attention network to extract the neighborhood information. Taking the characteristics of knowledge graphs into consideration, we modify the attention module to be more efficient and propose a novel way to embed the edges in a graph.
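The sampling step described above can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: the helper names are hypothetical, edges are treated as undirected when collecting the neighborhood, anchors are drawn uniformly, and the edges used to reach each anchor (which the method also records) are omitted:

```python
import random
from collections import defaultdict

def two_hop_neighbors(edges, target):
    """Collect all nodes reachable from `target` within two hops.

    `edges` is a list of (head, relation, tail) triples; for neighborhood
    collection we ignore edge direction, a simplifying assumption."""
    adj = defaultdict(set)
    for h, _, t in edges:
        adj[h].add(t)
        adj[t].add(h)
    one_hop = adj[target]
    two_hop = set().union(*(adj[n] for n in one_hop)) if one_hop else set()
    return (one_hop | two_hop) - {target}

def sample_incomplete_subgraph(edges, target, anchors, k, seed=0):
    """Sample up to k anchors from the 2-hop neighborhood of `target`
    to form the node set of the incomplete subgraph."""
    rng = random.Random(seed)
    candidates = sorted(n for n in two_hop_neighbors(edges, target)
                        if n in anchors)
    return rng.sample(candidates, min(k, len(candidates)))
```

In practice the sampled anchor set, plus any supplementary general nodes, is what the self-attention encoder consumes in place of the full multi-hop subgraph.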
Comparing the subgraphs in Figure 1 (a) and (b), the trimmed subgraph is much more efficient and is likely to be more effective at describing the target node, especially for large-scale knowledge graphs; this is demonstrated by our experimental results. Representing a node by its incomplete subgraph is like locating a star in the sky: a few bright stars (anchors) are used to indicate its position. Since the incomplete subgraph within the entire graph looks like a constellation among the stars, we call the proposed method StarGraph.

2. RELATED WORK

Distance-based models constitute a major branch of knowledge graph embedding methods. TransE (Bordes et al., 2013) embeds relations and entities into the same vector space, interpreting a relation embedding as a translation from the head-entity embedding to the tail-entity embedding. RotatE (Sun et al., 2018) models relations as rotations of vectors, where inverse relations can be modeled with complex embeddings. PairRE (Chao et al., 2021) proposes to learn two embeddings for each relation, used respectively to map the head and tail entity embeddings into the corresponding relation space. TripleRE (Yu et al., 2021) learns an additional embedding for each relation on top of PairRE, and this extra relation embedding serves as a translation vector between the mapped head and tail entity embeddings.
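The scoring functions of this family can be sketched compactly as below. Vector shapes and norm choices are simplified (e.g. later TripleRE variants add further residual terms, and the published models apply margin and normalization tricks omitted here):

```python
import numpy as np

def transe(h, r, t):
    # TransE: relation as translation, h + r ≈ t
    return -np.linalg.norm(h + r - t, ord=1)

def rotate(h, r_phase, t):
    # RotatE: complex entity embeddings; relation is a rotation e^{i·phase}
    return -np.linalg.norm(h * np.exp(1j * r_phase) - t, ord=1)

def pairre(h, r_head, r_tail, t):
    # PairRE: two relation vectors project head and tail separately
    return -np.linalg.norm(h * r_head - t * r_tail, ord=1)

def triplere(h, r_head, r_mid, r_tail, t):
    # TripleRE: PairRE plus a translation between the projected entities
    return -np.linalg.norm(h * r_head + r_mid - t * r_tail, ord=1)
```

Each function returns a higher (less negative) score for more plausible triples; a perfect fit scores zero.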



Figure 1: Illustration of a subgraph generated by StarGraph. Dots and lines represent nodes and edges in the graph, respectively, with larger dots indicating anchors. Red marks the example target node and its sampled subgraph.

