EM-RBR: A REINFORCED FRAMEWORK FOR KNOWLEDGE GRAPH COMPLETION FROM REASONING PERSPECTIVE

Abstract

Knowledge graph completion aims to predict missing links in a knowledge graph (KG), i.e., to predict the possibility that a certain triplet belongs to the knowledge graph. Most mainstream embedding methods focus on the fact triplets contained in the given KG while ignoring the rich background information that logic rules derived from the knowledge base implicitly provide. Because they are limited to modeling in an algebraic space, embedding models usually run into contradictions when expressing certain relational patterns. As a result, their representation of the knowledge graph is incomplete and inaccurate. To solve this problem, we propose a general framework named EM-RBR (embedding and rule-based reasoning), which combines the advantages of rule-based reasoning with those of state-of-the-art embedding models. EM-RBR uses the relational background knowledge contained in rules to conduct multi-relation reasoning for link prediction. In this way, we can find the most reasonable explanation for a given triplet and obtain higher prediction accuracy. In experiments, we demonstrate that EM-RBR achieves better performance than previous models on FB15k, WN18 and our new dataset FB15k-R, especially on the new dataset, where our model outperforms state-of-the-art models by a wider margin. We make the implementation of EM-RBR available at https://github.com/1173710224/link-prediction-with-rule-based-reasoning.

1. INTRODUCTION

A knowledge graph (KG) conveys knowledge about the world and expresses it in a structured representation. The rich structured information provided by knowledge graphs has become an extremely useful resource for many Artificial Intelligence related applications such as query expansion (Graupmann et al., 2005), word sense disambiguation (Wasserman Pritsker et al., 2015), and information extraction (Hoffmann et al., 2011). A typical knowledge representation in a KG is multi-relational data stored in RDF format, e.g. (Paris, Capital-Of, France). However, due to the discrete nature of the logic facts (Wang & Cohen, 2016), the knowledge contained in the KG is bound to be incomplete (Sadeghian et al., 2019). Consequently, knowledge graph completion (KGC), which attempts to predict whether a new triplet is likely to belong to the KG by leveraging its existing triplets, has received more and more attention. Currently, the popular embedding-based KGC methods aim at embedding the entities and relations of a knowledge graph into a low-dimensional latent feature space. The implicit relationships between entities can then be inferred by comparing their representations in this vector space. Prior work (Bordes et al., 2013; Mikolov et al., 2013; Wang et al., 2014; Ji et al., 2015; Lin et al., 2015; Nguyen et al., 2017) has contributed increasingly reasonable and competent embeddings. However, the overall effectiveness is highly correlated with the density of the knowledge graph, because embedding methods typically fail to predict weak and hidden relations that occur with low frequency. Since the training set for embedding cannot contain all factual triplets, the embedding converges to a solution that is not suitable for triplets involving weak relations. Reasoning over the hidden relations, however, can convert the testing target into an easier one.
For example, there is an existing triplet (Paul, Leader-Of, SoccerTeam) and a rule Leader-Of(x, y) ⇒ Member-Of(x, y) which indicates the leader of a soccer

