LOGICAL ENTITY REPRESENTATION IN KNOWLEDGE-GRAPHS FOR DIFFERENTIABLE RULE LEARNING

Abstract

Probabilistic logical rule learning has shown great strength in logical rule mining and knowledge graph completion. It learns logical rules to predict missing edges by reasoning on existing edges in the knowledge graph. However, previous efforts have largely been limited to only modeling chain-like Horn clauses such as R_1(x, z) ∧ R_2(z, y) ⇒ H(x, y). This formulation overlooks additional contextual information from neighboring sub-graphs of entity variables x, y, and z. Intuitively, there is a large gap here, as local sub-graphs have been found to provide important information for knowledge graph completion. Inspired by these observations, we propose Logical Entity RePresentation (LERP) to encode contextual information of entities in the knowledge graph. A LERP is designed as a vector of probabilistic logical functions on the entity's neighboring sub-graph. It is an interpretable representation while allowing for differentiable optimization. We can then incorporate LERP into probabilistic logical rule learning to learn more expressive rules. Empirical results demonstrate that with LERP, our model outperforms other rule learning methods in knowledge graph completion and is comparable or even superior to state-of-the-art black-box methods. Moreover, we find that our model can discover a more expressive family of logical rules. LERP can also be further combined with embedding learning methods like TransE to make it more interpretable.¹

1. INTRODUCTION

In recent years, the use of logical formulations has become prominent in knowledge graph (KG) reasoning and completion (Teru et al., 2020; Campero et al., 2018; Payani & Fekri, 2019), mainly because a logical formulation can enforce strong prior knowledge on the reasoning process. In particular, probabilistic logical rule learning methods (Sadeghian et al., 2019; Yang et al., 2017) have shown further desirable properties, including efficient differentiable optimization and an explainable logical reasoning process. These properties are particularly beneficial for KGs: KGs are often large in size, and modifying a KG can have social impact, so human-readable rationales are preferred.

Due to the large search space of logical rules, recent efforts (Sadeghian et al., 2019; Yang et al., 2017; Payani & Fekri, 2019) focus on learning chain-like Horn clauses of the following form:

r_1(x, z_1) ∧ r_2(z_1, z_2) ∧ ⋯ ∧ r_K(z_{K-1}, y) ⇒ H(x, y),

where the r_k denote relations and x, y, and the z_k denote entities in the graph. Even though this formulation is computationally efficient (see Section 3), it overlooks potential contextual information coming from the local sub-graphs neighboring the entities (variables x, y, and all z_k). However, this kind of contextual information can be important for reasoning on knowledge graphs. Figure 1(b) shows an example: if we only know that z is the mother of x and y, we cannot infer whether y is a brother or a sister of x. In Figure 1(c), however, with the contextual logical information ∃z′ : is_son_of(y, z′), we can infer that y is male and should therefore be the brother rather than the sister of x.

In this paper, we propose Logical Entity RePresentation (LERP) to incorporate information from local sub-graphs into probabilistic logical rule learning. LERP is a logical contextual representation for entities in a knowledge graph. For an entity e, a LERP L(e) is designed as a vector of logical functions L_i(e) on e's neighboring sub-graph and the enclosed relations. We then incorporate LERP into probabilistic logical rule learning methods to provide contextual information for entities. Different from other embedding learning methods, LERP encodes contextual information rather than identity information of entities. In the example of Figure 1, an ideal LERP for y might contain logical functions like L_i(y) = ∃z′ : is_son_of(y, z′). Therefore, when predicting the relation is_brother_of, the rule learning model can select L_i from the LERP and obtain the rule written in Figure 1(c). In our model, LERP can be jointly optimized with the probabilistic logical rules. We empirically show that our model outperforms previous logical rule learning methods on knowledge graph completion benchmarks.
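The chain-like formulation above admits an efficient matrix-based evaluation, in the style of TensorLog-based methods such as Neural LP: each relation is an adjacency matrix, and the conjunction of body atoms becomes a matrix product whose (x, y) entry counts connecting paths. The sketch below illustrates this on a hypothetical toy graph (entity and relation names are illustrative assumptions, not from the paper).

```python
import numpy as np

# Toy KG with 4 entities; each relation r is a binary adjacency
# matrix M_r with M_r[i, j] = 1 iff r(e_i, e_j) holds in the graph.
n = 4
M = {
    "r1": np.zeros((n, n)),
    "r2": np.zeros((n, n)),
}
M["r1"][0, 1] = 1.0   # fact r1(e0, e1)
M["r2"][1, 2] = 1.0   # fact r2(e1, e2)

def chain_rule_score(relations, M):
    """Score the chain body r1(x, z1) ∧ ... ∧ rK(z_{K-1}, y) for all (x, y).

    Multiplying the adjacency matrices counts the connecting paths, so
    score[x, y] > 0 iff the rule body is satisfiable for the pair (x, y).
    """
    score = np.eye(next(iter(M.values())).shape[0])
    for r in relations:
        score = score @ M[r]
    return score

body = chain_rule_score(["r1", "r2"], M)
# body[0, 2] > 0: r1(e0, z) ∧ r2(z, e2) holds via z = e1, predicting H(e0, e2)
```

In differentiable rule learners the hard relation choice at each step is replaced by a learned soft mixture over relation matrices, which is what makes the rule form trainable by gradient descent.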
We also find that LERP allows our model to compete with, and sometimes even exceed, strong black-box baselines. Moreover, LERP is itself an interpretable representation, so our model is able to discover more complex logical rules from data. In Section 5.4 we demonstrate that LERP can also be combined with embedding-learning models like TransE (Bordes et al., 2013) to construct a hybrid model that learns interpretable embeddings.

[Figure 1: Given the facts is_mother_of(z, x) and is_mother_of(z, y), panel (b) without contextual information cannot decide between is_brother_of(y, x) and is_sister_of(y, x); panel (c) adds the logical context ∃z′ : is_son_of(y, z′), yielding the rule is_mother_of(z, x) ∧ is_mother_of(z, y) ∧ ∃z′ : is_son_of(y, z′) ⇒ is_brother_of(y, x).]
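The family example above can be made concrete with a small sketch. This is not the paper's implementation, only an illustration under toy assumptions: the existential context ∃z′ : is_son_of(y, z′) is computed as a unary boolean feature over all entities at once, and then gates the chain body, breaking the brother/sister symmetry.

```python
import numpy as np

# Toy family KG: entities [x, y, z, z'] indexed 0..3.
n = 4
is_mother_of = np.zeros((n, n))
is_son_of = np.zeros((n, n))
is_mother_of[2, 0] = 1.0   # is_mother_of(z, x)
is_mother_of[2, 1] = 1.0   # is_mother_of(z, y)
is_son_of[1, 3] = 1.0      # is_son_of(y, z')

# A LERP-style unary feature: L(e) = ∃z' is_son_of(e, z'),
# evaluated for every entity at once via a row-sum threshold.
L_has_son_edge = (is_son_of.sum(axis=1) > 0).astype(float)

# Chain body is_mother_of(z, x) ∧ is_mother_of(z, y):
# body[x, y] counts shared mothers of x and y (paths x <- z -> y).
body = is_mother_of.T @ is_mother_of

# Without the context feature, body is symmetric in x and y, so
# is_brother_of cannot be told apart from is_sister_of. Gating the
# head entity y by L(y) makes the prediction direction-sensitive:
is_brother_of_pred = body * L_has_son_edge[None, :]
# is_brother_of_pred[x, y] > 0 realizes the rule from Figure 1(c),
# predicting is_brother_of(y, x) only when y has an is_son_of edge.
```

The gated score is nonzero for the (x, y) pair but zero for the reversed (y, x) pair, which is exactly the asymmetry the chain-only rule cannot express.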

2. RELATED WORK

Logical Rule Learning. This work is closely related to the problem of learning logical rules for knowledge graph reasoning, and thus also to the field of inductive logic programming (ILP). Traditional rule learning approaches search for logical rules using heuristic metrics such as support and confidence, and then learn a scalar weight for each rule. Representative methods include Markov Logic Networks (Richardson & Domingos, 2006; Khot et al., 2011), relational dependency networks (Neville & Jensen, 2007; Natarajan et al., 2010), rule mining algorithms (Galárraga et al., 2013; Meilicke et al., 2019), path finding and ranking approaches (Lao & Cohen, 2010; Lao et al., 2011; Chen et al., 2016), probabilistic personalized page rank (ProPPR) models (Wang et al., 2013; 2014a), ProbLog (De Raedt et al., 2007), CLP(BN) (Costa et al., 2002), SlipCover (Bellodi & Riguzzi, 2015), ProbFoil (De Raedt et al., 2015), and SafeLearner (Jain, 2019). However, most traditional methods face a large search space of logical rules, or rely on predefined heuristics to guide the search. These heuristic measures are mostly designed by humans and may not generalize across different tasks.

Differentiable Logical Rule Learning. Recently, another line of methods proposes jointly learning the logical rule form and the rule weights in a differentiable manner. Representative models include the embedding-based method of Yang et al. (2015), Neural LP (Yang et al., 2017), DRUM (Sadeghian et al., 2019), and RLvLR (Omran et al., 2018). Furthermore, some efforts have applied reinforcement learning to rule learning. The idea is to train agents to search for paths in the knowledge graph connecting the start and destination entities. Then, the connecting path can be ex-



¹All code and data are publicly available at https://github.com/Glaciohound/LERP.




Figure 1: Incorporating contextual information of entities can benefit missing link prediction.

Recent deep neural network models based on Graph Neural Networks (GNNs) (Teru et al., 2020; Mai et al., 2021) have utilized local sub-graphs as an important inductive bias in knowledge graph completion. Although GNNs can efficiently incorporate neighboring information via message-passing mechanisms to improve prediction performance (Zhang et al., 2019; Lin et al., 2022), they are not capable of discovering explicit logical rules, and the reasoning process of GNNs is largely unexplainable.
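For readers unfamiliar with message passing, the mechanism mentioned above can be sketched in a few lines. This is a generic mean-aggregation step on a toy graph, not the update rule of any of the cited models: each node's new feature is the average of its neighbors' features, which is how GNNs absorb local sub-graph information without producing an explicit rule.

```python
import numpy as np

# One round of mean-aggregation message passing on a toy 3-node graph:
# node 0 is connected to nodes 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)   # symmetric adjacency matrix
H = np.array([[1.0], [2.0], [4.0]])       # one scalar feature per node

deg = A.sum(axis=1, keepdims=True)        # node degrees
H_next = (A @ H) / np.clip(deg, 1, None)  # average over each node's neighbors
# H_next[0] = (2.0 + 4.0) / 2 = 3.0
```

The aggregated features feed a link predictor, but the learned computation stays implicit in the weights, in contrast to the explicit rules produced by LERP.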

