RETHINKING IDENTITY IN KNOWLEDGE GRAPH EMBEDDING

Abstract

Knowledge Graph Embedding (KGE) is a common method for completing real-world Knowledge Graphs (KGs) by learning embeddings of entities and relations. Beyond specific KGE models, previous work proposes a general framework based on groups. A group has a special element, the identity, that uniquely corresponds to the identity relation in KGs, which implies that identity should be represented uniquely. However, we find that this uniqueness cannot be modeled by bilinear-based models, revealing an inconsistency between the framework and the models. To this end, we propose a solution named Unit Ball Bilinear Model (UniBi). In addition to its theoretical superiority, it offers greater interpretability and improves performance by preventing ineffective learning with the fewest constraints. Experiments demonstrate that UniBi models the uniqueness and verify its interpretability and performance.

1. INTRODUCTION

Knowledge Graphs (KGs) store human knowledge in the form of triples (h, r, t), each of which represents a relation r between a head entity h and a tail entity t (Ji et al., 2021). KGs benefit many downstream tasks and applications, e.g., recommender systems (Zhang et al., 2016), dialogue systems (He et al., 2017), and question answering (Mohammed et al., 2018). Since actual KGs are usually incomplete, researchers are interested in predicting missing links to complete them. As a common solution, Knowledge Graph Embedding (KGE) completes KGs by learning low-dimensional representations of entities and relations.

Beyond the great advances in specific KGE models (Trouillon et al., 2016; Hitchcock, 1927; Chami et al., 2020; Liu et al., 2017; Nickel et al., 2011; Bordes et al., 2013), several works also attempt to unify these models with general frameworks, such as the promising ones based on groups (Yang et al., 2020; Xu & Li, 2019; Ebisu & Ichise, 2018). A group is an abstraction of an operation on a set, like addition on the integers. Just as addition has the special number 0, each group has a unique identity element. From the perspective of groups, this element requires that its counterpart in KGs, the identity relation, be represented uniquely. However, we find that such uniqueness cannot be modeled by bilinear-based models, which reveals an inconsistency between the framework and the models.

To present the problem more clearly, we first need to introduce some notation. We say that a model with a score function s(h, r, t) can model the uniqueness of identity if $\forall h \neq t,\ s(h, r, h) > s(h, r, t)$ holds if and only if r is identity, and its universal representation is unique. In addition, the score function s(•) of a bilinear-based model is $\mathbf{h}^\top \mathbf{R} \mathbf{t}$, where $\mathbf{h}$, $\mathbf{R}$, and $\mathbf{t}$ are the representations of h, r, and t.

In terms of such uniqueness, bilinear-based models have two flaws. On the one hand, Fig. 1(a) demonstrates a case where $\mathbf{e}_1^\top \mathbf{I} \mathbf{e}_1 < \mathbf{e}_1^\top \mathbf{I} \mathbf{e}_2$, which means that the relation matrices per se do not model identity perfectly. On the other hand, Fig. 1(b) shows that even if a matrix, e.g., $\mathbf{I}$, does model identity, its scaled version $k\mathbf{I}$ can also model identity and thus breaks the uniqueness.

Clearly, modeling this property requires both entities and relations to be restricted, which reduces expressiveness. To avoid this side effect, we make the cost negligible by minimizing the constraints to one per entity or relation while still modeling the desired property. Specifically, we normalize the vectors of the entities and the spectral radius of the matrices of the relations to 1. Since the model captures entities in a unit ball, as shown in Fig. 1(c), we name it Unit Ball Bilinear Model (UniBi).
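The two flaws and the effect of the normalization can be checked numerically. The following is a minimal sketch, not the authors' implementation: the helper names (`score`, `normalize_entity`, `normalize_relation`), the two-dimensional example vectors, and the scale factor `k = 3.0` are all assumptions chosen for illustration.

```python
import numpy as np

def score(h, R, t):
    """Bilinear score h^T R t."""
    return float(h @ R @ t)

def normalize_entity(e):
    """Rescale an entity vector to unit Euclidean norm (hypothetical helper)."""
    return e / np.linalg.norm(e)

def normalize_relation(R):
    """Rescale a relation matrix so that its spectral radius is 1 (hypothetical helper)."""
    rho = np.max(np.abs(np.linalg.eigvals(R)))
    return R / rho

I = np.eye(2)
e1 = np.array([1.0, 0.0])  # illustrative entity vectors, not from the paper
e2 = np.array([3.0, 1.0])

# Flaw 1: with unconstrained entities, I ranks (e1, e2) above (e1, e1).
flaw_1 = score(e1, I, e1) < score(e1, I, e2)

# With unit-norm entities, I ranks (u1, u1) strictly above (u1, u2).
u1, u2 = normalize_entity(e1), normalize_entity(e2)
identity_ok = score(u1, I, u1) > score(u1, I, u2)

# Flaw 2: a scaled matrix k*I yields the same ranking, so I is not unique.
k = 3.0  # illustrative scale factor
flaw_2 = score(u1, k * I, u1) > score(u1, k * I, u2)

# Normalizing the spectral radius to 1 maps k*I back to I.
restored = np.allclose(normalize_relation(k * I), I)

print(flaw_1, identity_ok, flaw_2, restored)
```

In this toy setting, the entity constraint removes the first flaw and the relation constraint collapses the scaled copies of $\mathbf{I}$ onto a single representative, which is the behavior the two normalizations are meant to provide.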

