EXPRESSIVE: A SPATIO-FUNCTIONAL EMBEDDING FOR KNOWLEDGE GRAPH COMPLETION

Abstract

Knowledge graphs are inherently incomplete. Therefore, substantial research has been directed toward knowledge graph completion (KGC), i.e., predicting missing triples from the information represented in the knowledge graph (KG). KG embedding models (KGEs) have yielded promising results for KGC, yet no current KGE is capable of: (1) fully capturing vital inference patterns (e.g., composition), (2) capturing prominent patterns jointly (e.g., hierarchy and composition), and (3) providing an intuitive interpretation of captured patterns. In this work, we propose ExpressivE, a fully expressive spatio-functional KGE that solves all these challenges simultaneously. ExpressivE embeds pairs of entities as points and relations as hyper-parallelograms in the virtual triple space R^(2d). This model design allows ExpressivE not only to capture a rich set of inference patterns jointly but also to display any supported pattern through the spatial relation of hyper-parallelograms, offering an intuitive and consistent geometric interpretation of ExpressivE embeddings and their captured patterns. Experimental results on standard KGC benchmarks show that ExpressivE is competitive with state-of-the-art KGEs and even significantly outperforms them on WN18RR.

1. INTRODUCTION

Knowledge graphs (KGs) are large collections of triples r_i(e_h, e_t) over relations r_i ∈ R and entities e_h, e_t ∈ E, used for representing, storing, and processing information. Real-world KGs such as Freebase (Bollacker et al., 2007) and WordNet (Miller, 1995) lie at the heart of numerous applications such as recommendation (Cao et al., 2019), question answering (Zhang et al., 2018), information retrieval (Dietz et al., 2018), and natural language processing (Chen & Zaniolo, 2017).

KG Completion. Yet, KGs are inherently incomplete, hindering the immediate utilization of their stored knowledge. For example, 75% of the people represented in Freebase lack a nationality (West et al., 2014). Therefore, much research has been directed toward the problem of automatically inferring missing triples, called knowledge graph completion (KGC). KG embedding models (KGEs), which embed the entities and relations of a KG into latent spaces and quantify the plausibility of unknown triples by computing scores based on these learned embeddings, have yielded promising results for KGC (Wang et al., 2017). Moreover, they have shown excellent knowledge representation capabilities, concisely capturing complex graph structures, e.g., entity hierarchies (Nickel & Kiela, 2017).

Inference Patterns. Substantial research has been invested in understanding which KGEs can capture which inference patterns, as summarized in Table 1. For instance, KGEs such as TransE (Bordes et al., 2013) and RotatE (Sun et al., 2019) can capture fundamental patterns such as composition. Recently, however, it was discovered that these two models can only capture a fairly limited notion of composition (Zhang et al., 2019; Abboud et al., 2020; Lu & Hu, 2020; Gao et al., 2020); cf. also Appendix K.1. Thus, multiple extensions have been proposed to tackle some of these limitations, focusing, e.g., on modeling non-commutative composition (Lu & Hu, 2020; Gao et al., 2020).
Yet, while these extensions resolved some limitations, the purely functional nature of TransE, RotatE, and all of their extensions still restricts them to capturing only compositional definition rather than general composition (see Table 1 for the defining formulas, and cf. Appendix K.1 for details).
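The limitation above can be made concrete with a toy sketch of TransE-style composition. The sketch below uses hypothetical, hand-picked 3-dimensional vectors (not learned embeddings) and illustrates only the standard TransE scoring idea: a triple r(h, t) is plausible if h + r ≈ t, so a composed relation is forced to be the single translation r1 + r2, mapping each head to exactly one tail.

```python
import numpy as np

# Hypothetical TransE-style embeddings (hand-picked, purely illustrative).
r1 = np.array([1.0, 0.0, 2.0])   # e.g., a relation "r1"
r2 = np.array([0.0, 3.0, -1.0])  # e.g., a relation "r2"
r3 = r1 + r2                     # TransE's only way to compose: r3 = r1 + r2

h = np.array([0.5, 0.5, 0.5])    # head entity
t1 = h + r1                      # tail reached from h via r1
t2 = t1 + r2                     # tail reached from t1 via r2

def score(h, r, t):
    """TransE plausibility: negative translational distance ||h + r - t||."""
    return -np.linalg.norm(h + r - t)

# If r1(h, t1) and r2(t1, t2) hold exactly, then r3(h, t2) also holds exactly:
assert np.isclose(score(h, r3, t2), 0.0)

# Limitation: since a translation is a function, r3 assigns each head
# exactly one most-plausible tail. Any other candidate tail scores strictly
# worse, so TransE captures compositional definition (r3 is *defined* as
# r1 followed by r2), not general composition, where r3(h, t) may hold
# for several distinct tails t.
t_other = t2 + np.array([1.0, 0.0, 0.0])
assert score(h, r3, t_other) < score(h, r3, t2)
```

The same functional argument applies to RotatE, where translations are replaced by rotations in the complex plane; the composed map remains a single function of the head entity.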

