BEYOND LINK PREDICTION: ON PRE-TRAINING KNOWLEDGE GRAPH EMBEDDINGS

Abstract

Knowledge graph embedding (KGE) models provide low-dimensional representations of entities and relations in a knowledge graph (KG). Most prior work focuses on training and evaluating KGE models for the task of link prediction; the question of whether or not KGE models provide useful representations more generally remains largely open. In this work, we explore the suitability of KGE models (i) for more general graph-structure prediction tasks and (ii) for downstream tasks such as entity classification. For (i), we found that commonly trained KGE models often perform poorly at structural tasks other than link prediction. Based on this observation, we propose a more general multi-task training approach, which includes additional self-supervised tasks such as neighborhood prediction or domain prediction. In our experiments, these multi-task KGE models showed significantly better overall performance for structural prediction tasks. For (ii), we investigate whether KGE models provide useful features for a variety of downstream tasks. Here we view KGE models as a form of self-supervised pre-training and study the impact of both model training and model selection on downstream task performance. We found that multi-task pre-training can (but does not always) significantly improve performance and that KGE models can (but do not always) compete with or even outperform task-specific GNNs trained in a supervised fashion. Our work suggests that more research is needed on the relation between pre-training KGE models and their suitability for downstream applications.

1. INTRODUCTION

Knowledge graph embeddings (KGE) provide low-dimensional representations of the entities and relations of a knowledge graph (KG). Although a large number of KGE models have been proposed in the literature (see, for example, the surveys of Nickel et al. (2015), Wang et al. (2017), and Ji et al. (2021)), most prior work focuses on the task of link prediction, i.e., answering questions such as (Austin, capitalOf, ?) by reasoning over an incomplete KG. In addition to link prediction, it is often argued that KGEs provide representations that capture semantic properties of the entities, and, indeed, pre-trained KGE models have been used to inject structured knowledge into language models (He et al., 2020; Zhang et al., 2019), visual models (Baier et al., 2017), recommender systems (El-Kishky et al., 2022; Wang et al., 2018), question answering systems (Ilyas et al., 2022), and other types of downstream models (Wang et al., 2017).

The question of whether pre-trained KGE models provide generally useful representations remains largely open. Likewise, it is not well understood how choices taken during model training and model selection affect these representations. In this work, we shed light on these questions from multiple directions. First, we study the suitability of out-of-the-box KGE models for basic graph-structure prediction tasks beyond link prediction. In particular, we consider the tasks of predicting the relation of a triple as suggested by Chang et al. (2020) (e.g., the relationship between Austin and Texas), the domain and range of a relation (e.g., whether Austin is a capital), as well as the entity and relation neighborhoods of each entity (e.g., which other entities are related to Austin). Perhaps surprisingly, we found that commonly trained KGE models often performed poorly on such tasks, challenging the intuition that KGE models capture graph structure well.
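To make the link prediction setup concrete, the following is a minimal sketch of how a translational KGE model such as TransE scores and ranks candidate answers to a query like (Austin, capitalOf, ?). The entity vocabulary and the randomly initialized embeddings are hypothetical stand-ins for a trained model; only the scoring and ranking logic illustrates the task.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary; in practice these come from the KG.
entities = ["Austin", "Texas", "Paris", "France"]
relations = ["capitalOf"]
dim = 8

# Randomly initialized embeddings stand in for a pre-trained KGE model.
E = {e: rng.normal(size=dim) for e in entities}
R = {r: rng.normal(size=dim) for r in relations}

def transe_score(h, r, t):
    """TransE plausibility of (h, r, t): -||h + r - t||, higher is better."""
    return -np.linalg.norm(E[h] + R[r] - E[t])

def predict_tail(h, r):
    """Link prediction: rank all entities as candidate tails for (h, r, ?)."""
    return sorted(entities, key=lambda t: transe_score(h, r, t), reverse=True)

ranking = predict_tail("Austin", "capitalOf")
print(ranking)  # entities ordered by decreasing plausibility
```

Standard evaluation metrics such as mean reciprocal rank or Hits@k are then computed from the position of the true tail in such a ranking; the structural tasks studied in this work (relation, domain/range, and neighborhood prediction) probe the same embeddings with different queries.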

