LOGICAL MESSAGE PASSING NETWORKS WITH ONE-HOP INFERENCE ON ATOMIC FORMULAS

Abstract

Complex Query Answering (CQA) over Knowledge Graphs (KGs) has attracted considerable attention for its potential to support many applications. Given that KGs are usually incomplete, neural models have been proposed to answer logical queries by parameterizing set operators with complex neural networks. However, such methods usually train neural set operators together with a large number of entity and relation embeddings from scratch, so it remains unclear whether and how the embeddings or the neural set operators contribute to the performance. In this paper, we propose a simple framework for complex query answering that decouples KG embeddings from neural set operators. We propose to represent complex queries as query graphs. On top of the query graph, we propose the Logical Message Passing Neural Network (LMPNN), which connects local one-hop inferences on atomic formulas to global logical reasoning for complex query answering. We leverage existing effective KG embeddings to conduct one-hop inferences on atomic formulas, the results of which are regarded as the messages passed in LMPNN. The reasoning process over the entire logical formula is turned into the forward pass of LMPNN, which incrementally aggregates local information to finally predict the answers' embeddings. The complex logical inference across different types of queries is then learned from training examples based on the LMPNN architecture. Theoretically, our query-graph representation is more general than the prevailing operator-tree formulation, so our approach applies to a broader range of complex KG queries. Empirically, our approach yields a new state-of-the-art neural CQA model. Our research bridges the gap between complex KG query answering tasks and the long-standing achievements of knowledge graph representation learning. Our implementation can be found at https://github.com/HKUST-KnowComp.

1. INTRODUCTION

Knowledge Graphs (KGs) are essential sources of factual knowledge supporting downstream tasks such as question answering (Zhang et al., 2018; Sun et al., 2020; Ren et al., 2021). Answering logical queries is a complex yet important way to utilize the given knowledge (Ren & Leskovec, 2020; Ren et al., 2021). Modern KGs (Bollacker et al., 2008; Suchanek et al., 2007; Carlson et al., 2010), though large in scale, are usually considered incomplete. This issue is well known as the Open World Assumption (OWA) (Ji et al., 2021). Representation learning methods mitigate the incompleteness by learning representations from the observed KG triples and generalizing them to unseen triples (Bordes et al., 2013; Trouillon et al., 2016; Sun et al., 2018; Zhang et al., 2019; Chami et al., 2020). When logical queries are posed over incomplete knowledge graphs, query answering models must not only predict the unseen knowledge but also execute logical operators such as conjunction, disjunction, and negation (Ren & Leskovec, 2020; Wang et al., 2021b). Recently, neural models for Complex Query Answering (CQA) have been proposed to complete the unobserved parts of the knowledge graph and answer the complex query simultaneously. These models aim to address complex queries that belong to an important subset of first-order queries. Formally speaking, complex queries are Existentially quantified First Order queries with a single free variable (EFO-1) (Wang et al., 2021b), containing logical conjunction, disjunction, and negation (Ren & Leskovec, 2020). EFO-1 queries are transformed into operator trees whose nodes are set operators, e.g., relational set projection, set intersection, set union, and set complement (Wang et al., 2021b). The key idea of these approaches is to represent entity sets in specific embedding spaces (Ren & Leskovec, 2020; Zhang et al., 2021; Chen et al., 2022).
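As an illustration of such query representations, the following sketch encodes a small conjunctive EFO-1 query as a set of atomic-formula edges, i.e., a query graph. The class name `AtomicFormula` and the toy entities and relations are illustrative assumptions, not the actual implementation described in this paper.

```python
# A minimal sketch of a query-graph encoding for an EFO-1 conjunctive query.
# Toy query: q(y) = exists x . borders(France, x) AND located_in(y, x)
#                   AND NOT capital_of(y, x)
from dataclasses import dataclass

@dataclass(frozen=True)
class AtomicFormula:
    """One edge of the query graph: a (possibly negated) binary predicate."""
    head: str      # source node (constant entity or variable)
    relation: str  # predicate symbol
    tail: str      # target node
    negated: bool = False

# Nodes are typed: constant entities, existential variables ("x"), one free variable ("y").
query_graph = [
    AtomicFormula("France", "borders", "x"),
    AtomicFormula("y", "located_in", "x"),
    AtomicFormula("y", "capital_of", "x", negated=True),
]

# Unlike an operator tree, the query graph needs no designated root, so it can
# express a broader range of EFO-1 queries.
variables = {t for f in query_graph for t in (f.head, f.tail) if t in ("x", "y")}
```

An operator-tree formulation would instead rewrite this query as nested set operations (projection, intersection, complement), which is only possible when the query admits a tree-shaped decomposition rooted at the free variable.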
Then, the set operators are parameterized by neural networks (Ren & Leskovec, 2020; Amayuelas et al., 2022; Bai et al., 2022), and the strict execution of the set operations is approximated by learning and applying continuous mappings over the embedding spaces. Experiments show that a classic KG representation (Trouillon et al., 2016) can easily outperform neural CQA models on one-hop queries, even though the latter model the one-hop projection with complex neural networks (Ren & Leskovec, 2020; Amayuelas et al., 2022; Bai et al., 2022). One possible reason is that neural set projection is sub-optimal at modeling inherent relational properties such as symmetry, asymmetry, inversion, and composition, which are extensively discussed in KG completion tasks and addressed by KG representations (Trouillon et al., 2016; Sun et al., 2018). On the other hand, the Continuous Query Decomposition (CQD) method (Arakelyan et al., 2021) searches for the best answers with a pretrained KG representation. Its logical inference step is modeled as an optimization problem in which the continuous truth value of an Existential Positive First Order (EPFO) query is maximized by altering the variable embeddings. However, the speed and performance of inference rely heavily on the optimization algorithm. CQD also assumes that the embeddings of entities and relations can reflect higher-order logical relations, which existing knowledge graph representation models do not generally guarantee. Moreover, it is unclear whether CQD can achieve good performance on complex queries with negation operators¹. In this paper, we aim to answer complex EFO-1 queries by equipping pretrained KG representations with logical inference power. First, we formulate the EFO-1 KG queries as Disjunctive Normal Form (DNF) formulas and propose to represent the conjunctive queries in the form of query graphs.
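The CQD-style inference discussed above can be sketched as gradient ascent on a continuous truth value. The toy sigmoid score, the product t-norm, the dimensions, and the numerical gradient below are all assumptions for illustration; CQD itself optimizes scores of a pretrained model such as ComplEx with t-norm relaxations of conjunction.

```python
# A hedged sketch of CQD-style inference: treat the continuous truth value of a
# two-atom EPFO query r1(c, x) AND r2(x, a) as an objective and optimize the
# existential variable's embedding x by (numerical) gradient ascent.
import numpy as np

rng = np.random.default_rng(0)
d = 8
e_const = rng.normal(size=d)                      # embedding of the anchor entity c
r1, r2 = rng.normal(size=d), rng.normal(size=d)   # toy relation embeddings
answer = e_const * r1 * r2                        # toy embedding of the answer a

def truth(v):
    # Sigmoid scores of the two atoms, combined with the product t-norm for AND.
    s1 = 1.0 / (1.0 + np.exp(-(e_const * r1) @ v))
    s2 = 1.0 / (1.0 + np.exp(-(v * r2) @ answer))
    return s1 * s2

v = rng.normal(size=d)                            # init the variable embedding
t0 = truth(v)
lr, eps = 0.1, 1e-4
for _ in range(300):
    # Central-difference gradient of the truth value w.r.t. the variable embedding.
    g = np.array([(truth(v + eps * np.eye(d)[i]) - truth(v - eps * np.eye(d)[i])) / (2 * eps)
                  for i in range(d)])
    v = v + lr * g                                # ascend the truth value
t_final = truth(v)
```

This on-the-fly optimization at query time is exactly what the inference speed and quality depend on, which motivates replacing it with a learned forward pass.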
In the query graph, each edge is an atomic formula containing a predicate with a possible negation operator. For each one-hop atomic formula, we use the pretrained KG representation to infer the intermediate embeddings given the neighboring entity embedding, relation embedding, direction information, and negation information. We show that this inference can be derived analytically from the KG representation's formulation. The results of one-hop atomic formula inference are interpreted as logical messages passed from one node to another. Based on this mechanism, we propose the Logical Message Passing Neural Network (LMPNN), in which node embeddings are updated by a Multi-Layer Perceptron (MLP) based on the aggregated logical messages. LMPNN coordinates the local logical messages produced by pretrained knowledge graph representations and predicts the answer embedding for a complex EFO-1 query. Instead of performing on-the-fly optimization over the query graph as CQD does (Arakelyan et al., 2021), we parameterize the query answering process as the forward pass of LMPNN, which is trained on observed KG query samples. Extensive experiments show that our approach is a new state-of-the-art neural CQA model, in which only one MLP network and two embedding vectors are trained. Interestingly, we show that the optimal number of LMPNN layers is the largest distance between the free variable node and the constant entity nodes. This makes it easy to generalize our approach to complex queries of arbitrary complexity. Hence, our approach bridges the gap between complex KG query answering tasks and the long-standing achievements of knowledge graph representation learning.
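The logical message passing idea above can be sketched as follows, assuming a TransE-style embedding (t ≈ h + r) so that one-hop messages have a closed form. The toy single-layer "MLP", the mean aggregation, and the sign-flip negation handling are illustrative stand-ins, not the actual LMPNN architecture.

```python
# A minimal sketch of logical message passing on a query graph under a
# TransE-style pretrained embedding: each atomic-formula edge sends the
# analytically inferred embedding of its other endpoint as a message.
import numpy as np

rng = np.random.default_rng(1)
d = 4
W = rng.normal(size=(2 * d, d)) * 0.1   # toy one-layer "MLP" weights

def message(src_emb, rel_emb, forward, negated):
    # One-hop inference under TransE: tail = head + r, head = tail - r.
    m = src_emb + rel_emb if forward else src_emb - rel_emb
    return -m if negated else m          # crude stand-in for negation handling

def lmpnn_layer(node_emb, edges, rel_emb):
    # edges: list of (src, rel, dst, negated); aggregate messages per node by mean.
    agg = {n: [] for n in node_emb}
    for src, rel, dst, neg in edges:
        agg[dst].append(message(node_emb[src], rel_emb[rel], True, neg))
        agg[src].append(message(node_emb[dst], rel_emb[rel], False, neg))
    out = {}
    for n, e in node_emb.items():
        msgs = np.mean(agg[n], axis=0) if agg[n] else np.zeros(d)
        out[n] = np.tanh(np.concatenate([e, msgs]) @ W)   # MLP-style update
    return out

# Query graph of "exists x . borders(France, x) AND located_in(y, x)":
node_emb = {"France": rng.normal(size=d), "x": np.zeros(d), "y": np.zeros(d)}
rel_emb = {"borders": rng.normal(size=d), "located_in": rng.normal(size=d)}
edges = [("France", "borders", "x", False), ("y", "located_in", "x", False)]
# Two layers suffice here: the free variable y is two hops from the constant.
state = lmpnn_layer(lmpnn_layer(node_emb, edges, rel_emb), edges, rel_emb)
```

After the forward pass, `state["y"]` plays the role of the predicted answer embedding, which would be scored against all entity embeddings to rank answers.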

2. RELATED WORKS

Knowledge graph representation. Representing relational knowledge is a long-standing topic in representation learning. Knowledge graph representations aim to predict unseen relational triples by embedding the discrete symbols in continuous spaces. Various algebraic structures (Bordes et al., 2013; Trouillon et al., 2016; Sun et al., 2018; Ebisu & Ichise, 2018; Zhang et al., 2019) are applied to capture relational patterns (Sun et al., 2018), and different geometric spaces (Chami et al., 2020; Cao et al., 2022) are explored to capture the hierarchical structures in knowledge graphs. As a result, entities and relations in large knowledge graphs can be efficiently represented in a continuous space.
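As a concrete example of such algebraic structures, a RotatE-style representation (entities as complex vectors, relations as unit-modulus rotations with t ≈ h ∘ r) captures composition and inversion patterns by construction. The following toy check is an illustration of the algebra, not a trained model.

```python
# A small sketch of how a RotatE-style KG embedding encodes relational patterns:
# under elementwise complex multiplication with unit-modulus relations,
# composition and inversion of relations fall out of the algebra.
import numpy as np

rng = np.random.default_rng(2)
d = 6
phase = lambda: np.exp(1j * rng.uniform(0, 2 * np.pi, size=d))  # unit-modulus relation

h = rng.normal(size=d) + 1j * rng.normal(size=d)  # toy entity embedding
r1, r2 = phase(), phase()

t1 = h * r1            # the triple r1(h, t1)
t2 = t1 * r2           # the triple r2(t1, t2)

r_comp = r1 * r2       # composition pattern: r3 = r1 o r2 holds by construction
r_inv = np.conj(r1)    # inversion pattern: r1^{-1} is the conjugate rotation
```

Because these patterns are built into the embedding algebra, a pretrained representation of this kind is a natural source of one-hop inference for complex queries.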



¹ Existing empirical evaluations of CQD are all conducted on queries without negation (Arakelyan et al., 2021).


