AN EFFICIENT MEAN-FIELD APPROACH TO HIGH-ORDER MARKOV LOGIC

Anonymous

Abstract

Markov logic networks (MLNs) are powerful models for symbolic reasoning that combine probabilistic modeling with relational logic. Inference algorithms for MLNs often operate at the level of propositional logic or require building a first-order probabilistic graph, and computational efficiency remains a challenge. The mean-field algorithm generalizes message passing for approximate inference in many intractable probabilistic graphical models, but in MLNs it still suffers from the high-order dependencies among the massive groundings, resulting in time complexity exponential in both the length and the arity of logic rules. We propose LogicMP, a novel method that simplifies logic message passing. In most practical cases, it reduces the complexity to polynomial for formulae in conjunctive normal form (CNF). We exploit the structure of CNF logic rules to sidestep the expectation computation over high-order dependencies, and then formulate logic message passing as Einstein summation to facilitate parallel computation, which can be further optimized by sequentially contracting the rule arguments. With LogicMP, we achieve clear improvements over competing methods on several reasoning benchmark datasets in both performance and efficiency. Specifically, AUC-PR on the UW-CSE and Cora datasets improves by more than 11% absolute, and inference is about ten times faster.

1. INTRODUCTION

Despite the remarkable progress in deep learning, symbolic reasoning is believed to be indispensable for the development of modern AI (Besold et al., 2021). Entities in the real world are interconnected through various relationships, forming massive relational data, which has motivated many logic-based models (Koller et al., 2007). Markov logic networks (MLNs) (Richardson and Domingos, 2006) are among the most well-known methods for symbolic reasoning over relational data, combining the strengths of relational logic and probabilistic graphical models. They use logic rules to define the potential function of a Markov random field (MRF) and thus soundly handle uncertainty in reasoning on various real-world tasks (Zhang et al., 2014; Poon and Domingos, 2007; Singla and Domingos, 2006b; Qu and Tang, 2019). Typical methods perform inference in MLNs at the level of propositional logic via standard probabilistic inference methods, such as Gibbs sampling (MCMC) (Gilks et al., 1995; Richardson and Domingos, 2006), slice sampling (MC-SAT) (Poon and Domingos, 2006), and belief propagation (Yedidia et al., 2000). However, the propositional probabilistic graph is extremely complicated, as it typically materializes all ground formulae as factors, and their number is exponential in the arity of the formulae. These methods must manipulate all such factors during inference, leading to exponential complexity. In addition, several lifted algorithms that treat whole sets of indistinguishable objects identically have been proposed, such as first-order variable elimination (de Salvo Braz et al., 2005), lifted BP (Singla and Domingos, 2008), and lifted MCMC (Niepert, 2012). However, these lifted methods typically become infeasible when the symmetric structure breaks down, e.g., when unique evidence is integrated for each variable.
ExpressGNN (Zhang et al., 2020) uses a dedicated graph neural network as the posterior model in variational EM training to amortize joint inference. However, its training complexity is still exponential in the length and arity of the formulae. Overall, designing algorithms for efficient MLN inference remains a challenge.
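To make the Einstein-summation formulation mentioned in the abstract concrete, the following is a minimal NumPy sketch of mean-field message aggregation for a single CNF rule. Everything here is illustrative rather than the paper's actual implementation: the clause (a standard smokers example), the domain size, the variable names, and the random marginals are all assumptions.

```python
import numpy as np

# Toy mean-field message aggregation for one CNF rule via einsum.
# All names, sizes, and the example clause are illustrative assumptions.

rng = np.random.default_rng(0)
n = 5  # number of constants in the domain

# Mean-field marginals q(atom = 1) for two predicates (random for the demo):
q_smokes = rng.random(n)        # q(Smokes(x) = 1)
q_friends = rng.random((n, n))  # q(Friends(x, y) = 1)

# Example clause: not Smokes(x) OR not Friends(x, y) OR Smokes(y).
# Under mean-field, the aggregated message into Smokes(y) is (up to the rule
# weight) the expected number of groundings of the clause in which every
# *other* literal is false, i.e. Smokes(x) = 1 and Friends(x, y) = 1.
# A single einsum contraction over the rule argument x computes the messages
# for all y in one parallel batch:
msg_to_smokes = np.einsum('x,xy->y', q_smokes, q_friends)

# Naive per-grounding loop, for comparison:
naive = np.array([sum(q_smokes[x] * q_friends[x, y] for x in range(n))
                  for y in range(n)])
assert np.allclose(msg_to_smokes, naive)
```

For rules with more arguments, the same idea extends to a multi-operand einsum, and the contraction order over rule arguments can be chosen to minimize the cost of the intermediate tensors, which is where the polynomial-complexity claim comes from in favorable cases.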

