AN EFFICIENT MEAN-FIELD APPROACH TO HIGH-ORDER MARKOV LOGIC

Anonymous

Abstract

Markov logic networks (MLNs) are powerful models for symbolic reasoning that combine probabilistic modeling with relational logic. Inference algorithms for MLNs often operate at the level of propositional logic or require building a first-order probabilistic graph, so computational efficiency remains a challenge. The mean-field algorithm generalizes message passing for approximate inference in many intractable probabilistic graphical models, but in MLNs it still suffers from the high-order dependencies among the massive groundings, resulting in time complexity exponential in both the length and the arity of logic rules. We propose a novel method, LogicMP, tailored to simplify logic message passing. In most practical cases, it reduces the complexity to polynomial for formulae in conjunctive normal form (CNF). We exploit the structure of CNF logic rules to sidestep the expectation computation over high-order dependencies, and then formulate the logic message passing as an Einstein summation to facilitate parallel computation, which can be optimized by sequentially contracting the rule arguments. With LogicMP, we achieve clear improvements over competing methods on several reasoning benchmark datasets in both performance and efficiency. Specifically, AUC-PR on the UW-CSE and Cora datasets improves by more than 11 absolute points, and training is about ten times faster.

1. INTRODUCTION

Despite the remarkable progress in deep learning, the ability to learn symbolically is believed to be indispensable for the development of modern AI (Besold et al., 2021). Entities in the real world are interconnected through various relationships, forming massive relational data, which has led to many logic-based models (Koller et al., 2007). Markov logic networks (MLNs) (Richardson and Domingos, 2006) are among the most well-known methods for symbolic reasoning in relational data, taking advantage of both relational logic and probabilistic graphical models. They use logic rules to define the potential function of a Markov random field (MRF) and thus soundly handle uncertainty in reasoning on various real-world tasks (Zhang et al., 2014; Poon and Domingos, 2007; Singla and Domingos, 2006b; Qu and Tang, 2019). Typical methods perform inference in MLNs at the level of propositional logic via standard probabilistic inference methods, such as Gibbs sampling (MCMC) (Gilks et al., 1995; Richardson and Domingos, 2006), slice sampling (MC-SAT) (Poon and Domingos, 2006), and belief propagation (Yedidia et al., 2000). However, the propositional probabilistic graph is extremely complicated, as it typically materializes all ground formulae as factors, whose number is exponential in the arity of the formulae. These methods must manipulate all such factors during inference, leading to exponential complexity. In addition, several lifted algorithms that treat whole sets of indistinguishable objects identically have been proposed, such as first-order variable elimination (de Salvo Braz et al., 2005), lifted BP (Singla and Domingos, 2008), and lifted MCMC (Niepert, 2012). However, these lifted methods typically become infeasible when the symmetric structure breaks down, e.g., when unique evidence is integrated for each variable.
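The grounding blow-up described above can be made concrete with a small sketch. The domain, rule, and numbers below are illustrative assumptions, not taken from the paper's experiments: grounding a formula of arity r over n constants yields n^r ground factors, and a ground clause of length L couples L atoms, so propositional inference that enumerates their joint states touches 2^L configurations per factor.

```python
from itertools import product

constants = ["A", "B", "C"]  # a tiny illustrative domain; real KBs have thousands

# Grounding f1(a, b) := ¬S(a) ∨ ¬F(a, b) ∨ S(b) instantiates every
# combination of constants for its arguments: |C|^arity ground formulae.
arity = 2
groundings = list(product(constants, repeat=arity))
assert len(groundings) == len(constants) ** arity  # 3^2 = 9 ground factors

# Each ground clause of length L couples L ground atoms, so inference
# that enumerates their joint states visits 2^L configurations per factor.
L = 3
states_per_factor = 2 ** L  # 8 configurations for a length-3 clause
```

With thousands of constants and higher-arity rules, both counts grow rapidly, which is the bottleneck the lifted and amortized methods above try to avoid.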
ExpressGNN (Zhang et al., 2020) uses a particular graph neural network as the posterior model in variational EM training to amortize the cost of joint inference. However, its training complexity is still exponential in the length and arity of the formulae. Overall, designing efficient MLN inference algorithms remains a challenge.
To mitigate the inference problem of MLNs, we adopt the mean-field (MF) algorithm (Wainwright and Jordan, 2008; Koller and Friedman, 2009) to reason over relational data approximately. The MF algorithm updates the marginals of variables by calculating the expected potential over other related variables. Although the MF iteration can be formulated as a modern neural network for several special conditional random fields (CRFs) (Zheng et al., 2015; Vemulapalli et al., 2016), its standard implementation for MLNs is still inefficient due to the expectation calculation over high-order related variables and the massive groundings, resulting in computational complexity exponential in the length and the arity of formulae, respectively. We show that the MF iteration in MLNs can be efficiently formulated as a special neural network, LogicMP, which propagates messages specifically for the logic rules. In most practical cases, for logic rules in conjunctive normal form (CNF), it removes both exponents and effectively reduces the complexity to polynomial. Fig. 1 gives an overview of our approach with an example. For a given MLN (left), the MF algorithm unfolds the MLN inference into several iterations of forward computation (right). Specifically, LogicMP tackles the two essential problems of the standard update from two perspectives. First, we show that the expectation calculation over the high-order related variables of the logic rules in the MF update can be removed owing to the structure of CNF logic rules: only a single remaining state needs to be considered, which forms an implication path from the related variables (as the premise) to the updated variable (as the hypothesis). The block with the dashed line across both sides of Fig. 1 illustrates how to compute the message w.r.t. a single ground formula.
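The CNF shortcut for a single grounding can be sketched as follows. This is a minimal illustration under assumed values, using the example clause f1(a, b) := ¬S(a) ∨ ¬F(a, b) ∨ S(b) from Fig. 1; the marginals and weight are hypothetical, and the naive enumeration stands in for the generic MF expectation, not for the paper's exact update equations.

```python
import itertools

# Marginals Q(x = 1) for the two premise atoms of one grounding of
# f1(a, b) := ¬S(a) ∨ ¬F(a, b) ∨ S(b); values are illustrative.
q_S_a, q_F_ab = 0.9, 0.7
w = 1.5  # rule weight (illustrative)

def clause_satisfied(s_a, f_ab, s_b):
    return (not s_a) or (not f_ab) or s_b

# Naive mean-field message: expected potential over all 2^2 premise
# states, once for each state of the updated atom S(b).
def naive_message(s_b):
    m = 0.0
    for s_a, f_ab in itertools.product([0, 1], repeat=2):
        p = (q_S_a if s_a else 1 - q_S_a) * (q_F_ab if f_ab else 1 - q_F_ab)
        m += p * w * clause_satisfied(s_a, f_ab, s_b)
    return m

# CNF shortcut: the clause is falsified by exactly one assignment,
# S(a) = 1, F(a, b) = 1, S(b) = 0, so the two messages differ by
# w * Q_S(a)(1) * Q_F(a,b)(1); no sum over premise states is needed.
shortcut_diff = w * q_S_a * q_F_ab
assert abs((naive_message(1) - naive_message(0)) - shortcut_diff) < 1e-12
```

Because only the difference between the two messages affects the normalized marginal, the exponential enumeration over premise states collapses to a single product over premise marginals.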
Second, based on this finding, we further show that the computation and aggregation of massive groundings can be efficiently implemented by Einstein summation (Einsum). This not only aggregates the grounding messages in parallel but also suggests a route to complexity reduction: the Einsum operation can generally be optimized by contracting the arguments sequentially. For instance, the MF iteration for chain rules can be implemented with cubic complexity despite the number of groundings being exponential in the arity of the formulae. Experimental results on benchmark datasets demonstrate the effectiveness and efficiency of LogicMP, boosting the average AUC-PR by more than 11 absolute points over many competitive methods with much faster training (about 10×) on the UW-CSE (Richardson and Domingos, 2006) and Cora (Singla and Domingos, 2005) datasets. Our reproducible source code is attached in the supplementary file and will be released publicly.
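The Einsum aggregation can be sketched with NumPy. This is a minimal illustration for the rule f1(a, b) := ¬S(a) ∨ ¬F(a, b) ∨ S(b) from Fig. 1, with hypothetical marginals; the arrays qS and qF stand for the predicate-grouped marginals Q(· = 1), and the loop version is shown only to verify the contraction.

```python
import numpy as np

n = 4  # number of entities (illustrative)
rng = np.random.default_rng(0)

# Predicate-grouped marginal probabilities Q(x = 1):
# qS[a]    = Q_{S(a)}(1),   shape (n,)
# qF[a, b] = Q_{F(a,b)}(1), shape (n, n)
qS = rng.random(n)
qF = rng.random((n, n))

# For the implication S(a) ∧ F(a, b) → S(b), the message into S(b)
# aggregates the premise probabilities over all groundings of a.
# A naive double loop enumerates O(n^2) groundings explicitly:
msg_loop = np.zeros(n)
for b in range(n):
    for a in range(n):
        msg_loop[b] += qS[a] * qF[a, b]

# The same aggregation as one Einstein summation, contracting argument a:
msg_einsum = np.einsum("a,ab->b", qS, qF)

assert np.allclose(msg_loop, msg_einsum)
```

For longer rules, passing optimize=True lets np.einsum contract the arguments sequentially, e.g. "a,ab,bc->c" for a chain rule costs two pairwise contractions rather than a single pass over all groundings, which is the source of the polynomial complexity noted above.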



Figure 1: Left: A Markov logic network (MLN) with two entities {A, B}, predicates F (Friend), S (Smoke), and C (Cancer), and formulae f1(a, b) := ¬S(a) ∨ ¬F(a, b) ∨ S(b), f2(a) := (¬S(a) ∨ C(a)) ∧ (S(a) ∨ ¬C(a)). The circles are the ground atoms, where the shaded ones are latent, and the blocks such as f1(A, B) denote the ground formulae. Right: We expand the MLN inference into several mean-field iterations, each implemented by a LogicMP layer. The calculation of the message w.r.t. a single grounding is illustrated in the "grounding message" block with dashed lines: the message from S(A) and F(A, B) w.r.t. f1(A, B) can be simplified as Q_S(A)(1) Q_F(A,B)(1). We then formulate parallel aggregation on the right via the Einstein summation (Einsum). Specifically, the marginals of ground atoms are grouped by predicate as the basic units of computation (the gray border denotes the marginals with ¬). In each iteration, LogicMP takes them as input and performs an Einsum for each logic rule (shown in the middle block). Intuitively, Einsum counts the groundings that derive the hypothesis for each implication statement of the logic rules (cf. Sec. 4). The outputs of Einsum are then used to update the grouped marginals. This procedure loops for T steps until convergence. Note that f2 is in conjunctive normal form and its two clauses can be treated separately. The Einsum is also applicable to predicates with more than two arguments.

