LOSSLESS COMPRESSION OF STRUCTURED CONVOLUTIONAL MODELS VIA LIFTING

Abstract

Lifting is an efficient technique for scaling up graphical models generalized to relational domains by exploiting their underlying symmetries. Concurrently, neural models are continuously expanding from grid-like tensor data into structured representations, such as various attributed graphs and relational databases. To address the irregular structure of such data, the models typically extrapolate on the idea of convolution, effectively introducing parameter sharing into their dynamically unfolded computation graphs. The computation graphs themselves then reflect the symmetries of the underlying data, similarly to the lifted graphical models. Inspired by lifting, we introduce a simple and efficient technique to detect these symmetries and compress the neural models without loss of any information. We demonstrate through experiments that such compression can lead to significant speedups of structured convolutional models, such as various Graph Neural Networks, across tasks ranging from molecule classification to knowledge-base completion.

1. INTRODUCTION

Lifted, often referred to as templated, models use highly expressive representation languages, typically based on weighted predicate logic, to capture symmetries in relational learning problems (Koller et al., 2007). This includes learning from data such as chemical, biological, social, or traffic networks, and various knowledge graphs, relational databases, and ontologies. The idea has been studied extensively in probabilistic settings under the notion of lifted graphical models (Kimmig et al., 2015), with instances such as Markov Logic Networks (MLNs) (Richardson & Domingos, 2006) or Bayesian Logic Programs (BLPs) (Kersting & De Raedt, 2001). In a wider view, convolutions can be seen as instances of the templating idea in neural models, where the same parameterized pattern is carried around to exploit the underlying symmetries, i.e. some forms of shared correlations in the data. In this analogy, the popular Convolutional Neural Networks (CNNs) (Krizhevsky et al., 2012) themselves can be seen as a simple form of templated model, where the template corresponds to the convolutional filters, unfolded over regular spatial grids of pixels. The symmetries become even more noticeable in structured, relational domains with discrete element types. With convolutional templates for regular trees, the analogy covers Recursive Neural Networks (Socher et al., 2013), popular in natural language processing. Extending to arbitrary graphs, the same notion covers works such as Graph Convolutional Networks (Kipf & Welling, 2016) and their variants (Wu et al., 2019), as well as various Knowledge-Base Embedding methods (Wang et al., 2017). Extending even further to relational structures, there are works integrating parameterized relational logic templates with neural networks (Sourek et al., 2018; Rocktäschel & Riedel, 2017; Marra & Kuželka, 2019; Manhaeve et al., 2018).
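To make the parameter-sharing view concrete, the following is a minimal, hypothetical sketch of how a shared template creates symmetries in an unfolded computation graph, and how nodes that provably compute identical values can be merged. All names and the graph encoding are illustrative assumptions, not the actual algorithm or data structures used in this work:

```python
def compress(nodes):
    """Losslessly merge computation-graph nodes that compute the same value.

    `nodes` maps a node id to a triple (op, param_id, input_ids), given in
    topological order (insertion order of the dict). Two nodes with the same
    operation, the same shared (templated) parameters, and inputs that have
    already been merged to the same representatives must produce identical
    outputs, so they can be represented by a single node.
    """
    canonical = {}         # node id -> id of its merged representative
    seen = {}              # (op, param_id, canonical inputs) -> representative
    for nid, (op, param, inputs) in nodes.items():
        sig = (op, param, tuple(canonical[i] for i in inputs))
        canonical[nid] = seen.setdefault(sig, nid)
    return canonical

# Two atoms of the same type, processed by the same convolutional filter W1,
# unfold into symmetric subgraphs that collapse into one.
graph = {
    "a":   ("input", "C",  ()),
    "b":   ("input", "C",  ()),
    "h1":  ("conv",  "W1", ("a",)),
    "h2":  ("conv",  "W1", ("b",)),
    "out": ("sum",   None, ("h1", "h2")),
}
merged = compress(graph)  # "b" maps to "a", "h2" maps to "h1"
```

Note that the merged representation stays lossless: the `"out"` node keeps both occurrences of its (now shared) input, so multiplicities are preserved and the compressed graph evaluates to exactly the same values as the original.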
The common underlying principle of templated models is a joint parameterization of the symmetries, allowing for better generalization. However, standard lifted models, such as MLNs, provide another key advantage: under certain conditions, the model computations can be carried out efficiently without completely unfolding the template, often leading to exponential speedups (Kimmig et al., 2015). This is known as "lifted inference" (Kersting, 2012) and is utilized heavily in lifted graphical models as well as database query engines (Suciu et al., 2011). However, to the best of our knowledge, this idea has so far remained unexploited in neural (convolutional) models. The main contribution of

