LOSSLESS COMPRESSION OF STRUCTURED CONVOLUTIONAL MODELS VIA LIFTING

Abstract

Lifting is an efficient technique for scaling up graphical models generalized to relational domains by exploiting the underlying symmetries. Concurrently, neural models are continuously expanding from grid-like tensor data into structured representations, such as various attributed graphs and relational databases. To address the irregular structure of the data, the models typically build on the idea of convolution, effectively introducing parameter sharing in their dynamically unfolded computation graphs. The computation graphs themselves then reflect the symmetries of the underlying data, similarly to the lifted graphical models. Inspired by lifting, we introduce a simple and efficient technique to detect the symmetries and compress the neural models without any loss of information. We demonstrate through experiments that such compression can lead to significant speedups of structured convolutional models, such as various Graph Neural Networks, across tasks such as molecule classification and knowledge-base completion.

1. INTRODUCTION

Lifted, often referred to as templated, models use highly expressive representation languages, typically based on weighted predicate logic, to capture symmetries in relational learning problems (Koller et al., 2007). This includes learning from data such as chemical, biological, social, or traffic networks, as well as various knowledge graphs, relational databases, and ontologies. The idea has been studied extensively in probabilistic settings under the notion of lifted graphical models (Kimmig et al., 2015), with instances such as Markov Logic Networks (MLNs) (Richardson & Domingos, 2006) or Bayesian Logic Programs (BLPs) (Kersting & De Raedt, 2001). In a wider view, convolutions can be seen as instances of the templating idea in neural models, where the same parameterized pattern is carried around to exploit the underlying symmetries, i.e. some forms of shared correlations in the data. In this analogy, the popular Convolutional Neural Networks (CNNs) (Krizhevsky et al., 2012) themselves can be seen as a simple form of templated model, where the template corresponds to the convolutional filters, unfolded over regular spatial grids of pixels. The symmetries are even more prominent in structured, relational domains with discrete element types. With convolutional templates for regular trees, the analogy covers Recursive Neural Networks (Socher et al., 2013), popular in natural language processing. Extending to arbitrary graphs, the same notion covers works such as Graph Convolutional Networks (Kipf & Welling, 2016) and their variants (Wu et al., 2019), as well as various Knowledge-Base Embedding methods (Wang et al., 2017). Extending even further to relational structures, there are works integrating parameterized relational logic templates with neural networks (Sourek et al., 2018; Rocktäschel & Riedel, 2017; Marra & Kuželka, 2019; Manhaeve et al., 2018).
The common underlying principle of templated models is a joint parameterization of the symmetries, allowing for better generalization. However, standard lifted models, such as MLNs, offer another key advantage: under certain conditions, the model computations can be carried out efficiently without completely unfolding the template, often leading to exponential speedups (Kimmig et al., 2015). This is known as "lifted inference" (Kersting, 2012) and is utilized heavily in lifted graphical models as well as database query engines (Suciu et al., 2011). However, to the best of our knowledge, this idea has so far remained unexploited in neural (convolutional) models. The main contribution of this paper is thus a "lifting" technique to compress symmetries in convolutional models applied to structured data, which we refer to generically as "structured convolutional models".
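The core arithmetic behind lifted inference can be illustrated with a toy construction of our own (the functions and the choice of sum pooling with a tanh transformation are illustrative assumptions, not the paper's method): when a node pools k identical incoming messages, a sum aggregation need not enumerate all k copies, since a single transformed value multiplied by its multiplicity yields the same result.

```python
import numpy as np

# Toy illustration (our construction) of the lifted-inference principle:
# pooling k identical messages by enumeration vs. by multiplicity.

def pooled_naive(W, h_u, k):
    # fully unfolded: transform and sum k identical copies
    return sum(np.tanh(W @ h_u) for _ in range(k))

def pooled_lifted(W, h_u, k):
    # lifted: transform once, broadcast with multiplicity k
    return k * np.tanh(W @ h_u)
```

Both functions compute the same vector, but the lifted variant performs one transformation instead of k, which is the source of the speedups discussed above.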

1.1. RELATED WORK

The idea for the compression is inspired by lifted inference (Kersting, 2012) used in templated graphical models. The core principle is that all equivalent sub-computations can be effectively carried out in a single instance and broadcast into successive operations together with their respective multiplicities, potentially leading to significant speedups. While the corresponding "liftable" template formulae (or database queries) generating the isomorphisms are typically assumed to be given (Kimmig et al., 2015), we explore the symmetries from the unfolded ground structures, similarly to the approximate methods based on graph bisimulation (Sen et al., 2012). All the lifting techniques are then based on some form of first-order variable elimination (summation), and are inherently designed to explore structural symmetries in graphical models. In contrast, we aim to additionally explore functional symmetries, motivated by the fact that even structurally different neural computation graphs may effectively compute an identical function. Learning in neural networks is also principally different from the model counting-based computations in lifted graphical models in that it requires many consecutive evaluations of the model as part of the encompassing iterative training routine. Consequently, even though we assume to unfold a complete computation graph before it is compressed with the proposed technique, the resulting speedup over the subsequent training is still substantial. From the deep learning perspective, various model compression techniques have been proposed to speed up training, such as pruning, decreasing precision, and low-rank factorization (Cheng et al., 2017). However, to the best of our knowledge, the existing techniques are lossy in nature, with a recent exception of compressing ReLU networks based on identifying neurons with linear behavior (Serra et al., 2020). None of these works exploit the model computation symmetries.
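The merging of equivalent sub-computations can be sketched as follows. This is a hypothetical simplification of our own, not the paper's exact algorithm: nodes of a computation DAG are merged whenever they share a "signature", i.e. the same operation, the same parameters, and the same (already merged) children, so that each equivalence class is evaluated only once. Sorting the children assumes a permutation-invariant aggregation.

```python
# Hypothetical sketch of structural symmetry detection in a computation DAG:
# merge nodes with identical (operation, parameters, merged-children) signatures.

def compress(nodes):
    """nodes: topologically ordered list of (op, param_id, children) tuples,
    where children are indices into `nodes`. Returns the merged node list and
    a mapping from original node ids to their canonical representatives."""
    canonical = {}  # signature -> merged node id
    remap = {}      # original node id -> merged node id
    merged = []
    for idx, (op, param, children) in enumerate(nodes):
        # children are sorted, assuming a permutation-invariant aggregation
        sig = (op, param, tuple(sorted(remap[c] for c in children)))
        if sig not in canonical:
            canonical[sig] = len(merged)
            merged.append(sig)
        remap[idx] = canonical[sig]
    return merged, remap
```

For instance, two identical inputs feeding two identically parameterized transformations collapse into a single input and a single transformation, while the final aggregation then receives the surviving node with multiplicity two.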
The most relevant line of work is Lifted Relational Neural Networks (LRNNs) (Sourek et al., 2018), which, despite the name, provide only templating capabilities without lifted inference, i.e. they operate on complete, uncompressed ground computation graphs.

2. BACKGROUND

The compression technique described in this paper is applicable to a number of structured convolutional models, ranging from simple recursive (Socher et al., 2013) to fully relational neural models (Sourek et al., 2018). The common characteristic of the targeted learners is the use of convolution (templating), where the same parameterized pattern is carried over different subparts of the data (representation) with the same local structure, effectively introducing repetitive sub-computations in the resulting computation graphs, which we exploit in this work.

2.1. GRAPH NEURAL NETWORKS

Graph neural networks (GNNs) are currently the most prominent representatives of structured convolutional models, which is why we choose them to demonstrate the proposed compression technique. GNNs can be seen as an extension of the common CNN principles to completely irregular graph structures. Given an input sample graph $S_j$, they dynamically unfold a multi-layered computation graph $G_j$, where the structure of each layer $i$ follows the structure of the whole input graph $S_j$. To compute the values of layer $i$, each node $v$ from the input graph $S_j$ updates its own value $h(v)$ by aggregating $A$ ("pooling") the values of the adjacent nodes $u : edge(u, v)$, transformed by some parametric function $C_{W_1}$ ("convolution"), which is reused with the same parameterization $W_1^i$ within each layer $i$ as: $$\tilde{h}(v)^{(i)} = A^{(i)}\big(\{C^{(i)}_{W_1^i}(h(u)^{(i-1)}) \mid u : edge(u, v)\}\big)$$ The $\tilde{h}(v)^{(i)}$ can be further combined through another transformation $C_{W_2}$ with the central node's representation from the previous layer to obtain the final updated value $h(v)^{(i)}$ for layer $i$ as: $$h(v)^{(i)} = C^{(i)}_{W_2^i}\big(h(v)^{(i-1)}, \tilde{h}(v)^{(i)}\big)$$
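A minimal sketch of one such layer is given below, with illustrative instantiations of our own choosing (not fixed by the text above): the aggregation $A$ is an elementwise sum, and both convolutions $C_{W_1}$, $C_{W_2}$ are single affine-free transformations $\tanh(Wx)$, with $C_{W_2}$ applied to the concatenation of the central node's previous value and the aggregated message.

```python
import numpy as np

# Minimal sketch of one GNN layer following the two update equations above.
# Assumed instantiations: A = sum pooling, C_{W1}(x) = tanh(W1 @ x),
# C_{W2}(h, m) = tanh(W2 @ concat(h, m)).

def gnn_layer(h, edges, W1, W2):
    """h: dict node -> feature vector (layer i-1); edges: list of (u, v) pairs.
    Returns the updated features h(v)^(i) for every node v."""
    neighbors = {v: [] for v in h}
    for u, v in edges:
        neighbors[v].append(u)
    h_new = {}
    for v in h:
        # aggregate transformed neighbor values: A({C_{W1}(h(u)) | u: edge(u, v)})
        msg = sum((np.tanh(W1 @ h[u]) for u in neighbors[v]),
                  start=np.zeros(W1.shape[0]))
        # combine with the central node's previous representation via C_{W2}
        h_new[v] = np.tanh(W2 @ np.concatenate([h[v], msg]))
    return h_new
```

Note that two nodes with identical features and identical (transformed) neighborhoods necessarily receive identical updated values here, which is exactly the kind of repeated sub-computation the proposed compression removes.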

