DEEP ADAPTIVE SEMANTIC LOGIC (DASL): COMPILING DECLARATIVE KNOWLEDGE INTO DEEP NEURAL NETWORKS

Abstract

We introduce Deep Adaptive Semantic Logic (DASL), a novel framework for automating the generation of deep neural networks that incorporates user-provided formal knowledge to improve learning from data. We provide formal semantics demonstrating that our knowledge representation captures all of first order logic and that finite sampling from infinite domains converges to correct truth values. DASL's representation improves on prior neuro-symbolic work by avoiding vanishing gradients, allowing deeper logical structure, and enabling richer interactions between the knowledge and learning components. We illustrate DASL through a toy problem in which we add structure to an image classification task and demonstrate that knowledge of that structure reduces data requirements by a factor of 1000. We apply DASL to a visual relationship detection task and demonstrate that the addition of commonsense knowledge improves performance by 10.7% in conditions of data scarcity.

1. INTRODUCTION

Early work on Artificial Intelligence focused on Knowledge Representation and Reasoning (KRR) through the application of techniques from mathematical logic [Genesereth & Nilsson (1987)]. The compositionality of KRR techniques provides expressive power for capturing expert knowledge in the form of rules or assertions (declarative knowledge), but they are brittle and unable to generalize or scale. Recent work has focused on Deep Learning (DL), in which the parameters of complex functions are estimated from data [LeCun et al. (2015)]. DL techniques learn to recognize patterns not easily captured by rules and generalize well from data, but they often require large amounts of data for learning and in most cases do not reason at all [Yang et al. (2017); Garcez et al. (2012); Marcus (2018); Weiss et al. (2016)].

In this paper we present Deep Adaptive Semantic Logic (DASL), a framework that attempts to take advantage of the complementary strengths of KRR and DL by fitting a model simultaneously to data and declarative knowledge. DASL enables robust abstract reasoning and application of domain knowledge to reduce data requirements and control model generalization. DASL represents declarative knowledge as assertions in first order logic. The relations and functions that make up the vocabulary of the domain are implemented by neural networks that can have arbitrary structure. The logical connectives in the assertions compose these networks into a single deep network that is trained to maximize their truth. Figure 1 provides an example network that implements a simple rule set through composition of network components performing image classification. The logical quantifiers "for all" and "there exists" generate subsamples of the data on which the network is trained. DASL treats labels as assertions about data, removing any distinction between knowledge and data.
This provides a mechanism by which supervised, semi-supervised, unsupervised, and distantly supervised learning can take place simultaneously in a single network under a single training regime.

The field of neuro-symbolic computing [Garcez et al. (2019)] focuses on combining logical and neural network techniques in general, and the approach of Serafini & Garcez (2016) may be the closest of any prior work to DASL. To generate differentiable functions that support backpropagation, these approaches replace the pure Boolean values 0 and 1 for False and True with continuous values from [0, 1] and select fuzzy logic operators to implement the Boolean connectives. These operators generally employ maximum or minimum functions, removing all gradient information at the limits,
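To make the soft connectives concrete, the following is a minimal sketch (not DASL's or Serafini & Garcez's actual operators) of truth values in [0, 1] combined with a product t-norm, which keeps gradients nonzero away from the limits, contrasted with the min-based Gödel conjunction, whose gradient with respect to the larger argument is exactly zero. The universal quantifier over a finite sample is approximated here as a conjunction over that sample; all function names are illustrative.

```python
# Soft Boolean connectives on truth values in [0, 1].
# These are illustrative choices, not the operators used by any
# specific neuro-symbolic system.

def product_and(a, b):
    # Product t-norm: smooth everywhere, so gradients flow to both inputs.
    return a * b

def product_or(a, b):
    # Dual co-norm of the product t-norm.
    return a + b - a * b

def soft_not(a):
    return 1.0 - a

def implies(a, b):
    # a -> b rewritten as not(a) or b.
    return product_or(soft_not(a), b)

def goedel_and(a, b):
    # Min-based t-norm: the output ignores the larger argument entirely,
    # so its gradient with respect to that argument is zero.
    return min(a, b)

def forall(truths):
    # "For all" over a finite sample, approximated as the conjunction
    # of the sampled truth values.
    p = 1.0
    for t in truths:
        p = product_and(p, t)
    return p
```

For example, `forall` applied to a sample containing one half-true assertion yields a partially true universal claim, and `implies(0.0, b)` is fully true for any consequent, matching the classical truth table at the endpoints.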

