CAN NEURAL NETWORKS LEARN IMPLICIT LOGIC FROM PHYSICAL REASONING?

Abstract

Despite the success of neural network models in a range of domains, it remains an open question whether they can learn to represent abstract logical operators such as negation and disjunction. We test the hypothesis that neural networks without inherent inductive biases for logical reasoning can acquire an implicit representation of negation and disjunction. Here, implicit refers to limited, domain-specific forms of these operators, which work in psychology suggests may be a precursor (developmentally and evolutionarily) to the type of abstract, domain-general logic that is characteristic of adult humans. To probe neural networks, we adapt a test designed to diagnose the presence of negation and disjunction in animals and pre-verbal children, which requires inferring the location of a hidden object using constraints of the physical environment as well as implicit logic: if a ball is hidden in A or B, and shown not to be in A, can the subject infer that it is in B? Our results show that, although the neural networks learn to track objects through occlusion, they are unable to generalize to a task that requires implicit logic. We further show that the models fail to generalize to the test task even when they are trained directly on a logically identical (though visually dissimilar) task. However, transfer-learning experiments reveal that the models do recognize structural similarity between tasks that invoke the same logical reasoning pattern, suggesting that some desirable abstractions are learned, even if they are not yet sufficient to pass established tests of logical reasoning.

1. INTRODUCTION

People have the capacity for flexible logical reasoning. For example, given two alternatives (A or B) and subsequent information that allows ruling out one of them (not A), people can conclude with certainty that the other is true (therefore B), an inference termed reasoning by exclusion. It is an open question whether achieving similar reasoning with neural models will require explicit logical components to be built into the network architecture, or whether the capacity for such reasoning can be learned from data. Prior work on logical reasoning in neural networks (Marcus, 2001; Evans et al., 2018, among others) has focused on whether models are able to acquire abstract, domain-general logical operators, such as the ¬ and ∨ of first-order logic. However, recent psychological studies of logic in non-human animals and human infants have suggested that this powerful reasoning machinery does not appear in adults fully formed, ex nihilo. Rather, this work has argued that, both over the human lifespan and across species, domain-general logical operators may develop and evolve from scaffolding provided by precursors that are themselves more limited. These precursors are implicit logical operators that can differ from the explicit, domain-general forms in two ways: they might operate only on content in specific domains, and they might perform only some of the functional role of their full-fledged counterparts (Völter & Call, 2017; Bermudez, 2003; Cesana-Arlotti et al., 2018; see Feiman et al., 2022, for discussion). In this work, we ask whether neural network models, which lack explicit representations of the logical operators for negation and disjunction, can nonetheless acquire implicit representations of such operators via self-supervised training. In particular, we focus on implicit logic within the domain of physical reasoning.
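Stated as an inference rule, reasoning by exclusion corresponds to the disjunctive syllogism of propositional logic; the display below is our own illustration of the pattern, not notation drawn from the tasks or models themselves:

\[
(A \lor B),\; \lnot A \;\vdash\; B
\]

A learner that passes such a test must arguably implement some functional analogue of this rule, even if it applies only to representations of occluded object locations rather than to arbitrary propositions.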

