DOES INJECTING LINGUISTIC STRUCTURE INTO LANGUAGE MODELS LEAD TO BETTER ALIGNMENT WITH BRAIN RECORDINGS?

Abstract

Neuroscientists evaluate deep neural networks for natural language processing as possible candidate models for how language is processed in the brain. These models are often trained without explicit linguistic supervision, but have been shown to learn some linguistic structure in the absence of such supervision (Manning et al., 2020), potentially questioning the relevance of symbolic linguistic theories in modeling such cognitive processes (Warstadt & Bowman, 2020). We evaluate across two fMRI datasets whether language models align better with brain recordings if their attention is biased by annotations from syntactic or semantic formalisms. Using structure from dependency or minimal recursion semantics annotations, we find that alignment improves significantly for one of the datasets; for the other, results are more mixed. We present an extensive analysis of these results. Our proposed approach enables the evaluation of more targeted hypotheses about the composition of meaning in the brain, expanding the range of possible scientific inferences a neuroscientist could make, and opens up new opportunities for cross-pollination between computational neuroscience and linguistics.

1. INTRODUCTION

Recent advances in deep neural networks for natural language processing (NLP) have generated excitement among computational neuroscientists, who aim to model how the brain processes language. These models are argued to better capture the complexity of natural language semantics than previous computational models, and are thought to represent meaning in a way that is more similar to how it is hypothesized to be represented in the human brain. For neuroscientists, these models provide possible hypotheses for how word meanings compose in the brain. Previous work has evaluated the plausibility of such candidate models by testing how well representations of text extracted from these models align with brain recordings of humans during language comprehension tasks (Wehbe et al., 2014; Jain & Huth, 2018; Gauthier & Ivanova, 2018; Gauthier & Levy, 2019; Abnar et al., 2019; Toneva & Wehbe, 2019; Schrimpf et al., 2020; Caucheteux & King, 2020), and found some correspondences. However, modern NLP models are often trained without explicit linguistic supervision (Devlin et al., 2018; Radford et al., 2019), and the observation that they nevertheless learn some linguistic structure has been used to question the relevance of symbolic linguistic theories. Whether injecting such symbolic structures into language models would lead to even better alignment with cognitive measurements, however, has not been studied. In this work, we address this gap by training BERT (§3.1) with structural bias, and evaluate its alignment with brain recordings (§3.2). Structure is derived from three formalisms (UD, DM, and UCCA; §3.3), which come from different linguistic traditions and capture different aspects of syntax and semantics.
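To make the idea of attention-based structural bias concrete, the sketch below shows one generic way a graph from such a formalism could constrain a self-attention head: tokens are restricted to attend to themselves and to their neighbours in the annotation graph. This is an illustrative toy implementation under our own simplifying assumptions (a single head, a hard binary mask, an undirected graph), not the exact scheme used in this paper.

```python
import numpy as np

def structural_attention(scores, arcs, bias=-1e9):
    """Bias raw attention logits so each token attends only to itself
    and to tokens it is linked to in a dependency/semantic graph.

    scores: (n, n) matrix of raw attention logits.
    arcs:   list of (head, dependent) index pairs, treated as undirected.
    """
    n = scores.shape[0]
    mask = np.eye(n, dtype=bool)           # every token may attend to itself
    for h, d in arcs:
        mask[h, d] = mask[d, h] = True     # allow attention along graph edges
    biased = np.where(mask, scores, bias)  # suppress non-linked positions
    # row-wise softmax over the (masked) logits
    e = np.exp(biased - biased.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy sentence "dogs chase cats": the verb (index 1) heads both arguments.
scores = np.zeros((3, 3))                  # uniform logits before masking
attn = structural_attention(scores, arcs=[(1, 0), (1, 2)])
```

With uniform logits, each token's attention is then spread evenly over its structural neighbours (e.g. "dogs" splits its attention between itself and "chase", and places essentially no mass on "cats").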
Our approach, illustrated in Figure 1, allows for quantifying the brain alignment of the structurally-biased NLP models in comparison to the base models, relating any difference to new information about linguistic structure learned by the models that is also potentially relevant to language comprehension in the brain. More specifically, in this paper, we: (a) employ a fine-tuning method utilising structurally guided attention for injecting structural bias into language model (LM) representations; (b) evaluate the alignment of the resulting models with brain recordings from two fMRI datasets; (c) further evaluate the LMs on a range of targeted syntactic probing tasks and a semantic tagging task, which allow us to uncover fine-grained information about their structure-sensitive linguistic capabilities; and (d) present an analysis of various linguistic factors that may lead to improved or deteriorated brain alignment.

2. BACKGROUND: BRAIN ACTIVITY AND NLP

Mitchell et al. (2008) first showed that there is a relationship between the co-occurrence patterns of words in text and brain activation for processing the semantics of words. Specifically, they showed that a computational model trained on co-occurrence patterns for a few verbs was able to predict fMRI activations for novel nouns. Since this work was introduced, many studies have attempted to isolate other features that enable the prediction and interpretation of brain activity (Frank et al., 2015; Brennan et al., 2016; Lopopolo et al., 2017; Anderson et al., 2017; Pereira et al., 2018; Wang et al., 2020). Gauthier & Ivanova (2018), however, emphasize that directly optimizing for the decoding of neural representations is limiting, as it does not allow for uncovering the mechanisms that underlie these representations. The authors suggest that, in order to better understand linguistic processing in the brain, we should also aim to train models that optimize for a specific linguistic task and explicitly test these against brain activity.
Following this line of work, Toneva & Wehbe (2019) present experiments both predicting brain activity and evaluating representations on a set of linguistic tasks. They first show that using uniform attention in early layers of BERT (Devlin et al., 2018) instead of pretrained attention leads to better prediction of brain activity. They then use the representations of this altered model to make predictions on a range of syntactic probing tasks, which isolate different syntactic phenomena (Marvin & Linzen, 2019), finding improvements over the pretrained BERT attention. Gauthier & Levy (2019) present a series of experiments in which they fine-tune BERT on a variety of tasks, including language modeling as well as some custom tasks such as scrambled language modeling and part-of-speech language modeling. They then perform brain decoding, where a linear mapping is learned from fMRI recordings to the fine-tuned BERT model activations. They find that the best mapping is obtained with the scrambled language modeling fine-tuning. Further analysis using a structural probe method
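Brain decoding of the kind described above typically amounts to a regularized linear regression from voxel activations to model activations. The following is a generic closed-form ridge-regression sketch on synthetic data, under our own assumed shapes and regularization strength; it is not the exact procedure of Gauthier & Levy (2019).

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression mapping fMRI voxel patterns X (n, v)
    to model activations Y (n, d): W = (X^T X + lam * I)^{-1} X^T Y."""
    v = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(v), X.T @ Y)

# Synthetic example: 100 "scans" of 20 voxels decoded into 8-dim activations.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))           # simulated fMRI recordings
W_true = rng.standard_normal((20, 8))        # ground-truth linear mapping
Y = X @ W_true + 0.01 * rng.standard_normal((100, 8))  # noisy "activations"

W = fit_ridge(X, Y, lam=0.1)                 # learned decoding weights
pred = X @ W                                 # decoded model activations
```

In practice the quality of such a mapping is assessed on held-out data (e.g. via cross-validated correlation or ranking accuracy), and comparing this quality across models is what yields the alignment scores discussed above.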

Figure 1: Overview of our approach. We use BERT as a baseline and inject structural bias in two ways. Through a brain decoding task, we then compare the alignment of the (sentence and word) representations of our baseline and our altered models with brain activations.

