A MULTI-GRAINED SELF-INTERPRETABLE SYMBOLIC-NEURAL MODEL FOR SINGLE/MULTI-LABELED TEXT CLASSIFICATION

Abstract

Deep neural networks based on layer-stacking architectures have historically suffered from poor inherent interpretability. Symbolic probabilistic models, by contrast, are clearly interpretable, but how to combine them with neural networks to enhance performance remains to be explored. In this paper, we try to marry these two systems for text classification via a structured language model. We propose a Symbolic-Neural model that can learn to explicitly predict class labels of text spans from a constituency tree without requiring any access to span-level gold labels. As the structured language model learns to predict constituency trees in a self-supervised manner, only raw texts and sentence-level labels are required as training data, which makes our approach essentially a general constituent-level self-interpretable classification model. Our experiments demonstrate that our approach achieves good prediction accuracy on downstream tasks, while the predicted span labels are consistent with human rationales to a certain degree.

1. INTRODUCTION

Lack of interpretability is an intrinsic problem of deep neural networks based on layer-stacking for text classification. Many methods have been proposed to provide post-hoc explanations for neural networks (Lipton, 2018; Lundberg & Lee, 2017; Sundararajan et al., 2017). However, these methods have two drawbacks. First, they provide only word-level attribution, with no higher-level attribution over phrases and clauses. Take sentiment analysis as an example: in addition to recognizing the sentiment of sentences, an ideal interpretable model should be able to identify sentiment and polarity reversal at the level of words, phrases, and clauses. Second, as argued by Rudin (2019), models should be inherently interpretable rather than explained by a post-hoc model.

A widely accepted property of natural languages is that "the meaning of a whole is a function of the meanings of the parts and of the way they are syntactically combined" (Partee, 1995). Compared with the sequential outputs of layer-stacked model architectures, syntactic tree structures naturally capture features at various levels, because each node in a tree represents a constituent span. This characteristic motivates us to ask whether the representations of these internal nodes could be leveraged to design an inherently constituent-level interpretable model. One challenge faced by this idea is that traditional syntactic parsers require supervised training and have degraded performance on out-of-domain data. Fortunately, with the development of structured language models (Tu et al., 2013; Maillard et al., 2017; Choi et al., 2018; Kim et al., 2019), we are now able to learn hierarchical syntactic structures in an unsupervised manner from any raw text. In this paper, we propose a general self-interpretable text classification model that can learn to predict span-level labels without supervision, as shown in Figure 1.
Specifically, we propose a novel label extraction framework based on a simple inductive bias for inference. During training, we maximize the summed probability of all potential trees whose extracted labels are consistent with the gold label set, via dynamic programming with linear complexity. By using a structured language model as the backbone, we are able to leverage the internal representations of constituent spans as symbolic interfaces, on which we build the transition functions of the dynamic programming algorithm.

The main contribution of this work is the Symbolic-Neural model, a simple but general model architecture for text classification with three advantages:

1. It offers both competitive prediction accuracy and self-interpretability, with rationales explicitly reflected in the label probabilities of each constituent.

2. It can learn to predict span-level labels without requiring any access to span-level gold labels.

3. It handles both single-label and multi-label text classification tasks in a unified way, instead of transforming the latter into binary classification problems (Read et al., 2011) as in conventional methods.

To the best of our knowledge, we are the first to propose a general constituent-level self-interpretable classification model with good downstream-task performance. Our experiments show that the span-level attribution is consistent with human rationales to a certain extent. We argue that such characteristics of our model could be valuable in various application scenarios such as data mining, NLU systems, and prediction explanation, some of which we discuss in our experiments.

2.1. ESSENTIAL PROPERTIES OF STRUCTURED LANGUAGE MODELS

Structured language models combine the powerful representations of neural networks with syntactic structures. Although many structured language models have been proposed (Kim et al., 2019; Drozdov et al., 2019; Shen et al., 2021), three prerequisites must be met before a model can be selected as the backbone of our method. First, it should be able to learn reasonable syntactic structures in an unsupervised manner. Second, it should compute an intermediate representation for each constituency node. Third, it should have a pretraining mechanism to improve representation quality. Since Fast-R2D2 (Hu et al., 2022; 2021) satisfies all of the above conditions and also has good inference speed, we choose Fast-R2D2 as our backbone.

2.2. FAST-R2D2

Overall, Fast-R2D2 is a type of structured language model that takes raw text as input and outputs the corresponding binary parse tree along with node representations, as shown in Figure 3(a). The representation e_{i,j} of the text span from the i-th to the j-th word is computed recursively from its child node representations via a shared composition function, i.e., e_{i,j} = f(e_{i,k}, e_{k+1,j}), where k is the split point given by the parser and f(·) is an n-layered Transformer encoder. When i = j, e_{i,j} is initialized as the embedding of the corresponding input token. Note that the parser is trained in a self-supervised manner, so no human-annotated parse trees are required.
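The bottom-up recursion e_{i,j} = f(e_{i,k}, e_{k+1,j}) can be sketched as follows. This is a minimal illustration, not Fast-R2D2 itself: a single linear map stands in for the n-layered Transformer composition function, and the split points and token embeddings are toy stand-ins for the parser's output.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # embedding dimension

# Hypothetical stand-in for the shared composition function f(.); the paper
# uses an n-layered Transformer encoder over the two child representations.
W = rng.standard_normal((D, 2 * D)) / np.sqrt(2 * D)

def compose(left, right):
    """e_{i,j} = f(e_{i,k}, e_{k+1,j}) -- simplified composition."""
    return np.tanh(W @ np.concatenate([left, right]))

def encode_span(i, j, splits, emb):
    """Recursively build the representation of span (i, j).

    splits[(i, j)] is the split point k chosen by the parser;
    leaves (i == j) are initialized with the token embedding.
    """
    if i == j:
        return emb[i]
    k = splits[(i, j)]
    left = encode_span(i, k, splits, emb)
    right = encode_span(k + 1, j, splits, emb)
    return compose(left, right)

# Example: a 4-token sentence with a fixed, parser-given binary tree
# ((0 1) (2 3)).
emb = rng.standard_normal((4, D))
splits = {(0, 3): 1, (0, 1): 0, (2, 3): 2}
root = encode_span(0, 3, splits, emb)
print(root.shape)  # (8,)
```

Every internal call of `encode_span` yields the representation of one constituent span, which is exactly what the later sections reuse for span-level label prediction.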

3. SYMBOLIC-NEURAL MODEL

3.1. MODEL

There are two basic components in the Symbolic-Neural model: 1. a structured LM backbone, used to parse a sentence into a binary tree with node representations; 2. an MLP, used to estimate the label distribution from a node representation. For structured LMs that follow a bottom-up hierarchical encoding process (such as our default LM Fast-R2D2), context outside a span is invisible to the span, which may leave low-level short spans unable to predict correct labels due to a lack of information. We therefore introduce an optional module that allows information to flow through the parse tree from top to bottom. The overall idea is to construct a top-down process that fuses information from both inside and outside each span. For a given span (i, j), we denote its top-down representation as e'_{i,j}. We use a Transformer as the top-down encoder function f'. The top-down encoding process starts from the root and functions recursively on the child nodes. For the root node, we have [·, e'_{1,n}] = f'([e_root, e_{1,n}]).
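A minimal sketch of the top-down pass and the MLP label head, under stated assumptions: the text above only spells out the root case, so the child recursion below (fusing a parent's top-down state with each child's bottom-up state) is one plausible form chosen for illustration, linear maps stand in for the Transformer encoder f' and the MLP, and the label set and tree are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 8  # representation dimension

# Stand-ins for the top-down encoder f'(.) and the label head; a linear map
# and a softmax layer keep the sketch self-contained and runnable.
W_td = rng.standard_normal((D, 2 * D)) / np.sqrt(2 * D)
W_cls = rng.standard_normal((3, D)) / np.sqrt(D)  # 3 hypothetical labels

def top_down(parent_td, bottom_up):
    """Fuse a parent's top-down state with a span's bottom-up state.

    Assumed recursion; the root case is [., e'_{1,n}] = f'([e_root, e_{1,n}]).
    """
    return np.tanh(W_td @ np.concatenate([parent_td, bottom_up]))

def label_dist(e):
    """MLP head estimating a label distribution from a node representation."""
    z = W_cls @ e
    z = np.exp(z - z.max())
    return z / z.sum()

# Toy binary tree for a 4-token sentence: bottom-up representations per span
# and the parser's split point per internal span.
spans = [(0, 3), (0, 1), (2, 3), (0, 0), (1, 1), (2, 2), (3, 3)]
bu = {s: rng.standard_normal(D) for s in spans}
children = {(0, 3): 1, (0, 1): 0, (2, 3): 2}

e_root = rng.standard_normal(D)  # learned root embedding
td = {(0, 3): top_down(e_root, bu[(0, 3)])}

# Propagate top-down states from the root to all descendants.
stack = [(0, 3)]
while stack:
    i, j = stack.pop()
    if (i, j) not in children:
        continue  # leaf: nothing to propagate further
    k = children[(i, j)]
    for child in [(i, k), (k + 1, j)]:
        td[child] = top_down(td[(i, j)], bu[child])
        stack.append(child)

# Every span, including short low-level ones, now carries context from both
# inside and outside itself and can be fed to the label head.
dist = label_dist(td[(0, 1)])
print(dist.shape)  # (3,)
```

The design point being illustrated: without this pass, a leaf or short span would see only its own subtree, while after it, its representation also reflects the rest of the sentence.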



Figure 1: Our model can learn to predict span-level labels without access to span-level gold labels during training. In examples (a) and (b), only raw texts and sentence-level gold labels {request address, navigate} and {negative} are given.

