SAGE: SEMANTIC-AWARE GLOBAL EXPLANATIONS FOR NAMED ENTITY RECOGNITION

Abstract

In the last decades, deep learning approaches have achieved impressive results in many research fields, such as Computer Vision and Natural Language Processing (NLP). NLP in particular has greatly benefited from unsupervised methods that make it possible to learn distributed representations of language. In the race for better performance, Language Models have nowadays reached hundreds of billions of parameters. Despite these remarkable results, deep models are still far from being fully exploited in real-world applications. Indeed, these approaches are black boxes: they are neither interpretable by design nor explainable, which is often crucial for making decisions in business. Several task-agnostic methods have been proposed in the literature to explain models' decisions. Most techniques rely on the "local" assumption, i.e. explanations are produced example-wise. In this paper, instead, we present a post-hoc method to produce highly interpretable global rules to explain NLP classifiers. Rules are extracted with a data mining approach on a semantically enriched input representation, instead of using words/wordpieces alone. Semantic information yields more abstract and general rules that are both more explanatory and less complex, while also better reflecting the model behaviour. In the experiments we focus on Named Entity Recognition, an NLP task where explainability is under-investigated. We explain the predictions of BERT NER classifiers trained on two popular benchmarks, CoNLL03 and OntoNotes, and compare our model against LIME (Ribeiro et al., 2016) and Decision Trees.

1. INTRODUCTION

In recent years, Artificial Intelligence (AI) algorithms, especially deep learning models, have emerged in many applications, reporting state-of-the-art performance across fields. In NLP, for example, Large Language Models (LLMs) based on huge deep neural networks have achieved impressive results on many linguistic tasks. However, despite these remarkable results, deep approaches are still far from being fully exploited in real-world applications. One major issue is the lack of interpretability and control over the models' predictions. This is an important requirement for many industrial applications, especially in domains like medicine, defense, finance and law, where it is crucial to understand the decisions and build trust in the algorithms. The increasing need to address the problem of interpretability and improve model transparency has made "Explainable Artificial Intelligence" a very popular research area in Computer Science. Explainable AI (XAI), also called Interpretable AI or Explainable Machine Learning (XML) (Guidotti et al., 2021), is a broad area of research that studies and proposes AI approaches in which humans can understand the causes underlying the decisions and predictions made by the machine (Vilone & Longo, 2021b).

AI algorithms can usually be grouped into two families (Vilone & Longo, 2021a): (a) white-box models, which include algorithms whose interpretation is given by design, and (b) black-box approaches where, on the other hand, the decision-making process is "opaque" and hard to understand. White-box models such as linear regression, probabilistic classifiers or decision trees are significantly easier to explain and interpret but often provide lower predictive capacity and are not always capable of modeling the inherent complexity of the task. In black-box models, on the other hand, very little knowledge is available on how the input variables influence the final decision.
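To make the white-box notion concrete, the following minimal sketch (using scikit-learn, not the method proposed in this paper; the toy data is purely illustrative) shows how a decision tree's learned decision process can be printed directly as human-readable rules, a readout that has no comparably direct analogue for a deep neural network:

```python
# Sketch of white-box interpretability: a fitted decision tree can be
# rendered as nested if/else rules that a human can inspect directly.
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: two binary features, binary label (hypothetical values).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# export_text prints the tree's decision path as plain-text rules,
# e.g. thresholds on "f0"/"f1" leading to a predicted class.
rules = export_text(clf, feature_names=["f0", "f1"])
print(rules)
```

Each line of the printed output is a split condition or a leaf class, so the full decision logic is visible at a glance; a black-box classifier offers no such exhaustive, faithful summary of its behaviour.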
The relationship between input and output is often the result of a complex composition of

