ASPECT-BASED SENTIMENT CLASSIFICATION VIA REINFORCEMENT LEARNING

Abstract

Aspect-based sentiment classification aims to predict the sentiment polarities of one or multiple aspects in texts. As texts always contain a large proportion of task-irrelevant words, accurate alignment between aspects and their sentiment descriptions is the most crucial and challenging step. State-of-the-art approaches are mainly based on word-level attention learned from recurrent neural network variants (e.g., LSTM) or graph neural networks. Viewed another way, these methods essentially weight and aggregate all possible alignments. However, this mechanism relies heavily on large-scale supervised training: without enough labels, it can easily overfit and generalize poorly. To address this challenge, we propose SentRL, a reinforcement learning-based framework for aspect-based sentiment classification. In this framework, input texts are transformed into their dependency graphs. Then, an agent is deployed to walk on the graphs, explore paths from target aspect nodes to their potential sentiment regions, and differentiate the effectiveness of different paths. By limiting the agent's exploration budget, our method encourages the agent to skip task-irrelevant information and focus on the most effective paths for alignment. Our method considerably reduces the impact of task-irrelevant words and improves generalization performance. Compared with competitive baseline methods, our approach achieves the highest performance on public benchmark datasets with up to 3.7% improvement.

1. INTRODUCTION

The goal of aspect-based (also known as aspect-level) sentiment classification is to predict the sentiment polarities of individual aspects. As shown in Figure 1, given the sentence "I like this computer but do not like the screen", the sentiment of the aspect "computer" is positive because of "like". Meanwhile, the sentiment of the aspect "screen" is negative because of "do not like". Aspect-based sentiment classification is challenging, and its core problem is to correctly align aspects with their sentiment descriptions. State-of-the-art methods rely on supervision signals to automatically learn such alignment. By leveraging textual context and word-level attention learned from deep models (Vo & Zhang, 2015; Dong et al., 2014; Bahdanau et al., 2014; Luong et al., 2015; Xu et al., 2015; Wang et al., 2016; Tang et al., 2016b; Ma et al., 2017a; He et al., 2018; Zhang et al., 2018; 2019; Gao et al., 2019; Tang et al., 2020), existing methods have made great progress on discovering aspect-specific sentiment statements. However, existing methods can suffer from serious overfitting, as natural language inevitably includes a large proportion of task-irrelevant text, which is noise from the perspective of machine learning. Ideally, with a sufficient amount of training labels, existing methods could effectively contain the negative impact of such task-irrelevant information. In practice, because of the high variance in language expression, it is costly to collect a large number of task-specific labels, and it is difficult to guarantee that the labels are sufficient. With limited labels, existing approaches can easily include task-irrelevant information in their decision processes, overfit the training data, and end up with inferior generalization performance on unseen data. To effectively reduce the impact of task-irrelevant information, we propose SentRL, a reinforcement learning-based framework for aspect-level sentiment classification.
In our approach, input texts are first transformed into graph objects (e.g., dependency graphs (Covington, 2001)), where nodes are words and edges indicate syntactic dependencies/relations between them. Next, we deploy a policy-based agent to discover aspect-related sentiment descriptions in the graphs. This agent is equipped with a language understanding module so that it can update exploration states and make sentiment decisions for individual aspects. Unlike existing methods that aggregate potential sentiment information from all possible textual contexts or words, our agent strives to leverage the most relevant exploration paths under a limited budget. This strategy not only requires the agent to focus on the most effective paths but also encourages the agent to skip task-irrelevant regions. The policy network and the language understanding module are jointly trained using standard back-propagation. On public benchmark datasets, we observe that our method achieves up to 3.6% improvement over competitive state-of-the-art methods.

The main contributions of our work are listed below.

• A novel reinforcement learning framework for aspect-based sentiment classification is proposed. It accurately pinpoints the most effective path between sentiment descriptions and the target aspects, and effectively avoids the impact of task-irrelevant regions.
• A policy network is developed to provide an agent with exploration guidance. This network iteratively provides suggestions on next-hop selection. In particular, the framework is permutation invariant and guarantees the consistency and reliability of the model.
• A language understanding module is developed to help an agent "remember" its exploration history and make the final sentiment prediction.
• Extensive experiments are conducted on representative benchmark datasets. The results demonstrate the effectiveness, efficiency, and robustness of our approach.
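
To make the idea of exploring a dependency graph under a limited budget concrete, the following is a minimal sketch and not the SentRL implementation. The dependency edges are written by hand for the example sentence (an assumed parse, not parser output), the names `build_graph`, `uniform_policy`, and `budgeted_walk` are hypothetical, and a uniform random choice with no revisiting stands in for the learned policy network, which is an assumption of this sketch.

```python
import random

# Assumed, hand-written dependency edges (head, dependent) for
# "I like this computer but do not like the screen".
# Nodes are token indices so the two occurrences of "like" stay distinct.
TOKENS = ["I", "like", "this", "computer", "but", "do", "not", "like", "the", "screen"]
EDGES = [(1, 0), (1, 3), (3, 2), (1, 4), (1, 7), (7, 5), (7, 6), (7, 9), (9, 8)]


def build_graph(edges):
    """Return an undirected adjacency list: node index -> sorted neighbor list."""
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)
    return {node: sorted(nbrs) for node, nbrs in graph.items()}


def uniform_policy(graph, path, rng):
    """Placeholder for a learned policy: pick an unvisited neighbor uniformly."""
    candidates = [n for n in graph[path[-1]] if n not in path]
    return rng.choice(candidates) if candidates else None


def budgeted_walk(graph, aspect, budget, policy, rng):
    """Walk from the aspect node for at most `budget` hops; return the node path."""
    path = [aspect]
    for _ in range(budget):
        nxt = policy(graph, path, rng)
        if nxt is None:  # dead end: stop before the budget is used up
            break
        path.append(nxt)
    return path


graph = build_graph(EDGES)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
path = budgeted_walk(graph, aspect=9, budget=2, policy=uniform_policy, rng=rng)
print([TOKENS[i] for i in path])
```

Starting from "screen" with a budget of two hops, the walk can reach "like" and then "not", but it cannot wander into the clause about "computer" and back. A trained policy would replace `uniform_policy` so that informative hops are preferred over arbitrary ones.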

2.1. ASPECT-BASED SENTIMENT CLASSIFICATIONS

Aspect-based sentiment classification aims to identify the sentiment polarities of one or more aspects in given texts (Thet et al., 2010). Aspects can be either concrete objects (e.g., computer and car) or conceptual objects (e.g., service and atmosphere). There are usually three sentiment categories (positive, neutral, and negative), although more sophisticated categories could be explored. Conventional approaches (Kiritchenko et al., 2014) treat input texts as word sequences and deploy separate feature extraction and classification modules. Deep learning-based sentiment analysis methods (Tang et al., 2016a) take contextual information about word order into consideration by using LSTM (Hochreiter & Schmidhuber, 1997; Liu et al., 2018). Attention-based approaches (Tang et al., 2016c; Bahdanau et al., 2014; Luong et al., 2015; Xu et al., 2015; Wang et al., 2016; Tang et al., 2016b; Ma et al., 2017a; Huang & Carley, 2019; Ma et al., 2017b; Huang et al., 2018; Li et al., 2018) have been proposed to improve the effectiveness of contextual feature extraction. While such approaches utilize sequential models and attention mechanisms to learn features from word sequences, they can require a large amount of training labels to generalize well to natural language expressions with non-trivial variance (e.g., long sentences in which the majority of contextual words are irrelevant).



Figure 1: Dependency graph of a given sentence. Blue words are the aspects. Green and red indicate "positive" and "negative" sentiment, respectively. The dependency graph effectively reduces the distances between aspects and sentiment descriptions and avoids confusion from polysemous words (e.g., "like"). In our approach, an agent is deployed to walk from the aspect word to the sentiment regions, which avoids task-irrelevant information and achieves more effective and efficient performance.
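
As a rough check of the distance claim in the caption, the sketch below compares token-position distance with hop distance in a dependency graph for the example sentence. The edge list is a hand-written, assumed parse rather than the output of any particular parser, and `hop_distance` is a hypothetical helper implemented as a plain breadth-first search.

```python
from collections import deque

# Assumed, hand-written dependency edges (head, dependent) for
# "I like this computer but do not like the screen", indexed by token position.
TOKENS = ["I", "like", "this", "computer", "but", "do", "not", "like", "the", "screen"]
EDGES = [(1, 0), (1, 3), (3, 2), (1, 4), (1, 7), (7, 5), (7, 6), (7, 9), (9, 8)]


def hop_distance(edges, source, target):
    """Breadth-first search over undirected edges; returns hop count or None."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # target not reachable from source


# (aspect index, sentiment-cue index) pairs to compare
for aspect, cue in [(3, 1), (9, 7), (9, 6)]:
    print(
        f"{TOKENS[aspect]} -> {TOKENS[cue]}: "
        f"{abs(aspect - cue)} tokens apart, "
        f"{hop_distance(EDGES, aspect, cue)} hops in the graph"
    )
```

Under this assumed parse, each aspect is directly attached to its own "like", so the agent needs only one hop to reach the relevant sentiment word and two hops to reach the negation.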

