SARNET: SARCASM VS TRUE-HATE DETECTION NETWORK

Abstract

Hate speech detection classifiers sometimes miss the context of a sentence and incorrectly flag a sarcastic tweet as hate. To tackle this problem, we propose SarNet, which emphasises the context of a tweet. SarNet is a two-fold deep-learning-based model that follows a quasi-ternary labelling strategy and contextually classifies a tweet as hate, sarcastic or neither. The first module of SarNet is an ANN-BiLSTM-based Pyramid Network that calculates the hate and sarcasm probabilities of a sentence. The second module is the Nash Equalizer, which stems from game theory and the prisoner's dilemma: it treats hate and sarcasm as two prisoners and constructs a payoff matrix to calculate the true hate of the tweet. True hate is the hate content of a tweet excluding its sarcastic part. It therefore gives a truer estimate of the hate content in a tweet and reduces the number of sarcastic tweets falsely flagged as hate. Our proposed model is trained on state-of-the-art hate speech and sarcasm datasets in the English language. The precision, recall and F1 score of our proposed model are 0.93, 0.84 and 0.88, respectively. In comparison with state-of-the-art architectures, SarNet performed better by a significant margin.
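As a quick consistency check on the reported metrics, the sketch below recomputes F1 from the stated precision and recall. It assumes that the three numbers refer to a single precision-recall pair and that F1 is the usual harmonic mean; the abstract does not say whether the scores are per-class or averaged, and under macro-averaging the relation need not hold exactly. The function name is illustrative only.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        # Convention: F1 is 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
reported_precision = 0.93
reported_recall = 0.84

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 2))  # 0.88
```

Under these assumptions the recomputed value agrees with the reported F1 score of 0.88.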

1. INTRODUCTION

Social media has the potential to influence the opinion of the masses. People from all around the world interact and share their perspectives, which has facilitated the rapid exchange of ideas. Unfortunately, social media platforms are also being used to disseminate hate speech. Hate speech is described as aggressive or threatening speech that shows prejudice based on ethnicity, religion, sexual orientation or other factors. This raises the need to deploy robust and efficient classifiers to regulate online content.

Users of social media regularly use sarcasm to convey their emotions in conversation. Sarcasm is an effective but not easily noticed means of expressing thoughts indirectly. It is described as an indirect way of expressing a viewpoint in which the written words do not reflect the intended meaning. Most hate speech detection algorithms (1; 2) have difficulty handling sarcastic statements, because sarcasm reverses the polarity of a seemingly positive or negative phrase. This attribute of sarcasm leads hate detection classifiers to falsely flag sentences as hate. Thus, it is crucial to reduce false positives by considering the sarcastic context of a sentence while predicting hate. Labelled datasets suffer from annotator bias and/or from the lack of annotations covering both hate and sarcasm (5). These limitations make it very difficult for deep learning models to predict the actual hate content of a tweet with high precision. Our proposed SarNet model overcomes these problems by employing a semi-supervised learning approach to predict the degree of hate in a sentence, which we define as true-hate.

Researchers have presented multiple approaches for tackling hate speech in the past decade. Kapil and Ekbal (14) proposed a deep neural network-based multi-task learning approach for hate speech detection. Corazza et al. (4) proposed a neural network classifier for multilingual online hate speech detection in three languages and investigated the impact of each feature on the respective outcomes. While it is critical to identify hate on social media promptly, it is equally important to avoid false positives. Arango et al. (5) point to methodological flaws as well as significant dataset biases in work on hate speech. They stated that many state-of-the-art performance claims had become vastly

