ON THE ROBUSTNESS OF SENTIMENT ANALYSIS FOR STOCK PRICE FORECASTING

Anonymous authors
Paper under double-blind review

Abstract

Machine learning (ML) models are known to be vulnerable to attacks at both training and test time. Despite the extensive literature on adversarial ML, prior efforts focus primarily on applications of computer vision to object recognition or of sentiment analysis to movie reviews. In these settings, the incentives for adversaries to manipulate the model's predictions are often unclear, and attacks require extensive control over direct inputs to the model. This makes it difficult to evaluate how severe the exposed vulnerabilities are for systems that deploy ML with few provenance guarantees on their input data. In this paper, we study adversarial ML in stock price forecasting, where adversarial incentives are clear and can be quantified experimentally through a simulated portfolio. We replicate an industry-standard pipeline that performs sentiment analysis of Twitter data to forecast trends in stock prices. We show that an adversary can exploit the lack of provenance to use tweets to indirectly manipulate the model's perceived sentiment about a target company and, in turn, force the model to forecast prices erroneously. Our attack is mounted at test time and does not modify the training data. In light of past market anomalies, we conclude with a series of recommendations for the use of machine learning as an input signal to trading algorithms.

1. INTRODUCTION

Research on the vulnerability of machine learning (ML) to adversarial examples (Biggio et al., 2013; Szegedy et al., 2013) has focused, with few exceptions (Kurakin et al., 2016; Brown et al., 2017), on adversaries with immediate control over the inputs to an ML model. Yet, ML systems are often applied to large corpora of data collected from sources only partially under the control of adversaries. Recent advances in language modelling (Devlin et al., 2019; Brown et al., 2020) illustrate this well: they rely on training large architectures on unstructured corpora of text crawled from the Internet. This raises a natural question: when the provenance of the training or test inputs to an ML system is ill-defined, does this advantage model developers or adversaries? Here, by provenance we refer to a detailed history of the flow of information into a computer system (Muniswamy-Reddy et al.).

We study such an ML system for stock price forecasting. In this application, ML predictions can either serve as inputs to algorithmic trading or assist human traders. We choose stock price forecasting because it involves several structured applications of ML, including sentiment analysis over a spectrum of public information sources (e.g., news, Twitter, etc.) with few provenance guarantees. There is also a long history of leveraging knowledge inaccessible to other market participants to gain an edge in predicting the prices of securities: Thales famously used his knowledge of astronomy to corner the market in olive-oil presses and generate a profit.

We first reproduce an ML pipeline for stock price prediction, inspired by practices common in the industry. We note that choosing the right time scale is of paramount importance: ML is better suited to low-frequency intra-day and weekly trading than to high-frequency trading, because the latter requires decision speeds far faster than ML hardware accelerators can achieve.
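To make the pipeline concrete, the sketch below illustrates how per-tweet sentiment scores can be aggregated into a daily price-trend signal. It is a minimal illustration only: the toy polarity lexicon, the example tweets, and the `daily_signal` thresholding are our own hypothetical stand-ins, not the pipeline studied in this paper, which uses a trained sentiment model.

```python
# Illustrative sentiment-to-signal aggregation (hypothetical lexicon and
# thresholds; real pipelines use a trained sentiment classifier).
from statistics import mean

# Toy polarity lexicon mapping words to sentiment scores.
LEXICON = {"beat": 1.0, "surge": 1.0, "strong": 0.5,
           "miss": -1.0, "plunge": -1.0, "weak": -0.5}

def tweet_sentiment(text: str) -> float:
    """Average polarity of lexicon words in a tweet; 0.0 if none match."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return mean(hits) if hits else 0.0

def daily_signal(tweets: list[str], threshold: float = 0.2) -> int:
    """Aggregate tweet sentiment into a trend forecast:
    +1 (price expected up), -1 (down), 0 (no position)."""
    if not tweets:
        return 0
    score = mean(tweet_sentiment(t) for t in tweets)
    if score > threshold:
        return 1
    if score < -threshold:
        return -1
    return 0

tweets = ["$ACME earnings beat expectations, strong quarter",
          "analysts expect $ACME to surge"]
print(daily_signal(tweets))  # positive aggregate sentiment -> 1
```

Because the signal is an unweighted average over tweets with no provenance checks, an adversary who can inject tweets mentioning the target company shifts the aggregate score directly, which is precisely the attack surface this paper studies.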
Although there has been prior work on attacking ML for high-frequency trading (Goldblum et al., 2020), its experimental setting is seven orders of magnitude slower than the timestamp resolution of NASDAQ feeds (NASDAQ, 2020) used by high-frequency trading firms. In contrast, ML in low-frequency trading has attracted greater practical interest from the industry, with two major financial data vendors vastly expanding their sentiment data API offerings in the past decade (Bloomberg, 2017; Reuters, 2014). This is to serve

