RADIAL SPIKE AND SLAB BAYESIAN NEURAL NETWORKS FOR SPARSE DATA IN RANSOMWARE ATTACKS

Abstract

Paper under double-blind review

Ransomware attacks are increasing at an alarming rate, leading to large financial losses, unrecoverable encrypted data, data leakage, and privacy concerns. Prompt detection of ransomware attacks, particularly during the encryption stage, is required to minimize further damage. However, the frequency and structure of the observed ransomware attack data make this task difficult to accomplish in practice. Ransomware attack data take the form of temporal, high-dimensional sparse signals, with limited records and very imbalanced classes. While traditional deep learning models have achieved state-of-the-art results in a wide variety of domains, Bayesian Neural Networks, a class of probabilistic models, are better suited to these characteristics of ransomware data. These models combine ideas from Bayesian statistics with the rich expressive power of neural networks. In this paper, we propose the Radial Spike and Slab Bayesian Neural Network, a new type of Bayesian Neural Network that includes a new form of the approximate posterior distribution. The model scales well to large architectures and recovers the sparse structure of target functions. We provide a theoretical justification for using this type of distribution, as well as a computationally efficient method to perform variational inference. We demonstrate the performance of our model on a real dataset of ransomware attacks and show improvement over a large number of baselines, including state-of-the-art models such as Neural ODEs (ordinary differential equations). In addition, we propose to represent low-level events as MITRE ATT&CK tactics, techniques, and procedures (TTPs), which allows the model to better generalize to unseen ransomware attacks.

1. INTRODUCTION

Ransomware attacks are increasing rapidly and causing significant losses to governments, corporations, non-governmental organizations, and individuals. The losses may include financial costs due to ransoms paid to decrypt assets, unrecoverable files when the ransom is not paid or the attacker fails to provide the decryption key, privacy violations and intellectual property theft when assets are exfiltrated, and even significant injury when ransomware impairs health care devices or patient records in hospitals. Timely detection of ransomware incidents is therefore necessary to minimize the number of assets that are encrypted or exfiltrated (Urooj et al., 2021). To improve the ransomware response, this work proposes a new Bayesian Neural Network model that offers improved detection rates for organizations which employ analysts to protect their assets and networks.

The problem is usually framed as a binary detection task, where the two classes are ransomware and non-ransomware. Traditional statistical and machine learning methods have been proposed to detect security threats in general and, in some cases, ransomware specifically. From the statistical perspective, a common approach is the application of Bayesian Networks (Perusquía et al., 2020; Oyen et al., 2016; Shin et al., 2015), whose main goal is to model the relationship between the observed signal and the class of the attack as a graphical model. From the machine learning perspective, a range of models have been used to detect ransomware (Alhawi et al., 2018; Poudyal et al., 2018; Zhang et al., 2019; Larsen et al., 2021), such as Naive Bayes, Gradient Boosting, and Random Forests.

Bottleneck. Exploiting the rich expressive power of traditional deep learning models usually requires access to a large number of training records in order to obtain robust results that generalize. Unfortunately, the frequency and structure of commonly observed data corresponding to ransomware attacks make this difficult to accomplish. In particular, ransomware attack data can be represented as temporal, high-dimensional sparse signals, with a limited number of records and very imbalanced classes. In our data, 1% of attacks are ransomware and 99% are non-ransomware.
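
To make the effect of this imbalance concrete, the following minimal sketch evaluates a trivial predictor that always outputs the majority class on synthetic labels with the 1% / 99% split quoted above. The record count of 1,000, the label encoding (1 for ransomware, 0 for non-ransomware), and the helper name `confusion_counts` are assumptions chosen purely for illustration. The sketch does not use our dataset and is not part of the proposed model.

```python
# Illustrative sketch only: synthetic labels, not the dataset used in this paper.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = ransomware."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

# Hypothetical sample size; only the 1% / 99% ratio comes from the text.
n_records = 1000
n_ransomware = n_records // 100  # 1% positive class

y_true = [1] * n_ransomware + [0] * (n_records - n_ransomware)

# A predictor that ignores its input and always answers "non-ransomware".
y_majority = [0] * n_records

tp, fp, tn, fn = confusion_counts(y_true, y_majority)

majority_accuracy = (tp + tn) / n_records
# Recall on the ransomware class: fraction of true attacks that were flagged.
majority_recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0

print(f"accuracy = {majority_accuracy:.2f}")
print(f"ransomware recall = {majority_recall:.2f}")
```

Under these assumptions the majority-class predictor is correct on 990 of 1,000 records, yet it flags none of the 10 ransomware attacks. This is why accuracy alone is uninformative at this class ratio and why detection rates on the minority class matter.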

