A SIAMESE NEURAL NETWORK FOR BEHAVIORAL BIOMETRICS AUTHENTICATION

Abstract

The rise in popularity of web and mobile applications brings about a need for robust authentication systems. Although password authentication is the most popular authentication mechanism, it has several drawbacks. Behavioral Biometrics Authentication has emerged as a complementary, risk-based authentication approach that aims at profiling users based on their interaction with computers and smartphones. In this work we propose a novel Siamese Neural Network to perform few-shot verification of a user's behavior. We develop our approach to authenticate either human-computer or human-smartphone interaction. For computer interaction our approach learns from mouse and keyboard dynamics, while for smartphones it learns from holding patterns and touch patterns. We show that our approach achieves a few-shot classification accuracy of up to 99.8% and 90.8% for mobile and web interactions, respectively. We also test our approach on a database that contains over 100K different web interactions collected in the wild.

1. INTRODUCTION

Biometric authentication has emerged as a complement to traditional authentication systems. The main advantage of such systems is that they rely on user information that cannot easily be stolen or forged. The most active fields of biometric authentication in academia and industry are face and fingerprint authentication, with a recent increase in interest in behavioral biometrics. Behavioral Biometrics authentication refers to the use of human-device interaction features to grant access to a specific service. This interaction includes, but is not limited to, typing patterns, mouse dynamics, smartphone holding patterns, voice recognition, and gait recognition. Machine learning algorithms have been proposed to verify users' identities using behavioral biometric features. Regarding behavioral biometrics in web environments (human-computer interaction), most work has focused on Support Vector Machine and Random Forest classifiers to analyze mouse and keyboard interaction (Khan et al., 2018; Solano et al., 2020). Alternatively, some works have proposed to use the built-in sensors available in mobile devices (i.e., sensor information, touch interaction, etc.) for authentication purposes (Rauen et al., 2018; Rocha et al., 2019; Zhang et al., 2016; Amini et al., 2018; Abuhamad et al., 2020). However, previous works in behavioral biometrics usually have three main drawbacks: (1) they need long interactions (minutes) to accurately learn user behavior; (2) they require ad-hoc interaction challenges; or (3) they need a model per user to improve accuracy. In this paper, we present a Siamese One-Shot Neural Network (SOS-NN) which is able to assess a risk score after only one observation (i.e., the enrollment behavior) of a given user. To achieve this, we propose a Siamese Neural Network architecture that assesses whether two behaviors belong to the same user.
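The core Siamese idea can be sketched as follows: a single encoder with shared weights embeds both the enrollment behavior and the candidate behavior, and a distance in the latent space is mapped to a similarity score. The layer sizes, encoder, and distance-to-score mapping below are illustrative assumptions for exposition, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared encoder: a small MLP that maps a fixed-size
# behavioral feature vector (e.g., mouse/keystroke statistics) into a
# latent embedding. All layer sizes here are illustrative assumptions.
W1 = rng.normal(size=(32, 16)) * 0.1
W2 = rng.normal(size=(16, 8)) * 0.1

def encode(x):
    """Shared-weight embedding applied to both inputs of the Siamese pair."""
    h = np.maximum(x @ W1, 0.0)  # ReLU hidden layer
    return h @ W2                # latent embedding

def similarity_score(x_enroll, x_login):
    """Score in (0, 1]: higher means the two behaviors look more alike."""
    d = np.linalg.norm(encode(x_enroll) - encode(x_login))
    return 1.0 / (1.0 + d)       # monotone map from distance to (0, 1]
```

Because the weights are shared, the model never needs to be retrained for a new user; only a fresh enrollment embedding is required.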
We present a similar architecture for user verification in both web and mobile environments. In web environments, we create a set of features from raw mouse movements and keyboard strokes. For the mobile environment, our SOS-NN analyzes features created from touch interaction and motion sensors on the smartphone. In sum, the contributions of our work are: (1) an approach to user authentication using behavioral biometric information in an accurate few-shot learning fashion after only 5 seconds of user interaction; (2) a unified neural network architecture to authenticate a user's behavior in both mobile and web environments, achieving an accuracy of up to 99.8% and 90.8%, respectively; (3) a framework which is able to accurately authenticate users at large scale without requiring the model to be retrained for new users; (4) a systematic measurement study of the impact on SOS-NN of the authentication time-window length and the number n in the n-shot test; and (5) a comprehensive in-the-wild evaluation of our approach in web environments, which tests SOS-NN over thousands of users from real financial services.
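A minimal sketch of turning raw mouse events into a fixed-size feature vector is shown below. The paper does not enumerate its exact feature set, so the specific statistics (speed and pause summaries) and the function name are illustrative assumptions only.

```python
import numpy as np

def mouse_features(events):
    """Summarize raw mouse events [(t, x, y), ...] into fixed-size features.

    The concrete feature set here (speed and pause statistics) is an
    illustrative assumption; the paper's actual features may differ.
    """
    e = np.asarray(events, dtype=float)
    dt = np.diff(e[:, 0])                 # inter-event times
    dxy = np.diff(e[:, 1:3], axis=0)      # pointer displacements
    speed = np.linalg.norm(dxy, axis=1) / np.maximum(dt, 1e-6)
    return np.array([
        speed.mean(),   # average pointer speed
        speed.std(),    # speed variability
        dt.mean(),      # mean inter-event time
        dt.max(),       # longest pause
    ])

# Four events captured over 0.3 seconds of interaction.
events = [(0.00, 0, 0), (0.05, 3, 4), (0.10, 6, 8), (0.30, 6, 8)]
feats = mouse_features(events)
```

A vector like this, computed over a short time window, would be the input to the shared encoder of the Siamese pair.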

2. BACKGROUND

In the field of biometric-based authentication, many sources of information have been proposed. These can be physical (facial recognition, fingerprint scanning, retina scans, etc.) or behavioral patterns (signature verification, mouse dynamics, gait analysis, voice recognition, etc.). Behavioral biometrics has gained increased attention since reproducing the behavior of a legitimate user constitutes an unconventional challenge for attackers. Moreover, many behavioral biometric methods do not require specialized hardware, which eases scalable deployment. While substantial advances have been achieved, there is still a gap to close before such systems are widely adopted in practice. We define as 'practical' those methods that demand short periods of interaction per user (both for model training and for authentication) and simple architectures that ease deployment and maintenance. This paper proposes a novel approach, which complements traditional authentication in web and mobile environments, considering practical implementation characteristics and scalability constraints. Web Environment. Multiple previous studies have employed multimodal biometrics in desktop environments to identify users' behavior (Traore et al., 2012; Bailey et al., 2014; Fridman et al., 2015; Neha & Chatterjee, 2019; Solano et al., 2020). These studies propose to integrate, either at the feature or decision level, information from keyboard interaction, mouse dynamics, and other modalities. Most of these studies have evaluated classic ML classification models (e.g., SVM, Naive Bayes, Random Forest, and J48 algorithms). A few others (Jagadeesan & Hsiao, 2009; Khan et al., 2018) have explored the use of shallow neural networks. Remarkably, we found that Siamese neural architectures had not been studied before in this field. Therefore, in this study, we explore the effectiveness, generalization, and applicability of such architectures for behavioral biometrics.
Regarding user interaction, we highlight that in multiple real-world applications there is a practical limit to the length of the interaction and the amount of data that can be collected before deploying an authentication model, particularly when enrolling new users into the system. Therefore, we compared previous approaches on the amount of interaction required, per user, to train the model. Our approach is comparable to only a few studies (4), which require between 2 and approximately 30 minutes of user interaction to train the model. As an illustration, Khan et al. (2018) reported an accuracy of 97.3% using an SVM model per user; however, their approach would require recording at least 30 previous login attempts (≈15 minutes) to train each user's model. A more recent approach, proposed by Neha & Chatterjee (2019), achieves an accuracy of 95.6% after training an MLP for each user, but it requires 50 logins (≈25 minutes) for the training phase. From a scalability perspective, previous studies use authentication paradigms that involve one model per user or multiclass classification; these methods translate into large infrastructure, deployment, monitoring, and maintenance challenges. In contrast, our SOS-NN model generates a measure of similarity between two behaviors in a latent feature space. In this case, the question is not whether the sample belongs to a particular user, but rather whether the samples are similar enough to conclude that the user is the same. This one-model-for-all (OfA) paradigm facilitates deployment and avoids further training for every new user registered in the system. To the best of our knowledge, the SOS-NN model and the OfA paradigm have not been applied to desktop/web behavioral biometrics before. In Appendix A, we compare previous approaches in further detail with respect to the classification methods used, the authentication paradigm, and the user interaction required. Mobile Environment.
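The one-model-for-all decision under an n-shot test can be sketched as a comparison between one login embedding and the n enrollment embeddings of the claimed user, with a threshold on the mean latent distance. The function name and threshold value below are illustrative assumptions; the embeddings would come from the shared Siamese encoder.

```python
import numpy as np

def n_shot_verify(enroll_embeddings, login_embedding, threshold=1.0):
    """One-model-for-all decision under the n-shot test.

    Accepts when the mean latent distance between the login embedding and
    the n enrollment embeddings falls below a threshold. The threshold
    value here is an illustrative assumption, not the paper's setting.
    """
    d = np.linalg.norm(np.asarray(enroll_embeddings) - login_embedding, axis=1)
    return bool(d.mean() < threshold)

# A login identical to the enrollment samples is accepted;
# a distant one is rejected.
enroll = np.zeros((3, 8))                        # n = 3 enrollment embeddings
accept = n_shot_verify(enroll, np.zeros(8))      # same behavior
reject = n_shot_verify(enroll, np.full(8, 10.0)) # distant behavior
```

Note that enrolling a new user only adds rows to `enroll_embeddings`; no per-user model is trained, which is what makes the OfA paradigm scale.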
Behavioral biometrics for authentication has likewise been implemented in mobile environments. Such models complement traditional authentication by taking advantage of the multiple built-in sensors available in mobile devices, capturing user behavior through several modalities. Some modalities rely on mobile keyboard dynamics (Cilia & Inguanez, 2018); touchscreen interaction (Rauen et al., 2018; Rocha et al., 2019); or embedded motion-sensor data (Abuhamad et al., 2020) to authenticate users. In order to strengthen security, especially against ad-hoc adversarial attacks, multimodal authentication frameworks have been proposed by researchers (Stanciu et al., 2016; Sitová et al., 2015; Lamiche et al., 2019; Acien

