MODEL OBFUSCATION FOR SECURING DEPLOYED NEURAL NETWORKS

Abstract

More and more edge devices and mobile apps are leveraging deep learning (DL) capabilities. Deploying such models on devices, referred to as on-device models, rather than as remote cloud-hosted services has gained popularity because it avoids transmitting users' data off the device and offers shorter response times. However, on-device models are easy to attack: they can be obtained by unpacking the corresponding apps, which fully exposes the model to attackers. Recent studies show that adversaries can easily mount white-box-like attacks against an on-device model or even invert its training data. To protect on-device models from white-box attacks, we propose a novel technique called model obfuscation. Specifically, model obfuscation hides and obfuscates the key information of models (structure, parameters, and attributes) through renaming, parameter encapsulation, neural structure obfuscation, shortcut injection, and extra layer injection. We have developed a prototype tool, ModelObfuscator, that automatically obfuscates on-device TFLite models. Our experiments show that the proposed approach dramatically improves model security by significantly increasing the difficulty of extracting a model's inner information, without increasing the latency of DL models. Our on-device model obfuscation has the potential to become a fundamental technique for on-device model deployment. Our prototype tool is publicly available at https://github.com/AnonymousAuthor000/Code2536.
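As a toy illustration of one of the listed strategies, extra layer injection, the sketch below (plain Python, not the ModelObfuscator implementation) inserts an identity linear layer into a tiny dense network: the obfuscated model computes exactly the same function while exposing a different, longer structure to anyone inspecting it.

```python
# Toy sketch of "extra layer injection": an identity linear layer is
# inserted into a tiny dense network. The obfuscated model computes the
# same function but exposes a different (longer) structure to an attacker.
# Illustrative only; NOT the ModelObfuscator implementation.

def dense(weights, bias):
    """Return a dense layer computing y_j = sum_i x_i * W[i][j] + b[j]."""
    def layer(x):
        return [sum(xi * wij for xi, wij in zip(x, col)) + bj
                for col, bj in zip(zip(*weights), bias)]
    return layer

def run(layers, x):
    for layer in layers:
        x = layer(x)
    return x

# Original two-layer model.
model = [dense([[1.0, 2.0], [3.0, 4.0]], [0.5, -0.5]),
         dense([[1.0], [1.0]], [0.0])]

# Inject an identity layer (W = I, b = 0) between the two layers.
identity = dense([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
obfuscated = [model[0], identity, model[1]]

x = [1.0, -1.0]
print(run(model, x), run(obfuscated, x))  # identical outputs: [-4.0] [-4.0]
```

Because the injected layer is the identity map, the transformation is semantics-preserving; only the structure visible to a reverse engineer changes.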

1. INTRODUCTION

Numerous edge and mobile devices are leveraging deep learning (DL) capabilities. Although DL models can be deployed on a cloud platform, data transmission between mobile devices and the cloud may compromise user privacy and suffer from severe latency and throughput issues. To achieve a high level of security, users' personal data should not be sent off the device. To achieve high throughput and short response times, especially across a large number of devices, on-device DL models are needed. The capabilities of newer mobile devices and some edge devices keep increasing, with more powerful systems on a chip (SoCs) and larger amounts of memory, making them suitable for running on-device models. Indeed, many intelligent applications have already been deployed on devices (Xu et al., 2019) and benefit millions of users. Unfortunately, it has been shown that on-device DL models can be easily extracted, and the extracted model can then be used to mount many kinds of attacks, such as adversarial attacks, membership inference attacks, and model inversion attacks (Szegedy et al., 2013; Chen et al., 2017b; Shokri et al., 2017; Fang et al., 2020). A deployed DL model can be extracted by three kinds of attacks: (1) extracting the model's weights through queries (Tramèr et al., 2016); (2) extracting the entire model from devices using software analysis (Vallée-Rai et al., 2010) or reverse engineering (Li et al., 2021b); (3) extracting the model's architecture through side-channel attacks (Li et al., 2021a). Based on our observations, existing defense methods fall into two levels: (1) the algorithm level and (2) the side-channel level. To secure an AI model at the algorithm level, some studies (Orekondy et al., 2019b; Kariyappa & Qureshi, 2020; Mazeika et al., 2022) propose methods to degrade the effectiveness of query-based model extraction.
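To make attack category (1) concrete, the toy sketch below (a hypothetical example, not taken from the cited papers) shows how a purely linear model y = xW + b leaks its parameters to a black-box attacker: d + 1 queries (the zero vector plus the d standard basis vectors) recover W and b exactly.

```python
# Toy sketch of query-based model extraction (attack category (1)):
# for a plain linear model y = x W + b, an attacker with query access
# alone recovers W and b exactly using d + 1 probes (d = input size).
# Hypothetical illustration only.

def victim(x):
    """Black-box linear model; W and b are hidden from the attacker."""
    W = [[2.0, -1.0], [0.5, 3.0]]
    b = [1.0, -2.0]
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(2)]

# Probe with the zero vector to obtain b, then with basis vectors for W.
b_hat = victim([0.0, 0.0])
W_hat = []
for i in range(2):
    e = [0.0, 0.0]
    e[i] = 1.0
    y = victim(e)
    W_hat.append([y[j] - b_hat[j] for j in range(2)])

print(W_hat, b_hat)  # recovers the hidden W and b exactly
```

Real DL models are nonlinear, so exact recovery is harder, but this is the intuition behind query-based extraction that the algorithm-level defenses above try to frustrate.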
Other studies (Xu et al., 2018; Szentannai et al., 2019; 2020) propose training a surrogate model that performs similarly to the original model but is more resilient to extraction attacks. For securing

