DEEPGUISER: LEARNING TO DISGUISE NEURAL ARCHITECTURES FOR IMPEDING ADVERSARIAL TRANSFER ATTACKS

Abstract

Security is becoming increasingly critical in deep learning applications. Recent research demonstrates that NN models are vulnerable to adversarial attacks, which can mislead them with only small input perturbations. Moreover, adversaries who know the architecture of a victim model can conduct more effective attacks. Unfortunately, this architectural knowledge can usually be stolen by adversaries who exploit system-level hints leaked through side channels, which is referred to as a neural architecture extraction attack. Conventional countermeasures against neural architecture extraction can introduce large overhead, and since different hardware platforms exhibit different types of side-channel leakage, substantial expert effort is needed to develop hardware-specific countermeasures. In this paper, we propose DeepGuiser, an automatic, hardware-agnostic, and retrain-free neural architecture disguising method that disguises neural architectures to reduce the harm of neural architecture extraction attacks. In a nutshell, given a trained model, DeepGuiser outputs a deploy model that is functionally equivalent to the trained model but has a different (i.e., disguising) architecture. DeepGuiser minimizes the harm of follow-up adversarial transfer attacks on the deploy model, even if the disguising architecture is completely stolen by an architecture extraction attack. Experiments demonstrate that DeepGuiser can effectively disguise diverse architectures and impede adversarial transferability by 13.87% ∼ 32.59%, while introducing only 10% ∼ 40% extra inference latency.

1. INTRODUCTION

Deep neural networks (NNs) have achieved great success in the field of artificial intelligence (AI) (LeCun et al., 2015). As NNs become increasingly complex, a number of NN-specific chips (Jouppi et al., 2017; Liao et al., 2021; Markidis et al., 2018) and intensive innovations (Chen et al., 2020; Qiu et al., 2016; Chen et al., 2014) have been proposed to boost the efficiency of NN computing. Despite the significant progress in hardware performance, security should also be regarded as a high-priority feature. Especially in safety-critical applications, e.g., autonomous driving and surveillance, security vulnerabilities can be exploited by adversaries and lead to uncontrollable consequences. Confidentiality is an essential guarantee for system security. The critical confidential information contained in well-trained NN models mainly comprises their neural architectures and weight parameters. While the encryption of weight parameters has been well studied for protecting weight confidentiality (Orlandi et al., 2007; Cai et al., 2019; Zuo et al., 2021), the protection of neural architectures is still lacking. Recent research has shown that many emerging and even off-the-shelf AI chips are vulnerable to neural architecture extraction attacks (Batina et al., 2018; Hua et al., 2018; Yan et al., 2020; Hu et al., 2020; Wei et al., 2018; Wang et al., 2022). For example, DeepSniffer (Hu et al., 2020) exploits the system-level hints (e.g., memory access activity, cache miss rate) of NN processing on GPU platforms and proposes a learning-based approach to automatically identify layer sequences. It also quantitatively shows that neural architecture extraction can significantly boost the success rate of adversarial transfer attacks, by constructing a surrogate model with almost the same neural architecture as the victim model (Demontis et al., 2019; Hu et al., 2020).
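The transfer-attack threat described above can be illustrated with a deliberately tiny stand-in. In the sketch below, a "victim" and an adversary's "surrogate" share the same (linear) architecture and are trained on similar data; an FGSM adversarial example crafted using only the surrogate's gradients also raises the victim's loss. The toy data, model, and hyperparameters are all illustrative assumptions, not DeepSniffer's actual attack pipeline.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.1, steps=500, seed=0):
    """Gradient-descent logistic regression (a minimal stand-in for an NN)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w

def fgsm(x, y, w, eps):
    """FGSM: step along the sign of the input-gradient of the loss."""
    grad_x = (sigmoid(w @ x) - y) * w  # d(logloss)/dx for a linear model
    return x + eps * np.sign(grad_x)

def loss(w, x, y):
    p = sigmoid(w @ x)
    return -np.log(p) if y == 1 else -np.log(1.0 - p)

# Toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-1, 0.5, (100, 2)), rng.normal(1, 0.5, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# The adversary trains a surrogate with the *same architecture*; here the two
# models differ only in their initialization seed.
victim = train_logreg(X, y, seed=0)
surrogate = train_logreg(X, y, seed=1)

x, label = X[150], y[150]
x_adv = fgsm(x, label, surrogate, eps=0.5)

# The example crafted on the surrogate transfers: the victim's loss increases.
assert loss(victim, x_adv, label) > loss(victim, x, label)
```

With a disguised architecture, the adversary's surrogate would no longer match the deployed model, which is exactly the mismatch DeepGuiser aims to maximize.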
The high risk posed by neural architecture extraction attacks necessitates the protection of neural architectures. On one hand, from the view of intellectual property protection, neural architectures are usually designed manually by experts (He et al., 2016; Simonyan & Zisserman, 2014; Sandler et al., 2018) or automatically by neural architecture search (NAS) (Cai et al., 2018; Liu et al., 2018b; Tan et al., 2019), both of which consume significant labor and resources. On the other hand, from the view of adversarial robustness, if the architecture of the deploy model is leaked, the adversaries can train a surrogate model with the same architecture and use it to craft much more effective adversarial examples against the deploy model, by exploiting the high transferability between the surrogate and the deploy model (Hu et al., 2020). It is hard to design a universal protection scheme against neural architecture extraction attacks for different kinds of AI chips at the system or hardware level, as various design options affect the hardware characteristics. Diverse run-time side-channel information can be exploited to extract neural architectures on different hardware platforms, e.g., power (Wei et al., 2018), cache activity (Yan et al., 2020), and memory access (Hua et al., 2018). Blocking all of these side-channel leakages with hardware-specific countermeasures would incur huge system costs and expert effort. In this work, we propose an "architecture disguising" solution at the algorithm level, DeepGuiser, which protects the architecture information by disguising it before deployment and alleviates the security risk posed by architecture extraction and the follow-up adversarial transfer attacks. Fig. 1 (Left) illustrates the attack scenario we are concerned with, and Fig. 1 (Right) demonstrates how DeepGuiser plays its role. As there exist diverse models to be deployed and the disguising space (introduced in Sec. 4.1) is extremely large, manually finding a good disguising architecture for every possible model is prohibitively costly or even impossible. Therefore, we design DeepGuiser to automatically and efficiently yield a good disguising architecture for a given trained model. We summarize our contributions as follows:

• DeepGuiser is an automatic, hardware-agnostic, and retrain-free neural architecture disguising framework. As shown in Fig. 1 (Right), given a trained model, the disguising policy in DeepGuiser takes the original architecture as input and outputs a disguising architecture. Then, with functionality-preserving weight transforms, DeepGuiser yields a "deploy model" that is functionally equivalent to the trained model but has the disguising architecture. This deploy model is deployed, and even if its architecture is stolen, the harm of follow-up adversarial transfer attacks on the deploy model is largely reduced.
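The notion of a functionality-preserving weight transform can be made concrete with a minimal example: splitting one linear layer into two stacked linear layers whose composition reproduces the original mapping exactly. This is an illustrative sketch of the general idea only; the invertible factor R is a hypothetical choice, and DeepGuiser's actual transforms operate on whole architectures and must preserve functionality across nonlinearities as well.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Trained" layer: y = W @ x (bias omitted for brevity).
W = rng.normal(size=(8, 16))

# Disguise: replace the single layer with two stacked linear layers whose
# composition is exactly W. R can be any invertible matrix; adding a scaled
# identity keeps it well-conditioned. Note there is no nonlinearity between
# the two new layers, which is why this simple split is exact.
R = rng.normal(size=(16, 16)) + 16 * np.eye(16)
W1 = R                        # first disguised layer
W2 = W @ np.linalg.inv(R)     # second disguised layer

# Functional equivalence: same output for any input, different architecture.
x = rng.normal(size=16)
assert np.allclose(W2 @ (W1 @ x), W @ x)
```

An extractor observing the disguised model would see two layers where the original had one, while every input is still mapped to the same output.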



Figure 1: (Left) 1: The adversary can snoop the system to extract the architecture of the deployed model. 2&3: The architecture information can be utilized to train a surrogate with high transferability and then craft effective adversarial examples to attack the trained model. (Right) DeepGuiser disguises the trained model to a functionally equivalent deploy model with a disguising architecture. Then, this deploy model is deployed onto the chip. Even if the adversary extracts the disguising architecture through snooping and trains the surrogate model, the adversarial examples crafted using the surrogate have low transferability to the original trained model and also the actual deployed model.

