DEEPGUISER: LEARNING TO DISGUISE NEURAL ARCHITECTURES FOR IMPEDING ADVERSARIAL TRANSFER ATTACKS

Abstract

Security is becoming increasingly critical in deep learning applications. Recent research demonstrates that NN models are vulnerable to adversarial attacks, which can mislead them with only small input perturbations. Moreover, adversaries who know the architecture of a victim model can conduct more effective attacks. Unfortunately, this architectural knowledge can often be stolen by adversaries who exploit system-level hints leaked through side channels, which is referred to as a neural architecture extraction attack. Conventional countermeasures against neural architecture extraction introduce large overhead, and because different hardware platforms exhibit diverse types of side-channel leakage, substantial expert effort is needed to develop hardware-specific countermeasures. In this paper, we propose DeepGuiser, an automatic, hardware-agnostic, and retrain-free neural architecture disguising method that reduces the harm of neural architecture extraction attacks. In a nutshell, given a trained model, DeepGuiser outputs a deploy model that is functionally equivalent to the trained model but has a different (i.e., disguising) architecture. DeepGuiser minimizes the harm of follow-up adversarial transfer attacks on the deploy model, even if the disguising architecture is completely stolen by an architecture extraction attack. Experiments demonstrate that DeepGuiser can effectively disguise diverse architectures and impede adversarial transferability by 13.87% ∼ 32.59%, while introducing only 10% ∼ 40% extra inference latency.

1. INTRODUCTION

Deep neural networks (NNs) have achieved great success in the field of artificial intelligence (AI) (LeCun et al., 2015). As NNs become increasingly complex, a number of NN-specific chips (Jouppi et al., 2017; Liao et al., 2021; Markidis et al., 2018) and intensive innovations (Chen et al., 2020; Qiu et al., 2016; Chen et al., 2014) have been proposed to boost the efficiency of NN computing. Despite the significant progress in hardware performance, security should also be regarded as a high-priority feature. Especially in safety-critical applications, e.g., autonomous driving and surveillance, security vulnerabilities can be exploited by adversaries and lead to uncontrollable consequences.

Confidentiality is an essential guarantee for system security. The critical confidential information contained in well-trained NN models mainly comprises their neural architectures and weight parameters. While the encryption of weight parameters has been well studied for protecting weight confidentiality (Orlandi et al., 2007; Cai et al., 2019; Zuo et al., 2021), the protection of neural architectures is still lacking. Recent research has warned that many emerging and even off-the-shelf AI chips are vulnerable to neural architecture extraction attacks (Batina et al., 2018; Hua et al., 2018; Yan et al., 2020; Hu et al., 2020; Wei et al., 2018; Wang et al., 2022). For example, DeepSniffer (Hu et al., 2020) exploits the system-level hints (e.g., memory access activity, cache miss rate) of NN processing on GPU platforms and proposes a learning-based approach to automatically identify layer sequences. It also shows quantitatively that neural architecture extraction can significantly boost the success rate of adversarial transfer attacks, by letting the adversary construct a surrogate model with almost the same neural architecture as the victim model (Demontis et al., 2019; Hu et al., 2020).
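To make the transfer-attack threat concrete, the following is a minimal, hedged sketch (not the attack from DeepSniffer or this paper) of why architectural knowledge helps: an adversary crafts an FGSM-style perturbation on a surrogate model and applies it to a victim model it cannot query for gradients. For simplicity, both "models" are toy linear classifiers with similar weights, standing in for two networks that share an architecture; all weights and inputs here are illustrative values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, x):
    # Binary prediction of a linear classifier: 1 if sigmoid(w.x) > 0.5
    return int(sigmoid(w @ x) > 0.5)

# Hypothetical victim and surrogate: the surrogate mimics the victim
# closely because it shares the (extracted) architecture.
w_victim = np.array([2.0, -1.0])
w_surrogate = np.array([1.8, -0.9])

x = np.array([0.5, 0.2])  # clean input, true label 1
y = 1

# FGSM on the surrogate: for cross-entropy loss, the gradient w.r.t.
# the input is (p - y) * w, so the attacker never touches the victim.
p = sigmoid(w_surrogate @ x)
grad = (p - y) * w_surrogate
x_adv = x + 0.6 * np.sign(grad)  # epsilon = 0.6, chosen for illustration

print(predict(w_victim, x))      # 1: clean input classified correctly
print(predict(w_victim, x_adv))  # 0: surrogate-crafted perturbation transfers
```

The closer the surrogate matches the victim's architecture, the better its gradients align with the victim's, which is exactly why disguising the deployed architecture impedes such transfer attacks.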

