HOLISTIC ADVERSARIALLY ROBUST PRUNING

Abstract

Neural networks can be drastically shrunk in size by removing redundant parameters. While crucial for deployment on resource-constrained hardware, compression oftentimes comes with a severe drop in accuracy and a lack of adversarial robustness. Despite recent advances, counteracting both aspects has only succeeded for moderate compression rates so far. We propose a novel method, HARP, that copes with aggressive pruning significantly better than prior work. For this, we consider the network holistically. We learn a global compression strategy that optimizes how many parameters (compression rate) and which parameters (scoring connections) to prune specific to each layer individually. Our method fine-tunes an existing model with a dynamic regularization that follows a step-wise incremental function balancing the different objectives. It starts by favoring robustness before shifting focus to reaching the target compression rate, and only then handles both objectives equally. The learned compression strategies allow us to maintain the pre-trained model's natural accuracy and its adversarial robustness while reducing the network by 99 % of its original size. Moreover, we observe a crucial influence of non-uniform compression across layers. The implementation of HARP is publicly available at https://intellisec.de/research/harp.

1. INTRODUCTION

Deep neural networks (DNNs) yield remarkable performance in classification tasks in various domains (He et al., 2016; Cakir & Dogdu, 2018; Schroff et al., 2015; Li et al., 2022) but are vulnerable to input manipulation attacks such as adversarial examples (Szegedy et al., 2014). Small perturbations to benign inputs can cause worst-case errors in prediction. To date, the most promising defensive approach against this sort of attack is adversarial training as introduced by Madry et al. (2018) and further refined ever since (Shafahi et al., 2019; Zhang et al., 2019; Wang et al., 2020). It introduces adversarial examples into the training process, diminishing the generalization gap between natural performance and adversarial robustness. However, there is evidence that higher robustness requires over-parameterized networks with wider layers and higher structural complexity (Madry et al., 2018; Zhang et al., 2019; Wu et al., 2021), rendering the task of combining both objectives, compression and robustness, inherently difficult. Neural network pruning (Han et al., 2016; Yang et al., 2017; He et al., 2017), for instance, has proven to be an extraordinarily valuable tool for compressing neural networks. The model can be reduced to a fraction of its size by removing redundancy at different structural granularities (Li et al., 2017; Mao et al., 2017; Kollek et al., 2021). However, pruning inflicts a loss in model accuracy (Han et al., 2016) and adversarial robustness (Timpl et al., 2021) that becomes unavoidable the stronger the model is compressed. The aim of adversarially robust pruning hence is to maintain the accuracy and robustness of an already adversarially pre-trained model as well as possible. Despite great efforts (Ye et al., 2019; Sehwag et al., 2020; Madaan et al., 2020; Özdenizci & Legenstein, 2021; Lee et al., 2022), this dual objective has only been achieved for moderate compression so far.
In this paper, we start from the hypothesis that effective adversarially robust pruning requires a non-uniform compression strategy with learnable pruning masks, and propose our method HARP. We follow the three-stage pruning pipeline proposed by Han et al. (2015) to improve upon pre-trained models, where we jointly optimize score-based pruning masks and layer-wise compression rates during fine-tuning. As high robustness challenges the compactness objective (Timpl et al., 2021), we employ a step-wise increasing weighted control of the number of weights to be pruned, such that we can learn masks and rates simultaneously. Our approach explores a global pruning strategy that allows for on-par natural accuracy with only little robustness degradation. In summary, we make the following contributions:

• Novel pruning technique for pre-trained models. We optimize how many parameters and which parameters to prune for each layer individually, resulting in a global but non-uniform pruning strategy. That is, the overall network is reduced by a predetermined rate governed by the target hardware's limits, but layers are compressed varyingly strongly. We show that both aspects are needed for HARP to take full effect (cf. Section 4.1).

• Significant improvement over related work. An overview of our method's performance is presented in Fig. 1

2. NEURAL NETWORK PRUNING

Removing redundant parameters reduces a network's overall memory footprint and the necessary computations, allowing for a demand-oriented adaptation of neural networks to resource-constrained environments (Han et al., 2015; 2016; Wen et al., 2016; Huang et al., 2018; He et al., 2018; 2017; Li et al., 2017; Mao et al., 2017; Molchanov et al., 2017). Neural network pruning attempts to find a binary mask M(l) for each layer l of a network with L layers, represented by its parameters θ and θ(l), respectively. The overall pruning mask thus is denoted as M = (M(1), . . . , M(l), . . . , M(L)). These masks specify which parameters of the layer θ(l) to remove (zero out) and which to keep, yielding a reduced parameter set θ̂(l) = θ(l) ⊙ M(l), where ⊙ is the Hadamard product. Based on this, we define the overall compression rate a as the ratio of the number of parameters preserved after pruning, |Θ≠0|, to the total number of parameters in the model, |Θ|. Compression rates for individual layers a(l) are defined analogously. Note that a network's sparsity is defined inversely, meaning a 99.9 % sparsity refers to a compression rate of 0.001. In the following, we consider θ(l) ∈ ℝ^(c_i(l) × c_o(l) × k(l) × k(l)), where c_i(l) and c_o(l) represent the number of input and output channels, respectively, and k(l) is the kernel size. Han et al. (2015) propose a three-stage pipeline for network pruning, starting with (1) training an over-parameterized network, followed by (2) removing redundancy per layer according to some pruning criterion, before (3) recovering network performance via fine-tuning. The actual pruning strategy, that is, the choice of the pruning mask M, is obtained in the second step.
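The definitions above can be illustrated with a minimal sketch of layer-wise magnitude pruning; the helper names are ours for illustration and not part of HARP. A binary mask keeps the largest-magnitude weights of a layer, the pruned layer is θ(l) ⊙ M(l), and the compression rate a is the fraction of parameters preserved.

```python
import math

def magnitude_mask(weights, rate):
    """Binary mask M keeping the ceil(rate * n) largest-magnitude entries."""
    n_keep = math.ceil(rate * len(weights))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    mask = [0] * len(weights)
    for i in order[:n_keep]:
        mask[i] = 1
    return mask

def apply_mask(weights, mask):
    """Hadamard product theta ⊙ M: zero out the pruned parameters."""
    return [w * m for w, m in zip(weights, mask)]

def compression_rate(weights):
    """a = |Θ≠0| / |Θ|: ratio of preserved to total parameters."""
    return sum(1 for w in weights if w != 0) / len(weights)

layer = [0.3, -1.2, 0.05, 0.8, -0.01, 2.1, -0.4, 0.25]
pruned = apply_mask(layer, magnitude_mask(layer, rate=0.5))
print(compression_rate(pruned))  # 0.5
```

Magnitude scoring stands in here for the generic "pruning criterion" of step (2); score-based criteria as used by HARP or HYDRA replace the `abs(weights[i])` ranking with learned importance scores.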
While integrated approaches (pruning during model training) exist (e.g., Vemparala et al., 2021; Özdenizci & Legenstein, 2021), the staged process remains most common (Liu et al., 2019; Sehwag et al., 2020; Lee et al., 2022), as it allows to benefit from recent advances in adversarial training (Shafahi et al., 2019; Zhang et al., 2019; Wang et al., 2020) out-of-the-box.
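Under the definitions above, a global compression rate a follows from the layer-wise rates a(l) as a parameter-weighted average. The following sketch (with hypothetical layer sizes and rates) shows how a non-uniform strategy can compress layers varyingly strongly while still meeting one overall budget:

```python
# a = (sum_l a(l) * |theta(l)|) / (sum_l |theta(l)|): the global compression
# rate is the parameter-weighted average of the layer-wise rates a(l).
def global_rate(layer_sizes, layer_rates):
    kept = sum(n * a for n, a in zip(layer_sizes, layer_rates))
    return kept / sum(layer_sizes)

# Hypothetical non-uniform strategy: the largest layer is pruned hardest,
# yet the network as a whole meets a 10 % parameter budget.
sizes = [1_000, 10_000, 100_000]   # |theta(l)| per layer
rates = [0.8, 0.3, 0.073]          # a(l) per layer
print(round(global_rate(sizes, rates), 3))  # 0.1
```

A uniform strategy would simply set every a(l) to the target a; non-uniform methods trade rates between layers under this same constraint.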



Figure 1: Overview of pruning weights of a VGG16 model for CIFAR-10 (left) and SVHN (right) with PGD-10 adversarial training. Solid lines show the natural accuracy of HARP, HYDRA, Robust-ADMM and BCS-P (cf. Table 7). Dashed lines represent the robustness against AUTOATTACK.

for a PGD-AT trained VGG16 model learned on CIFAR-10 (left) and SVHN (right), providing a first glimpse of the advances yielded. No less importantly, we conduct experiments with small (CIFAR-10, SVHN) as well as large-scale (ImageNet) datasets across different robust training methods and various attacks (cf. Section 4.2).

• Importance of non-uniform pruning for adversarial robustness. We emphasize the superiority of non-uniform strategies (Zhao et al., 2022) by extending existing adversarially robust pruning techniques. We demonstrate that HYDRA (Sehwag et al., 2020) and Robust-ADMM (Ye et al., 2019) yield better results when used with non-uniform strategies determined by ERK (Evci et al., 2020) and LAMP (Lee et al., 2021) (cf. Section 4.3).
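The step-wise incremental regularization described above can be pictured as a piecewise-constant weight on the compactness objective. The breakpoints and values below are purely illustrative assumptions, chosen only to mirror the described behavior: favor robustness first, then push toward the target compression rate, then balance both objectives:

```python
# Illustrative step-wise schedule for a compactness-regularization weight
# gamma(epoch). All constants are hypothetical, not HARP's actual values.
def gamma_schedule(epoch, warmup=10, push=30):
    if epoch < warmup:   # phase 1: favor robustness, no compression pressure
        return 0.0
    elif epoch < push:   # phase 2: drive the network toward the target rate
        return 10.0
    else:                # phase 3: handle both objectives equally
        return 1.0

# Schematic combined objective: adversarial loss plus weighted size penalty.
def total_loss(adv_loss, size_penalty, epoch):
    return adv_loss + gamma_schedule(epoch) * size_penalty

print([gamma_schedule(e) for e in (0, 15, 50)])  # [0.0, 10.0, 1.0]
```

The key point is the ordering of the phases, not the concrete constants: masks and rates can be learned simultaneously because the compression constraint is tightened gradually rather than enforced from the start.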

