IMPROVING THE TRANSFERABILITY OF ADVERSARIAL ATTACKS THROUGH EXPERIENCED PRECISE NESTEROV MOMENTUM

Anonymous

Abstract

Deep Neural Networks (DNNs) are vulnerable to adversarial attacks, which allows adversarial attacks to serve as a means of evaluating the robustness of DNNs. However, adversarial attacks achieve high white-box attack success rates but transfer poorly, making black-box attacks impracticable in the real world. Momentum-based attacks were proposed to accelerate optimization and thereby improve transferability. Nevertheless, conventional momentum-based attacks accelerate optimization inefficiently during the early iterations because the initial value of the momentum is zero, which leads to unsatisfactory transferability. Therefore, we propose Experienced Momentum (EM), a pre-trained momentum. Initializing the momentum to EM helps accelerate optimization during the early iterations. Moreover, the pre-update of conventional Nesterov-momentum-based attacks is rough, prompting us to propose Precise Nesterov momentum (PN). PN refines the pre-update by considering the gradient of the current data point. Finally, we integrate EM with PN as Experienced Precise Nesterov momentum (EPN) to further improve transferability. Extensive experiments against normally trained and defense models demonstrate that our EPN is more effective than conventional momentum in improving transferability. Specifically, the attack success rates of our EPN-based attacks are ∼11.9% and ∼13.1% higher on average than those of conventional momentum-based attacks against normally trained and defense models, respectively.

1. INTRODUCTION

Deep neural networks (DNNs) (Krizhevsky et al., 2012; Szegedy et al., 2015; He et al., 2016; Ioffe & Szegedy, 2015) have been widely applied in computer vision, e.g., autonomous driving (Franchi et al., 2022; Hao et al., 2019; Cococcioni et al., 2018), facial recognition (Chrysos et al., 2020; Ghenescu et al., 2018), and medical image analysis (Akselrod-Ballin et al., 2016; Ding et al., 2017; Liu et al., 2019). However, Szegedy et al. (2013) found that applying certain imperceptible perturbations to images can make DNNs misclassify, and they refer to such perturbed images as adversarial examples (AEs). Adversarial examples pose a huge threat to the security of DNNs, which has attracted extensive attention from researchers.

Adversarial attacks can be categorized into white-box attacks and black-box attacks. Typically, iterative gradient-based attacks (Kurakin et al., 2016; Madry et al., 2017) and optimization-based attacks (Carlini & Wagner, 2017) have high white-box but low black-box attack success rates, which makes such attacks impracticable in the real world. Transferability, which means that adversarial examples crafted on a source model remain effective on other models, makes black-box attacks feasible. Furthermore, iterative gradient-based attacks have the advantages of low computational cost and fast generation speed; improving their transferability has therefore become a hotspot in the field of adversarial attacks.

Many methods have been proposed to improve the transferability of iterative gradient-based attacks. These methods can be classified into three branches: improving optimization algorithms, input transformations, and disrupting the feature space. For example, MI-FGSM (Dong et al., 2018), NI-FGSM (Lin et al., 2019), and VM(N)I-FGSM (Wang & He, 2021) improve the gradient ascent (or descent) algorithm to escape from saddle points and poor local extrema; DIM (Xie et al., 2019), TIM (Dong et al., 2019), and SIM (Lin et al., 2019) craft adversarial examples on a set of models derived by input transformations to prevent overfitting and improve transferability; NRDM (Naseer et al., 2018), FDA (Ganeshan et al., 2019), and FIA (Wang et al., 2021) disrupt deep features of DNNs to craft highly transferable adversarial examples.

The adversarial attacks mentioned above mostly adopt momentum (Polyak, 1964; Nesterov, 1983) to accelerate optimization. However, such momentum-based adversarial attacks (e.g., M(N)I-FGSM, VM(N)I-FGSM, and FIA) initialize the momentum to zero, resulting in inefficient acceleration because the momentum accumulates few gradients during the first few iterations. Therefore, we propose Experienced Momentum (EM), which is pre-trained momentum. Before the iterations, the momentum is initialized to EM instead of zero, leading to better acceleration in the first few iterations. The comparison of conventional Polyak momentum (Polyak, 1964) and experienced Polyak momentum is shown in Fig. 1. To prevent overfitting on the source model, we train EM on a set of models derived by Random Channels Swapping (RCS). EM and RCS are detailed in Sec. 3.1.

Furthermore, adversarial attacks (e.g., NI-FGSM and VNI-FGSM) based on Nesterov momentum (i.e., Nesterov Accelerated Gradient, NAG (Nesterov, 1983)) have the disadvantage that the pre-update is rough. Specifically, during each iteration, the parameters are first pre-updated along the momentum to obtain the pre-update point, which is an estimation of the next position.
The pre-update is then modified by the gradient of the pre-update point. This looking-ahead property of Nesterov momentum allows the parameters to escape from saddle points and poor local extrema more easily and quickly, improving transferability. However, pre-updating only along the momentum is rough, so the estimate of the next position of the parameters is imprecise. Therefore, we propose Precise Nesterov momentum (PN), which retains the looking-ahead property while refining the pre-update with the gradient of the current data point. To improve transferability further, we integrate EM with PN as Experienced Precise Nesterov momentum (EPN). PN and EPN are detailed in Sec. 3.2.

Overall, we make the following contributions:

• We propose Experienced Momentum (EM), which is trained on a set of models derived by Random Channels Swapping (RCS). Initializing the momentum to EM accelerates optimization effectively during the early iterations, improving transferability.

• We propose Precise Nesterov momentum (PN), which adopts the gradient of the current data point to refine the pre-update, making it easier and faster to escape from saddle points and poor local extrema. We also integrate EM with PN as Experienced Precise Nesterov momentum (EPN) to improve transferability further.

• Extensive experiments on normally trained and defense models demonstrate that our EPN is more effective than conventional momentum for improving transferability.

2. RELATED WORK

2.1. TRANSFERABLE ADVERSARIAL ATTACKS

Since adversarial examples were discovered by Szegedy et al. (2013), many methods (Goodfellow et al., 2014; Kurakin et al., 2016; Carlini & Wagner, 2017) have been proposed to craft adversarial examples and demonstrate the vulnerability of DNNs. We focus on the transferability of iterative gradient-based attacks and review related works along three branches: improving optimization algorithms, input transformations, and disrupting the feature space.

Improving optimization algorithms. Dong et al. (2018) integrated Polyak momentum (Polyak, 1964) into I-FGSM (Kurakin et al., 2016) to accelerate gradient ascent (or descent) and improve transferability. Inspired by the fact that Nesterov momentum (Nesterov, 1983) is superior to Polyak momentum, Lin et al. (2019) integrated Nesterov momentum into I-FGSM to improve transferability further. Wang & He (2021) used the gradient variance of the previous iteration to tune the current gradient, stabilizing the update direction and escaping from saddle points and poor local extrema.

3. METHODOLOGY

Given a target model f′(x; θ′), where x is an input and θ′ denotes the parameters of f′, let J(·, y) be a loss function, where y is the ground-truth label of the input x. A non-targeted adversarial example x^adv satisfies f′(x; θ′) ≠ f′(x^adv; θ′) under the constraint ||x^adv − x||_p ≤ ϵ, where ||·||_p denotes the L_p norm and p is generally 0, 1, 2, or ∞. In this paper, we focus on p = ∞; our methods generalize easily to p = 0, 1, 2. Crafting non-targeted adversarial examples can be described as solving the following optimization problem:

arg max_{x^adv} J(f′(x^adv; θ′), y),  s.t. ||x^adv − x||_p ≤ ϵ.

We focus on non-targeted attacks. Our proposed methods can easily be transformed into targeted attacks by replacing the above objective function with −J(f′(x^adv; θ′), y*), where y* denotes the target label.
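The constrained maximization above is typically solved with iterative sign-gradient steps followed by a projection back into the L_∞ ball. As a minimal sketch (not the paper's code), assuming NumPy arrays with pixel values in [0, 255] and a loss gradient supplied by the caller:

```python
import numpy as np

def linf_clip(x_adv, x, eps):
    """Project x_adv back into the L_inf ball of radius eps around x,
    and into the valid pixel range [0, 255]."""
    return np.clip(np.clip(x_adv, x - eps, x + eps), 0.0, 255.0)

def ifgsm_step(x_adv, x, grad, alpha, eps):
    """One I-FGSM ascent step on the loss, followed by the projection."""
    return linf_clip(x_adv + alpha * np.sign(grad), x, eps)
```

Here `linf_clip` plays the role of the Clip_{x,ϵ} operator used throughout the paper: the inner clip enforces the ϵ-ball and the outer clip keeps pixel values valid.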

3.1. EXPERIENCED MOMENTUM

Momentum-based attacks initialize the momentum to zero, resulting in inefficient acceleration during the first few iterations. Therefore, we propose Experienced Momentum (EM), which is pre-trained momentum. Setting the initial momentum to EM accelerates the optimization during the early iterations. To prevent overfitting of EM and improve transferability further, we train EM on a set of models derived by Random Channels Swapping (RCS). RCS derives models by randomly swapping the channels of the input image, which is equivalent to randomly swapping the "block" dimensions of the original model, leading to diverse decision boundaries among the derived models. Training EM on the derived models therefore prevents overfitting.

The specific procedure for training EM is as follows. First, we perform RCS on the input image x. Specifically, we denote the input image x as an RGB triplet (R, G, B); the input image after RCS is denoted S(x), where S(x) ∈ {(R, G, B), (R, B, G), (G, R, B), (G, B, R), (B, R, G), (B, G, R)} and S(·) denotes RCS. Second, S(x) is fed into the source model f to obtain the derived model f(S(·); θ). Third, we pre-perturb the input image x on the derived model by iterative gradient-based attacks to prevent overfitting. As shown in Fig. 2, we accumulate gradients to train EM during each iteration. Finally, we repeat the above procedure to make EM more generalizable. After training EM, we set the initial value of the momentum to EM to accelerate the early iterations.
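A toy sketch of RCS and EM pre-training under these assumptions (NumPy arrays, a hypothetical `grad_fn` standing in for the source model's loss gradient, L_∞ clipping omitted for brevity; this is an illustration, not the authors' implementation):

```python
import random
import numpy as np

# The six channel orders S(x) can produce for an H x W x 3 RGB image.
CHANNEL_ORDERS = [(0, 1, 2), (0, 2, 1), (1, 0, 2),
                  (1, 2, 0), (2, 0, 1), (2, 1, 0)]

def random_channel_swap(x, rng):
    """RCS: randomly permute the RGB channels of an H x W x 3 image."""
    order = rng.choice(CHANNEL_ORDERS)
    return x[..., list(order)]

def pretrain_em(x, grad_fn, alpha, mu, epochs, T, rng):
    """Accumulate Experienced Momentum on RCS-derived models.

    grad_fn(x) is a hypothetical callable returning the loss gradient of
    the source model at the (channel-swapped) input."""
    g_exp = np.zeros_like(x)
    for _ in range(epochs):
        x_adv = x.copy()                               # restart from the clean image
        for _ in range(T):
            g = grad_fn(random_channel_swap(x_adv, rng))
            g = g / max(np.abs(g).sum(), 1e-12)        # L1 normalization
            g_exp = g + mu * g_exp                     # accumulate into EM
            x_adv = x_adv + alpha * np.sign(g_exp)     # pre-perturb the copy
    return g_exp
```

The returned `g_exp` is then used as the initial momentum of the subsequent attack, instead of zero.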

3.2. PRECISE NESTEROV MOMENTUM

Nesterov momentum based attacks (e.g., NI-FGSM (Lin et al., 2019) and VNI-FGSM (Wang & He, 2021)) pre-update only along the momentum, so the pre-update point, i.e., the estimate of the next iterative position, is imprecise. Against this disadvantage, we propose Precise Nesterov momentum (PN), which considers the gradient of the current data point in the pre-update to make the pre-update precise. Specifically, during each iteration, the pre-update is performed along the gradient of the current data point and the momentum successively to obtain the pre-update point, and then the gradient of the pre-update point modifies the update. We integrate PN into I-FGSM as PNI-FGSM. The t-th iteration of PNI-FGSM can be formalized as follows:

x̃_t^adv = x_t^adv + α · ∇_{x_t^adv} J(f(x_t^adv; θ), y) / ||∇_{x_t^adv} J(f(x_t^adv; θ), y)||_1 + μ · g_{t−1},

g_t = ∇_{x_t^adv} J(f(x_t^adv; θ), y) / ||∇_{x_t^adv} J(f(x_t^adv; θ), y)||_1 + μ · g_{t−1} + ∇_{x̃_t^adv} J(f(x̃_t^adv; θ), y) / ||∇_{x̃_t^adv} J(f(x̃_t^adv; θ), y)||_1,

x_{t+1}^adv = Clip_{x,ϵ}(x_t^adv + α · sign(g_t)),

where x̃_t^adv denotes the pre-update point, g_t denotes the momentum, g_0 = 0, and μ denotes the decay factor.

We combine EM and PN as Experienced Precise Nesterov momentum (EPN) to further improve transferability. The algorithm of EPNI-FGSM, which integrates EPN into I-FGSM, is summarized in Algorithm 1.

Algorithm 1: Experienced Precise Nesterov momentum I-FGSM (EPNI-FGSM)
Input: A source model f with parameters θ and a loss function J. An original image x with ground-truth label y.
Input: The maximum perturbation ϵ, the number of iterations T, and the decay factor μ.
Input: The epochs of pre-training, epochs.
Output: An adversarial example x^adv.
1  α ← ϵ/T; g^exp ← 0
2  for n ← 1 to epochs do  (pre-train EM on RCS-derived models)
3      x̃_1^adv ← x
4      for t ← 1 to T do
5          x̂_t^adv ← x̃_t^adv + α · ∇_{x̃_t^adv} J(f(S(x̃_t^adv); θ), y) / ||∇_{x̃_t^adv} J(f(S(x̃_t^adv); θ), y)||_1 + μ · g^exp
6          g^exp ← ∇_{x̃_t^adv} J(f(S(x̃_t^adv); θ), y) / ||∇_{x̃_t^adv} J(f(S(x̃_t^adv); θ), y)||_1 + μ · g^exp + ∇_{x̂_t^adv} J(f(x̂_t^adv; θ), y) / ||∇_{x̂_t^adv} J(f(x̂_t^adv; θ), y)||_1
7          x̃_{t+1}^adv ← Clip_{x,ϵ}(x̃_t^adv + α · sign(g^exp))
8  g_0 ← g^exp; x_1^adv ← x
9  Run the PNI-FGSM updates above for t = 1, ..., T with the momentum initialized to g_0
10 return x^adv ← x_{T+1}^adv

Particularly, if ∇_{x̃_t^adv} J(f(x̃_t^adv; θ), y) / ||∇_{x̃_t^adv} J(f(x̃_t^adv; θ), y)||_1 = 0, EPNI-FGSM degrades to Experienced MI-FGSM (EMI-FGSM). If ∇_{x_t^adv} J(f(x_t^adv; θ), y) / ||∇_{x_t^adv} J(f(x_t^adv; θ), y)||_1 = 0 in the pre-update, EPNI-FGSM degrades to Experienced NI-FGSM (ENI-FGSM). If epochs = 0, EPNI-FGSM degrades to PNI-FGSM.
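A single PNI-FGSM iteration can be sketched as follows (a simplified NumPy illustration, with a hypothetical `grad_fn` standing in for backpropagation through the source model):

```python
import numpy as np

def l1_normalize(g):
    """Normalize a gradient by its L1 norm, guarding against zero."""
    return g / max(np.abs(g).sum(), 1e-12)

def pni_step(x_adv, x, g_prev, grad_fn, alpha, mu, eps):
    """One PNI-FGSM iteration: the pre-update moves along BOTH the
    current gradient and the momentum, then the gradient at the
    pre-update point refines the update direction."""
    g_cur = l1_normalize(grad_fn(x_adv))
    x_pre = x_adv + alpha * g_cur + mu * g_prev            # precise pre-update point
    g_new = g_cur + mu * g_prev + l1_normalize(grad_fn(x_pre))
    x_next = np.clip(x_adv + alpha * np.sign(g_new), x - eps, x + eps)
    return x_next, g_new
```

Initializing `g_prev` with a pre-trained `g_exp` instead of zeros turns this sketch into the EPN variant.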

4. EXPERIMENTS

We conduct extensive experiments on normally trained and defense models to validate that our EPN is more effective than conventional momentum. We first present the experimental settings in Sec. 4.1. Then, we report the results for attacking normally trained and defense models in Sec. 4.2 and Sec. 4.3, respectively. Finally, we provide ablation studies in Sec. 4.4. Table 1 introduces the abbreviations used in the paper.

4.1. EXPERIMENTAL SETTINGS

Dataset. We follow previous works (Dong et al., 2019; Wang et al., 2021) and use the DEV dataset from the NIPS17 Adversarial Attacks and Defenses Competition, which contains 1000 images of size 299 × 299.

Target Models. Seventeen normally trained models: GoogLeNet (Iv1) (Szegedy et al., 2015), Inception-v3 (Iv3) (Szegedy et al., 2016), Inception-v4 (Iv4), Inception-ResNet-v2 (IRv2) (Szegedy et al., 2017), ResNet-18 (R18), ResNet-34 (R34), ResNet-50 (R50), ResNet-101 (R101), ResNet-152 (R152) (He et al., 2016), VGG11 (V11), VGG13 (V13), VGG16 (V16), VGG19 (V19) (Simonyan & Zisserman, 2014), DenseNet-121 (D121), DenseNet-169 (D169), DenseNet-201 (D201), and DenseNet-161 (D161) (Huang et al., 2017). Ten defense models (i.e., adversarially trained models): Adv-Inception-v3 (Iv3_adv), Ens-Inception-ResNet-v2 (IRv2_ens) (Tramèr et al., 2017), and Adv-EfficientNet-b0 (Eb0_adv) through Adv-EfficientNet-b7 (Eb7_adv).

Baselines. For a fair comparison of our EPN and conventional momentum, we replace conventional momentum with our EPN in momentum-based attacks, i.e., MI-FGSM (Dong et al., 2018), NI-FGSM (Lin et al., 2019), DI-MI-FGSM (Xie et al., 2019), TI-MI-FGSM (Dong et al., 2019), SI-NI-FGSM (Lin et al., 2019), VT-MI-FGSM (Wang & He, 2021), VT-NI-FGSM (Wang & He, 2021), and FI-MI-FGSM (Wang et al., 2021). We then compare the transferability of the conventional momentum-based attacks and our EPN-based attacks.

Hyperparameters. In all experiments, we follow the official default settings. Specifically, the maximum perturbation ϵ = 16, the number of iterations T = 10, the step size α = ϵ/T = 1.6, and the decay factor μ = 1.0. For DIM (Xie et al., 2019), the probability p is set to 0.5. For TIM (Dong et al., 2019), the size of the Gaussian kernel is set to 15 × 15. For SIM (Lin et al., 2019), the number of scale copies m is set to 5. For VT-MI-FGSM (Wang & He, 2021) and VT-NI-FGSM (Wang & He, 2021), the number of sampled examples N is set to 20, and the parameter β for the upper bound of the neighborhood is set to 1.5. For FI-MI-FGSM (Wang et al., 2021), the drop probability p_d is set to 0.3 when attacking normally trained models and 0.1 when attacking defense models, the ensemble number N is set to 30 in the aggregate gradient, and the intermediate layer is set to Mixed_5b for Iv3, Conv_4a for IRv2, Conv3_3 for V16, and the last layer of the second block for R152. For our EM-based attacks, epochs is set to 5.

4.2. ATTACK NORMALLY TRAINED MODELS

To validate that EPN-based attacks achieve higher transferability than conventional momentum-based attacks, we choose Iv3, IRv2, R152, and V16 as the source model in turn and attack normally trained target models via our EPN-based methods and the baseline methods. The attack success rates are shown in Table 2. The results show that the attack success rates of our EPN-based methods are ∼11.9% higher than those of the baseline methods on average. In particular, our EPN-based attacks achieve the best transferability against normally trained target models when the source model is R152: the average attack success rates of EPNI-FGSM, DI-EPNI-FGSM, VT-EPNI-FGSM, and FI-EPNI-FGSM are 92.0%, 97.2%, 96.0%, and 96.0%, respectively. These experiments demonstrate that our EPN improves transferability more effectively than conventional momentum against normally trained models.

4.3. ATTACK DEFENSE MODELS

To further compare transferability, we also use defense models as the target models; the source models are still Iv3, IRv2, R152, and V16. We craft adversarial examples on each source model via our EPN-based methods and the baseline methods to attack the defense models. The attack success rates are shown in Table 3. The results show that the attack success rates of our EPN-based methods are ∼13.1% higher than those of the baseline methods on average. Adversarial examples crafted on R152 again show the best transferability against defense models: the average attack success rates of EPNI-FGSM, DI-EPNI-FGSM, VT-EPNI-FGSM, and FI-EPNI-FGSM are 51.7%, 75.8%, 73.6%, and 76.9%, respectively. These results indicate that our EPN remains more effective than conventional momentum against defense models.

4.4. ABLATION STUDY

We conduct ablation studies for EPNI-FGSM. We investigate the impacts of two hyperparameters (i.e., the decay factor µ and the epochs of pretraining epochs) on the transferability of EPNI-FGSM in Sec. 4.4.1. We further study the impacts of EM and PN on transferability in Sec. 4.4.2.

4.4.1. IMPACTS OF µ AND epochs

The source models are set to Iv3, IRv2, R152, and V16. We use EPNI-FGSM to craft adversarial examples to attack normally trained models and defense models, respectively. We investigate the impacts of μ and epochs on the transferability of EPNI-FGSM by computing the average attack success rates against normally trained models (except the source model) and defense models.

The decay factor μ. The decay factor μ plays a vital role in momentum. If μ = 0, momentum-based attacks degrade to vanilla iterative gradient-based attacks. If 0 < μ < 1, the previous gradients accumulated in the momentum decay exponentially. If μ = 1, the momentum simply adds up all previous gradients. If μ > 1, the previous gradients accumulated in the momentum grow exponentially. We pre-set epochs = 5 and vary μ from 0.0 to 2.0 with a step size of 0.1. The average attack success rates are shown in Fig. 3. When μ ≤ 1.0, the average attack success rates show an upward trend, and when μ ≥ 1.0, they show a downward trend. Therefore, we set μ = 1.0 for EPNI-FGSM to achieve the best transferability.

The epochs of pretraining, epochs. epochs affects the amount of gradient accumulated in EM. We pre-set μ = 1.0 and vary epochs from 0 to 10 with a step size of 1. The average attack success rates are shown in Fig. 4. As epochs increases, the average attack success rates increase and gradually converge. Since a larger epochs incurs a higher computational cost, we set epochs = 5 for EPNI-FGSM to strike a balance between computational cost and transferability.

In summary, we set the decay factor μ = 1.0 and the epochs of pretraining epochs = 5 for EPNI-FGSM. The other EPN-based attacks use the same settings for these two hyperparameters.
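The qualitative regimes of μ described above can be checked with a toy recurrence, assuming a constant unit gradient at every iteration:

```python
# Effect of the decay factor mu on the accumulated momentum, assuming
# a constant unit gradient at every iteration (a toy illustration).
def accumulated_momentum(mu, T):
    g = 0.0
    for _ in range(T):
        g = 1.0 + mu * g   # each iteration adds the current (unit) gradient
    return g
```

With μ = 0 the momentum holds only the latest gradient; with μ = 1 it sums all T gradients; with 0 < μ < 1 it converges to the geometric sum 1/(1 − μ); with μ > 1 it grows exponentially in T.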

4.4.2. IMPACTS OF EM AND PN

The source models are the same as in Sec. 4.4.1. To investigate the impacts of EM and PN, we craft adversarial examples on the source models via ENI-FGSM (only with EM), PNI-FGSM (only with PN), and EPNI-FGSM (with both EM and PN), respectively. In addition, we use MI-FGSM and NI-FGSM (without EM and PN) as baselines. For ENI-FGSM, the epochs of pretraining epochs is set to 5. For ENI-FGSM and PNI-FGSM, the decay factor μ is set to 1.0. The average attack success rates against normally trained models and defense models are shown in Fig. 5. The average attack success rates of ENI-FGSM are higher than those of MI-FGSM and NI-FGSM, demonstrating that EM improves transferability more than conventional momentum; the same holds for PN. Moreover, the average attack success rates of EPNI-FGSM are higher than those of ENI-FGSM and PNI-FGSM, demonstrating that the combination of EM and PN can further improve transferability.

5. CONCLUSION

In this work, we proposed Experienced Momentum (EM) and Precise Nesterov momentum (PN) to boost transferability. Specifically, EM is trained on a set of models derived by Random Channels Swapping (RCS), and the conventional momentum is then initialized to EM, which accelerates optimization and helps escape from saddle points and poor local extrema during the early iterations, improving transferability. Additionally, we adopted the current gradient to refine the pre-update of conventional Nesterov momentum, yielding PN. We then naturally combined EM and PN as EPN to improve transferability further. Extensive experiments demonstrate that EPN-based attacks have higher transferability than conventional momentum-based attacks. However, our methods still adopt a fixed learning rate (step size), which is crucial for the optimizer. Therefore, we will explore the impact of the learning rate on transferability in future work.



Figure 1: Comparison of conventional Polyak momentum and experienced Polyak momentum. Adversarial examples are crafted on the source model (Inception-v3) and used to attack the target model (VGG16). Our Experienced MI-FGSM (EMI-FGSM), which integrates EM into MI-FGSM, causes misclassification with higher loss and confidence than MI-FGSM, thus EMI-FGSM can mislead the attention of the target model better than MI-FGSM.

Figure 2: Illustration of training EM during each iteration.

Table 1: The abbreviations used in the paper.

Abbreviation        Explanation
D(T)I-MI-FGSM       the combination of D(T)IM and MI-FGSM
SI-NI-FGSM          the combination of SIM and NI-FGSM
D(T,S)I-EPNI-FGSM   the combination of D(T,S)IM and EPNI-FGSM
VT-M(N)I-FGSM       i.e., VM(N)I-FGSM
VT-EPNI-FGSM        the combination of Variance Tuning (VT) (Wang & He, 2021) and EPNI-FGSM
FI-MI-FGSM          i.e., FIA
FI-EPNI-FGSM        the combination of Feature Importance-aware (FI) (Wang et al., 2021) and EPNI-FGSM



Figure 3: The average attack success rates (%) of the adversarial examples crafted on source models against normally trained models (except the source model) and defense models for various µ.

Figure 4: The average attack success rates (%) of the adversarial examples crafted on source models against normally trained models (except the source model) and defense models for various epochs.

Figure 5: The average attack success rates (%) of the adversarial examples crafted on source models against normally trained models and defense models via NI-FGSM, ENI-FGSM, PNI-FGSM, and EPNI-FGSM.



Table 2: The attack success rates (%) of adversarial examples crafted on source models against normally trained target models. "*" indicates the model being white-box attacked. "Avg" means the average attack success rate.

Table 3: The attack success rates (%) of adversarial examples crafted on source models against defense models. "Avg" means the average attack success rate.

