THE ULTIMATE COMBO: BOOSTING ADVERSARIAL EXAMPLE TRANSFERABILITY BY COMPOSING DATA AUGMENTATIONS

Abstract

Transferring adversarial examples (AEs) from surrogate machine-learning (ML) models to evade target models is a common method for evaluating adversarial robustness in black-box settings. Researchers have invested substantial effort in enhancing transferability. Chiefly, attacks leveraging data augmentation have been found to help AEs generalize better from surrogates to targets. Still, prior work has explored a limited set of augmentation techniques and their compositions. To fill the gap, we conducted a systematic study of how data augmentation affects transferability. In particular, we explored ten augmentation techniques from six categories, originally proposed to help ML models generalize to unseen benign samples, and assessed how they influence transferability, both when applied individually and when composed. Our extensive experiments with the ImageNet and CIFAR-10 datasets showed that simple color-space augmentations (e.g., color to greyscale) outperform the state of the art when combined with standard augmentations, such as translation and scaling. Additionally, except for two methods that may harm transferability, we found that composing augmentation methods impacts transferability monotonically (i.e., composing more methods yields equal or higher transferability); the best composition we found significantly outperformed the state of the art (e.g., 95.6% vs. 92.0% average transferability on ImageNet from normally trained surrogates to other normally trained models). We provide intuitive, empirically supported explanations for why certain augmentations fail to improve transferability.

1. INTRODUCTION

Adversarial examples (AEs)-variants of benign inputs minimally perturbed to induce misclassification at test time-have emerged as a profound challenge to machine learning (ML) (Biggio et al., 2013; Szegedy et al., 2014), calling its use in security- and safety-critical systems into question (e.g., Eykholt et al. (2018)). Many attacks have been proposed to generate AEs in white-box settings, where adversaries are familiar with all the particularities of the attacked model (Papernot et al., 2016). By contrast, black-box attacks enable evaluating the vulnerability of ML in realistic settings, without access to the model (Papernot et al., 2016). Attacks exploiting the transferability property of AEs (Szegedy et al., 2014) have received special attention. Namely, as AEs produced against one model are often misclassified by others, transferability-based attacks produce AEs against surrogate (a.k.a. substitute) white-box models to mislead black-box ones. To measure the risk of AEs in black-box settings accurately, researchers have proposed varied methods to enhance transferability (e.g., Lin et al. (2020); Liu et al. (2017)). Notably, attacks using data augmentation, such as translations (Dong et al., 2019) and scaling of pixel values (Lin et al., 2020), as a means to improve the generalizability of AEs across models have accomplished state-of-the-art transferability rates. Still, previous transferability-based attacks have studied only four augmentation methods (see Section 3.1), out of many proposed in the data-augmentation literature (Shorten & Khoshgoftaar, 2019), primarily for reducing model overfitting. Hence, the extent to which different data-augmentation types boost transferability, either individually or when combined, remains largely unknown. To fill the gap, we conducted a systematic study of how augmentation methods influence transferability.
Specifically, alongside techniques considered in previous work, we studied how ten augmentation techniques pertaining to six categories impact transferability when applied individually or composed (Section 3). Integrating augmentation methods into attacks via a flexible framework we propose (Algorithm 1), we conducted extensive experiments using an ImageNet-compatible dataset, CIFAR-10 (Krizhevsky, 2009), and 16 models, and measured transferability in diverse settings, including with and without defenses (Sections 4 and 5). Our results offer several interesting insights:
• Simple color-space augmentations outperform state-of-the-art transferability-based attacks when composed with standard augmentations (Section 5.1).
• Transferability has a mostly monotonic relationship with data-augmentation techniques. Except for two augmentation methods that may harm transferability, composing additional augmentation methods either improves or preserves transferability (Section 5.2).
• Out of the 2^7 compositions explored, the best composition we found, ULTIMATECOMBO, outperforms state-of-the-art attacks by a large margin (Section 5.3).
• We offer empirical support for conjectures we raise concerning when data-augmentation techniques may be counterproductive to transferability (Section 5.4).

2. BACKGROUND AND RELATED WORK

Evasion Attacks Many evasion attacks assume adversaries have white-box access to models-i.e., adversaries know models' architectures and weights (e.g., Goodfellow et al. (2015); Szegedy et al. (2014); Carlini & Wagner (2017)). These typically leverage first- or second-order optimization to generate AEs models would misclassify. For example, given an input x of class y, model weights θ, and a loss function J, the Fast Gradient Sign Method (FGSM) of Goodfellow et al. (2015) crafts an AE x' using the loss gradients ∇_x J(x, y, θ):

x' = x + ε · sign(∇_x J(x, y, θ))

where sign(·) maps real numbers to -1, 0, or 1, depending on their sign. Following FGSM, researchers proposed numerous advanced attacks. Notably, the iterative FGSM (I-FGSM) of Kurakin et al. (2017b) performs multiple gradient-ascent steps, updating x iteratively to evade models:

x_{t+1} = Proj_x(x_t + α · sign(∇_x J(x_t, y, θ)))

where Proj_x(·) projects the perturbation into the ℓ∞-norm ε-ball centered at x, α is the step size, and x_0 = x. The attacks we study in this work are based on I-FGSM. In practice, adversaries often lack white-box access to victim models. Hence, researchers studied black-box attacks in which adversaries may only query models. Certain attack types, such as score- and boundary-based attacks, perform multiple queries, often several thousand, to produce AEs (e.g., Brendel et al. (2018); Ilyas et al. (2019)). By contrast, attacks leveraging transferability (e.g., Goodfellow et al. (2015); Szegedy et al. (2014)) avoid querying victim models, and use surrogate white-box models to create AEs that are likely misclassified by other black-box ones. Attempts to explain the transferability phenomenon attribute it to the gradient norm of the target model (i.e., its susceptibility to attacks), the smoothness of classification boundaries, and, primarily, the alignment of gradient directions between the surrogate and target models (Demontis et al., 2019; Yang et al., 2021).
Said differently, for AEs to transfer, the gradient directions of surrogates need to be similar to those of target models (i.e., attain high cosine similarity). Enhancing transferability is an active research area. Some methods integrate momentum into attacks such as I-FGSM to avoid surrogate-specific optima and saddle points that may hinder transferability (e.g., Dong et al. (2018); Wang & He (2021)). Others employ specialized losses, such as reducing the variance of intermediate activations (Huang et al., 2019) or the mean loss of model ensembles (Liu et al., 2017), to enhance transferability. Lastly, a prominent family of attacks leverages data augmentation to enhance AEs' generalizability between models. For instance, Dong et al. (2019) boosted transferability by integrating random translations into I-FGSM. Evasion attacks incorporating data augmentation attain state-of-the-art transferability rates (Lin et al., 2020; Wang et al., 2021a).

Algorithm 1: MI-FGSM with data augmentation D(·)
1: α = ε/T
2: x_0 = x  # Initialize adversarial example
3: g_0 = 0  # Initialize momentum
4: for t = 0 to T-1 do
5:   ḡ_{t+1} = (1/m) Σ_{i=0}^{m-1} ∇_x J(D(x_t)_i, y, θ)  # Expected gradient over augmentations
6:   g_{t+1} = μ · g_t + ḡ_{t+1} / ||ḡ_{t+1}||_1  # Gradient with momentum
7:   x_{t+1} = Proj_x(x_t + α · sign(g_{t+1}))  # Update adversarial example
8: return x' = x_T

Nonetheless, prior work has only considered a restricted set of augmentation methods for boosting transferability. By contrast, we aim to investigate the role of data augmentation in enhancing transferability more systematically, by exploring how a more comprehensive set of augmentation types and their compositions affect transferability.

Defenses Various defenses have been proposed to mitigate evasion attacks. Adversarial training-a procedure integrating correctly labeled AEs in training-is one of the most practical and effective methods for enhancing adversarial robustness (e.g., Goodfellow et al. (2015); Tramèr et al. (2018)). Other defense methods sanitize inputs prior to classification (e.g., Guo et al.
(2018)); attempt to detect attacks (see Tramer (2022)); or seek to certify robustness in ε-balls around inputs (e.g., Cohen et al. (2019); Salman et al. (2019)). Following standard practices in the literature (Wang et al., 2021a), we evaluate transferability-based attacks against a representative set of these defenses.
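To make the I-FGSM update in Section 2 concrete, below is a minimal NumPy sketch of the iterative attack. Note that `grad_fn` is a hypothetical callable standing in for the surrogate's loss gradient ∇_x J(·, y, θ); the sketch illustrates the update rule only, not any particular implementation from the literature.

```python
import numpy as np

def i_fgsm(x, grad_fn, eps=16 / 255, steps=10):
    """I-FGSM sketch: repeated gradient-ascent steps on the loss J,
    with the perturbation projected back into the L_inf eps-ball
    around x (the Proj_x operator) and pixel values kept in [0, 1]."""
    alpha = eps / steps                                    # step size alpha = eps / T
    x_adv = x.copy()                                       # x_0 = x
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))    # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)           # project into eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                   # keep a valid image
    return x_adv
```

With a constant positive gradient, the attack walks to the boundary of the ε-ball, illustrating that the final perturbation never exceeds ε in any pixel.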

3. DATA AUGMENTATION FOR ENHANCING TRANSFERABILITY

Data augmentation is traditionally used in training, to reduce overfitting and improve generalizability (Shorten & Khoshgoftaar, 2019) . Inspired by this use, transferability-based attacks adopted data augmentation to limit overfitting to surrogate models and produce AEs likely to generalize and be misclassified by victim models. Algorithm 1 depicts a general framework for integrating data augmentation into I-FGSM with momentum (MI-FGSM). In the framework, a method D(•) augments the attack with m variants of the estimated AE at each iteration. Consequently, the adversarial perturbation found by the attack increases the expected loss over transformed counterparts of the benign sample x (i.e., the distribution set by D(•) given x). Note that D(•)'s output may include x. The framework in Algorithm 1 is flexible, and can admit any data-augmentation method. We use it to describe previous attacks employing data augmentation and to systematically explore new ones. Next, we detail previous attacks, describe data augmentation methods we adopt for the first time to enhance transferability, and explain how these can be combined for best performance.
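The framework of Algorithm 1 can be sketched in a few lines of NumPy. Here `grad_fn` and `D` are hypothetical callables standing in for ∇_x J(·, y, θ) and the augmentation method D(·), respectively; this is an illustrative sketch of the framework, not the authors' implementation.

```python
import numpy as np

def augmented_mi_fgsm(x, y, grad_fn, D, eps=16 / 255, steps=10, mu=1.0):
    """Algorithm 1 sketch: MI-FGSM whose gradient at each iteration is
    averaged over the m augmented copies produced by D(x_t)."""
    alpha = eps / steps                       # line 1: alpha = eps / T
    x_adv = x.copy()                          # line 2: initialize AE
    g = np.zeros_like(x)                      # line 3: initialize momentum
    for _ in range(steps):                    # line 4
        variants = D(x_adv)                   # m augmented copies of x_t
        g_bar = np.mean([grad_fn(v, y) for v in variants], axis=0)   # line 5
        g = mu * g + g_bar / (np.abs(g_bar).sum() + 1e-12)           # line 6
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)  # line 7
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv                              # line 8: x' = x_T
```

Setting `D = lambda x: [x]` recovers plain MI-FGSM, matching the note that D(·)'s output may include x itself.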

3.1. PREVIOUS ATTACKS LEVERAGING DATA AUGMENTATION

Previous work explored the following augmentation methods to set D(·).

Translations Using random translations of inputs, Dong et al. (2019) crafted translation-invariant adversarial perturbations that transfer with higher success between models.

Scaling Pixels Lin et al. (2020) showed that adversarial perturbations invariant to scaling pixel values transfer with higher success between deep neural networks (DNNs). In their case, D(·) produces m samples such that D(x)_i = x / 2^i for i ∈ {0, 1, ..., m-1}, where m=5 by default.

Admix Wang et al. (2021a) assumed that the adversary has a gallery of images from different classes and adopted augmentations similar to MixUp (Zhang et al., 2018a). For each sample x' from the gallery, Admix augments attacks with m (typically set to 5) samples, such that D(x, x')_i = (1/2^i) · (x_t + η · x'), where i ∈ {0, 1, ..., m-1}, and η ∈ [0, 1] is set to 0.2 by default. Notably, Admix degenerates to pixel scaling when η = 0.

The leading transferability-based attacks compose (1) diverse inputs, scaling, and translations (Lin et al.'s (2020) DST-MI-FGSM attack, and Wang & He's (2021) DST-VMI-FGSM attack that also tunes the gradients' variance); or (2) Admix, diverse inputs, and translations (Wang et al.'s (2021a) Admix-DT-MI-FGSM attack). We describe how the compositions operate in Section 3.3.
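The two D(·) definitions above (pixel scaling and Admix) can be sketched as follows, treating images as NumPy arrays; this is a minimal illustration of the formulas, not the original implementations.

```python
import numpy as np

def scale_pixels(x, m=5):
    """Scale-invariance augmentation of Lin et al.: D(x)_i = x / 2^i."""
    return [x / (2 ** i) for i in range(m)]

def admix(x, x_prime, m=5, eta=0.2):
    """Admix augmentation: D(x, x')_i = (1/2^i) * (x + eta * x'),
    where x' is a gallery image from another class."""
    return [(x + eta * x_prime) / (2 ** i) for i in range(m)]
```

Setting `eta=0` makes `admix` return exactly the pixel-scaling variants, matching the observation that Admix degenerates to pixel scaling when η = 0.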

3.2. NEW AUGMENTATIONS FOR ENHANCING TRANSFERABILITY

While prior work studied the effect of spatial transformations (i.e., translations and diverse inputs), pixel scaling, and mixing on transferability, a substantially wider range of data-augmentation methods exists. Yet, the impact of these on transferability remains unknown. To fill the gap, we examined Shorten & Khoshgoftaar's (2019) survey on data augmentation for reducing overfitting in deep learning and identified ten representative methods of six categories that may boost transferability. We present them in what follows, one category at a time.

Color-space Transformations Potentially the simplest of all augmentation types are those applied in color-space. Given images represented as three-channel tensors, methods in this category manipulate pixel values based only on information encoded in the tensors. We evaluate four color-space transformations. First, we consider color jitter (CJ), which applies random color manipulation (Wu et al., 2015). Specifically, we consider random adjustments of pixel values within a pre-defined range in terms of hue, contrast, saturation, and brightness around original values. Second, we evaluate fancy principal component analysis (fPCA). Used in AlexNet (Krizhevsky et al., 2017), fPCA adds noise to the image proportionally to the variance in each channel. Given an RGB image, fPCA adds the following quantity to each image pixel: [p_1, p_2, p_3][α_1 λ_1, α_2 λ_2, α_3 λ_3]^T, where p_i and λ_i are the i-th eigenvector and eigenvalue of the 3×3 covariance matrix of RGB pixels, respectively, and α_i is sampled once per image from the Gaussian distribution N(0, 0.1). Third, we test channel shuffle (CS). Included in ShuffleNet training (Zhang et al., 2018b), CS simply swaps the order of the image's RGB channels at random. Last, but not least, we consider greyscale (GS) augmentations. This simple augmentation converts images into greyscale (replicating the result three times to obtain an RGB representation).
Mathematically, the conversion is calculated by ω_R · x_R + ω_G · x_G + ω_B · x_B, where x_R, x_G, and x_B correspond to the RGB channels, respectively, and ω_R, ω_G, and ω_B, all ∈ [0, 1], denote the channel weights, and sum up to 1.

Random Erasing Inspired by dropout regularization, random erasing (RE) helps ML models focus on descriptive features of images and promotes robustness to occlusions (Zhong et al., 2020). To do so, randomly selected rectangular regions in images are replaced by masks composed of random pixel values. Similarly to RE, CutOut masks out regions of inputs to improve DNNs' accuracy (DeVries & Taylor, 2017). The main difference from RE is that CutOut uses fixed masking values, and may perform less aggressive masking when selected regions lie outside the image.

Kernel Filters Convolving images with kernels of different types can produce certain effects, such as blurring (via Gaussian kernels), sharpening (via edge filters), or edge enhancement. We study the effect of sharpening (Sharp) on transferability with edge-enhancement filters.

Mixing Images As a form of vicinal risk minimization, some augmentation methods mix images together, creating virtual examples for training. MixUp, the cornerstone behind Admix, computes weighted sums of images (Zhang et al., 2018a). By contrast, we consider CutMix, which replaces a region within one image with a region from another image picked from a gallery (Yun et al., 2019).

Neural Transfer Augmentations using neural transfer (NeuTrans) preserve image semantics while changing their style. We use Gatys et al.'s (2015) generative model to transfer image styles to that of Picasso's 1907 self-portrait.

Meta-learning-inspired Augmentations Meta-learning is a subfield of ML studying how ML algorithms can optimize other learning algorithms (Hospedales et al., 2021).
In the context of data augmentation, algorithms such as AutoAugment have been proposed to train controllers to select an appropriate augmentation method to avoid overfitting (Cubuk et al., 2019) . We use the pre-trained AutoAugment controller, encoded as a recurrent neural network, to select augmentation methods and their magnitude from a set of 13 augmentation methods.
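As an illustration, two of the color-space augmentations above (GS and CS) can be sketched as follows for an image of shape (3, H, W); the GS weights shown are the commonly used values adopted later in Appendix C.

```python
import numpy as np

def greyscale(x, w=(0.299, 0.587, 0.114)):
    """GS: weighted sum of the RGB channels (weights in [0, 1], summing
    to 1), replicated three times to keep an RGB representation."""
    g = w[0] * x[0] + w[1] * x[1] + w[2] * x[2]
    return np.stack([g, g, g])

def channel_shuffle(x, rng=np.random.default_rng()):
    """CS: randomly permute the order of the image's RGB channels."""
    return x[rng.permutation(3)]
```

Both operate purely on the information encoded in the tensor, with no spatial change, which is what makes this category so cheap to compose with other augmentations.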

3.3. COMPOSING AUGMENTATIONS

There are two ways to compose data-augmentation methods in attacks, namely: parallel and serial composition. Figure 1 in Appendix A illustrates both. In parallel composition, each augmentation method is applied independently on the input, and their outputs are aggregated by taking their union to augment attacks (i.e., as D(·)'s output). By contrast, serial composition applies augmentation methods sequentially, one after the other, where the first method operates on the original sample, and each subsequent augmentation function operates on its predecessor's outputs. Consequently, serial composition leads to an exponential growth in the number of samples, while parallel composition leads to a linear growth. DST-MI-FGSM and Admix-DT-MI-FGSM use serial composition. By contrast, we consider a substantially larger number of augmentation methods, which may lead to prohibitive memory and time requirements in the case of serial composition. Additionally, because the order of applying certain augmentations matters (e.g., GS followed by CutMix leads to a different outcome than CutMix followed by GS), exploring a meaningful number of serial compositions (out of an order of 10! possibilities) becomes virtually impossible. Accordingly, we mainly consider parallel composition between data-augmentation methods. We only serially compose translations, scaling, and diverse inputs, for consistency with prior work (e.g., Wang et al. (2021a)). We tested a few serial compositions between new augmentation methods we consider and found they were significantly outperformed by their parallel counterparts. While non-exhaustive, this hints that serially composing augmentations may not be a promising direction for enhancing transferability.
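The parallel composition described above can be sketched as a higher-order function that pools the outputs of independent augmenters into a single D(·). The toy augmenters below are placeholders for real augmentation methods; the point is the linear growth in the number of samples.

```python
def parallel_compose(augmentations):
    """Parallel composition: apply each augmentation method to the input
    independently and pool (union) the outputs as D(.)'s output.
    The output size grows linearly: 1 + sum of each method's m."""
    def D(x):
        out = [x]                       # D(x) may include the original sample
        for aug in augmentations:
            out.extend(aug(x))
        return out
    return D

# Toy stand-ins for two augmentation methods, each emitting one variant:
halve = lambda x: [x / 2]
negate = lambda x: [-x]
D = parallel_compose([halve, negate])
```

By contrast, a serial composition would feed each method's outputs into the next, multiplying rather than adding the sample counts, which is the exponential growth the text warns about.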

4. EXPERIMENTAL SETUP

Now we turn to the setup of our experiments, including the data, models, and attack configurations.

Data We used an ImageNet-compatible dataset and CIFAR-10 for evaluation, per common practice (e.g., (Dong et al., 2019; Yang et al., 2021)). The former contains 1,000 images, originally collected for the NeurIPS 2017 adversarial ML competition. For the latter, we sampled 1,000 images, roughly balanced between classes, from the test set.

Models We used 16 DNNs to transfer attacks from (as surrogates) and to (as targets)-six for CIFAR-10 and ten for ImageNet. All CIFAR-10 models and six of the ImageNet models were normally trained, while the other four ImageNet models were adversarially trained. To facilitate comparison with prior work, we included models that are widely used for assessing transferability (e.g., (Wang et al., 2021a; Yang et al., 2021)). Furthermore, to ensure that our findings are general, we included models covering varied architectures, including Inception, ResNet, VGG, DenseNet, and MobileNet. Appendix B provides more details about the models.

Attack Parameters We tested standard attack configurations, in line with prior work (Wang et al., 2021a; Yang et al., 2021). Namely, we evaluated untargeted MI-FGSM-based attacks, bounded in ℓ∞-norm. We validated findings with varied perturbation norms. For ImageNet, unless stated otherwise, we tested ε = 16/255, but also experimented with ε ∈ {8/255, 24/255}. For CIFAR-10, we experimented with ε ∈ {0.02, 0.04}. We quantified attack success via transferability rates-the percentage of attempts in which AEs created against surrogates were misclassified by victims. As baselines, we used three state-of-the-art transferability-based attacks: DST-MI-FGSM, DST-VMI-FGSM, and Admix-DT-MI-FGSM (see Section 3.1). Appendix C reports the parameters used in attacks and augmentation methods. Appendix D discusses attacks we considered but excluded from experiments.

5. EXPERIMENTAL RESULTS

This section summarizes our findings. We start by evaluating individual augmentation methods and standard combinations with scaling, diverse inputs, and translations (Section 5.1). We then turn to analyzing all possible compositions between different augmentation types to assess whether transferability typically improves when considering additional augmentations (Section 5.2). Our analysis helped us identify the best performing composition for boosting transferability, denoted by ULTIMATECOMBO, outperforming state-of-the-art attacks. Section 5.3 reports rigorous comparisons between ULTIMATECOMBO and the baselines, including against defended models. Finally, we help develop intuition for when augmentations may or may not help improve transferability (Section 5.4).

5.1. COLOR-SPACE AUGMENTATIONS SIGNIFICANTLY ADVANCE THE STATE OF THE ART

Initially, we evaluated transferability when integrating a single augmentation at a time in attacks, or when composing individual augmentations with diverse inputs, scaling, and translation (DST), as is standard (Lin et al., 2020; Wang et al., 2021a). We found that considering each of the ten augmentations individually does not lead to competitive performance with the baselines (Table 9 in Appendix E). However, composing individual augmentations with DST enhanced transferability markedly (Table 10 in Appendix E). Surprisingly, augmentations in color-space fared particularly well, outperforming the baselines and advanced augmentation methods (e.g., AutoAugment) in most cases. Composing GS with DST (the GS-DST-MI-FGSM attack) performed best in this setting. Table 1 reports the transferability rates from four normally trained models to other models on ImageNet (see Table 11 in Appendix F for more details). It can be immediately seen that GS-DST-MI-FGSM attains higher transferability than the baselines (93.6% vs. ≤92.0%, on avg.). This also held when considering different perturbation norms on ImageNet, where GS-DST-MI-FGSM outperformed the baselines, sometimes by a larger margin (e.g., 75.9% vs. ≤70.8% on avg. with ε = 8/255; see Table 13 in Appendix F). GS-DST-MI-FGSM also outperformed the baselines on CIFAR-10, when transferring AEs to normally trained DNNs of different architectures, with perturbation norms ε=0.02 (74.9% vs. ≤71.5% avg. transferability rate) and ε=0.04 (92.1% vs. ≤89.6% avg. transferability rate). Tables 14 and 15 in Appendix G show the detailed CIFAR-10 results. We also evaluated transferability to adversarially trained models (Table 12 shows complete results), as well as attacks using an ensemble of DNNs (Table 3) to boost transferability further (Liu et al., 2017), finding that GS-DST-MI-FGSM attained better transferability than the baselines. Overall, according to a paired t-test, the differences between GS-DST-MI-FGSM and the baselines across different surrogate and target models were statistically significant (p < 0.01).
Finally, we evaluated attack run-times, finding that, despite investing no effort to improve its efficiency, GS-DST-MI-FGSM is at least 1.14× more time-efficient than Admix-DT-MI-FGSM and DST-VMI-FGSM, on avg. (Table 16 in Appendix I). Still, we note that, since transferability-based attacks generate AEs offline, and only once per surrogate model, as long as an attack is not prohibitively slow, attack run-time is a marginal consideration for selecting an attack compared to transferability rates.

5.2. THE MONOTONICITY OF TRANSFERABILITY WHEN ADDING AUGMENTATIONS

We wanted to evaluate whether transferability is monotonic in the number of augmentation types considered-i.e., whether composing more techniques increases, or at least does not harm, transferability. To this end, we selected the best performing augmentation method of each of the six categories presented in Section 3.2 as well as DST-MI-FGSM, and evaluated all 2^7 (=128) possible compositions (per Section 3.3). More precisely, we tested every possible combination of GS, CutOut, Sharp, NeuTrans, AutoAugment, Admix, and DST-MI-FGSM. Given a composition, we produced AEs against the Inc-v3 ImageNet DNN as surrogate, and computed the expected transferability rate against all other nine ImageNet DNNs, both normally and adversarially trained. Then, for every pair of attacks differing only in whether a single augmentation method was incorporated in the composition, we tested whether adding the augmentation method improved transferability. The results reflected a mostly monotonic relationship between transferability and augmentations. Except for NeuTrans and Sharp, which sometimes harmed transferability when considered within a composition, adding an augmentation method increased or preserved transferability. Figure 2 in Appendix H summarizes the results. Notably, comparing all compositions enabled us to find that a composition of all seven augmentation methods except for NeuTrans attained the best transferability. We call this composition the ULTIMATECOMBO.
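The enumeration underlying this experiment can be sketched as follows: all 2^7 subsets of the seven methods, plus the pairs of compositions differing by a single added method, which are the comparisons the monotonicity test relies on. This is an illustrative sketch of the experimental design, not the evaluation code itself.

```python
from itertools import combinations

METHODS = ["GS", "CutOut", "Sharp", "NeuTrans", "AutoAugment", "Admix", "DST"]

def all_compositions(methods=METHODS):
    """Enumerate all 2^7 = 128 subsets of the seven augmentation methods."""
    subsets = []
    for r in range(len(methods) + 1):
        subsets.extend(frozenset(c) for c in combinations(methods, r))
    return subsets

def monotone_pairs(subsets):
    """Pairs (S, S + {m}) of compositions differing in exactly one added
    method; each pair is one monotonicity comparison of transferability."""
    pool = set(subsets)
    return [(s, s | {m}) for s in subsets for m in METHODS
            if m not in s and (s | {m}) in pool]
```

Each of the 128 subsets of size k participates in 7−k such comparisons as the smaller composition, giving 7·2^6 = 448 comparisons in total.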

5.3. THE MOST EFFECTIVE COMBINATION

We evaluated ULTIMATECOMBO extensively, testing transferability to normally and adversarially trained DNNs. As shown in Table 1, ULTIMATECOMBO obtained higher transferability to normally trained models than the baselines (95.6% vs. ≤92.0% avg. transferability) and GS-DST-MI-FGSM, when normally trained models were used as surrogates. This holds across different values of ε (Table 13), and on the CIFAR-10 dataset with different architectures (Tables 14 and 15). Furthermore, ULTIMATECOMBO also achieved the best performance when transferring attacks from normally trained to adversarially trained DNNs (Table 2; 86.0% vs. ≤82.7% avg. transferability). Transferring AEs crafted by ULTIMATECOMBO using an ensemble of models increased transferability further (Table 3; 93.4% avg. transferability). Per a paired t-test, the differences between ULTIMATECOMBO and the baselines over all pairs of surrogates and targets considered are statistically significant (p < 0.01). Besides adversarially trained models, we evaluated ULTIMATECOMBO's transferability against five defenses. Two defenses, bit reduction (Bit-Red) (Xu et al., 2018) and neural representation purification (NRP) (Naseer et al., 2020), transform inputs to sanitize adversarial perturbations. The others include randomized smoothing (RS) (Cohen et al., 2019). Lastly, due to composing more augmentations, ULTIMATECOMBO is slower than DST-MI-FGSM, GS-DST-MI-FGSM, and Admix-DT-MI-FGSM. However, it is 2.44× faster than DST-VMI-FGSM at producing AEs (Table 16).

5.4. WHEN DO AUGMENTATIONS FAIL TO IMPROVE TRANSFERABILITY?

While augmentation methods mostly increased transferability, in some cases they were counterproductive. Particularly, NeuTrans and Sharp decreased transferability when composed with certain methods. We conducted simple experiments as a preliminary assessment of two conjectures we had concerning when augmentations may harm transferability. First, we expected augmentation methods that harm model accuracy on benign samples to be less conducive to transferability. As DNNs do not generalize well to benign samples produced by these augmentation methods, we anticipated that adversarial perturbations relying on the augmented samples would also have limited generalizability across models. To support the conjecture, we tested the normally trained DNNs' accuracy on benign samples transformed by each augmentation method. As can be seen from Table 6, NeuTrans and Sharp, which often decrease transferability (Section 5.2 and Figure 2), harmed the DNN accuracy the most (6.5%-58.7% lower accuracy than other methods), supporting our conjecture. Prior work demonstrated that gradient alignment between surrogates and targets is needed for transferability (Demontis et al., 2019). Thus, we expected augmentation methods that estimate target model gradients more accurately to increase transferability further. To assess this conjecture, we evaluated the cosine similarity between the gradients of the Inc-v3 model while using augmentations composed with DST applied to benign samples, and the gradients of other normally trained models on (untransformed) benign samples. The results (Table 7) show some support for the conjecture-NeuTrans and Sharp led to lower cosine similarities with target models' gradients. Yet, the differences in cosine similarities between augmentation methods were small (≤0.031, on avg.).
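The gradient-alignment measure used in this analysis (cosine similarity between surrogate and target gradients) can be sketched as follows; the gradient arrays would come from the respective models' backward passes.

```python
import numpy as np

def gradient_alignment(g_surrogate, g_target):
    """Cosine similarity between flattened surrogate and target loss
    gradients; values near 1 indicate well-aligned gradient directions,
    the condition prior work links to transferability."""
    a, b = g_surrogate.ravel(), g_target.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

Identical gradient directions yield a similarity of 1, orthogonal ones yield 0, which is why small differences in this measure (as in Table 7) offer only weak evidence.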

6. CONCLUSION AND FUTURE WORK

Our study uncovered a mostly monotonic relationship between data-augmentation methods and transferability, and helped us identify a simple yet effective composition of data-augmentation methods, ULTIMATECOMBO, that outperforms previously proposed methods when integrated into attacks. The resulting attack should be considered a standard baseline in follow-up work on transferability. Our work also puts forward conjectures for when augmentation techniques are expected to improve transferability, and offers some empirical support. In the future, it would be informative to develop a theory that formally explains why augmentation methods help increase transferability. Furthermore, instead of relying on existing augmentation methods originally developed to improve DNN generalizability, an intriguing research direction would be to develop augmentation techniques tailored specifically for improving transferability. Lastly, in addition to assessing the vulnerability of ML models in black-box settings, it would be interesting to evaluate whether the ULTIMATECOMBO-based attack advances methods leveraging AEs for defensive purposes, by deceiving adversaries (e.g., to attain privacy (Cherepanova et al., 2021; Shetty et al., 2018)).

C PARAMETERS OF ATTACKS AND AUGMENTATION METHODS

[0.3, 3.3]. For Sharp, we used the following edge-enhancement mask:

[-0.5 -0.5 -0.5]
[-0.5  5.0 -0.5]
[-0.5 -0.5 -0.5]

For diverse inputs, images were transformed with probability 0.5. For the Admix operation, consistently with Wang et al. (2021a), we randomly sampled three images from other categories for mixing as part of the Admix-DT-MI-FGSM attack. However, in the interest of computational efficiency, we use only one image for mixing when composing Admix with other augmentation methods. We did not find that mixing with fewer images harmed performance. In fact, it even improved transferability in some cases.
Finally, in CutMix, we picked the top-left coordinate (r_x, r_y), the width, r_w, and height, r_h, of the region to be replaced, using the formulas: r_x ∼ U(0, W), r_w = W√(1-λ), r_y ∼ U(0, H), r_h = H√(1-λ), where U is the uniform distribution, W is the image width, H is the image height, and λ is a parameter set to 0.5. In an attempt to enhance transferability further, we optimized the parameters of a few augmentation methods we considered via grid search. Except for the Gaussian kernel's size used in translation-invariant attacks (Dong et al., 2019), we found that the selected parameters had little impact on transferability. Specifically, for translations, after considering Gaussian kernels of sizes ∈ {5×5, 7×7, 9×9}, we set the default to 7×7, except for Admix-DT-MI-FGSM, for which the 9×9 kernel performed best. Table 8 shows that our choice of Admix parameters (m=1 and a Gaussian kernel size of 9×9) improves its performance. For GS, we found ω_R, ω_G, and ω_B had little impact on transferability, as long as the weight assigned to each channel was >0.1. Accordingly, we set ω_R, ω_G, and ω_B to 0.299, 0.587, and 0.114, respectively, per commonly used values (e.g., in the PyTorch Python package).

Finally, for CS, we only swapped the blue and green channels, as this led to a minor improvement compared to swapping all three channels.

Table 8: Transferability rates (%) of AEs crafted via Admix-DT-MI-FGSM against an Inc-v3 surrogate. Our variant sets m=1 and the translation's Gaussian kernel to 7×7 and includes original images when calculating gradients, whereas the original work uses m=3 and a 9×9 kernel.

Finally, we clarify that each of our attack combinations emits the original image once, alongside the transformed images. Moreover, when aggregating the gradients, the gradients of the original and transformed images are assigned equal weights.
We tested whether weighting the gradients differently (e.g., assigning higher or lower weight to the original sample) can help improve transferability using the GS method. However, we found that equal weights attained the best results.
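The CutMix region sampling defined above can be sketched as follows; it is a direct transcription of the formulas, with the random generator made explicit.

```python
import numpy as np

def cutmix_region(W, H, lam=0.5, rng=np.random.default_rng()):
    """Sample the CutMix patch: top-left corner (r_x, r_y) uniform over
    the image; width and height fixed by lambda so the patch area is a
    (1 - lambda) fraction of the image area."""
    rx = rng.uniform(0, W)          # r_x ~ U(0, W)
    ry = rng.uniform(0, H)          # r_y ~ U(0, H)
    rw = W * np.sqrt(1 - lam)       # r_w = W * sqrt(1 - lambda)
    rh = H * np.sqrt(1 - lam)       # r_h = H * sqrt(1 - lambda)
    return rx, ry, rw, rh
```

With λ = 0.5, the sampled patch covers exactly half the image area, regardless of where the corner lands.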

D ATTACKS CONSIDERED BUT EXCLUDED

Besides the three state-of-the-art baselines we experimented with, we considered including two other attacks in the evaluation. Wu et al.'s (2021) attack uses a neural network to create adversarial perturbations that are robust to transformations, thereby enhancing transferability, and achieves competitive transferability rates. Unfortunately, we were unable to find a publicly available implementation of the attack. Huang et al.'s (2019) intermediate-level attack improves AE transferability by reducing the variance of intermediate activations. We used the official implementation⁴ to test the attack on CIFAR-10 with ε=0.02 and the VGG or DenseNet models as surrogates. The results showed that the transferability rates were much less competitive than those of the three baselines we considered (50.34% vs. >54.00% average transferability with a VGG surrogate, and 45.68% vs. >56.56% average transferability with a DenseNet surrogate). Therefore, we excluded the intermediate-level attack from the remaining experiments.

E INDIVIDUAL AUGMENTATIONS

Table 9 presents the transferability rates when integrating individual augmentation methods into MI-FGSM. Table 10 presents the transferability rates when composing individual augmentation methods with DST. Transferability rates were computed on ImageNet, using the Inc-v3 DNN as a surrogate and the other normally trained DNNs as victims (ε = 16/255). Notice how composing color-space augmentations (specifically, CS, CJ, and GS) with DST helps improve transferability over the baselines (Table 10).

F TRANSFERABILITY RATES ON IMAGENET

Tables 11 and 12 detail the transferability rates on ImageNet, from all ten DNNs to normally and adversarially trained models, respectively. Here, we also consider transferring AEs from adversarially trained surrogates. Table 13 shows the transferability rates on ImageNet from Inc-v3 to other normally trained models with varied perturbation norms (i.e., values of ε).

G TRANSFERABILITY RATES ON CIFAR-10

Tables 14 and 15 report attack transferability rates from all six normally trained DNNs to all other victim DNNs for ε=0.02 and ε=0.04, respectively.

H THE MONOTONICITY OF TRANSFERABILITY WHEN ADDING AUGMENTATIONS

Figure 2 depicts a visual summary of the experiment presented in Section 5.2, demonstrating how the relationship between augmentation methods and transferability is mostly monotonic.
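The structure behind Figure 2 can be made precise with a small sketch (names are placeholders and the rate function stands in for the empirically measured average transferability): we enumerate all 2^7 compositions, connect pairs differing by exactly one augmentation, and call the relationship monotonic if no edge decreases transferability.

```python
from itertools import combinations

AUGS = ["Admix", "GS", "CutOut", "AutoAugment", "DST", "NeuTrans", "Sharp"]

def encode(subset):
    """Binary-string node label: bit i (MSB-first) marks whether AUGS[i] is used."""
    return "".join("1" if a in subset else "0" for a in AUGS)

def lattice_edges():
    """All (u, v) pairs where v contains exactly one more augmentation than u."""
    nodes = [frozenset(c) for k in range(len(AUGS) + 1)
             for c in combinations(AUGS, k)]
    return [(u, v) for u in nodes for v in nodes
            if u < v and len(v - u) == 1]

def is_monotonic(rate, edges):
    """rate: composition -> average transferability. Monotonic iff no edge
    (u, v) reduces the rate when moving from u to the larger composition v."""
    return all(rate(v) >= rate(u) for u, v in edges)
```

In Figure 2's terms, an edge (u, v) is black exactly when `rate(v) >= rate(u)`.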

I ATTACK RUN-TIME

MI-FGSM's time complexity is dominated by the gradient-computation steps. Accordingly, the attacks' run-times are directly affected by the number of samples the augmentation methods create (i.e., samples emitted by D(•) in Algorithm 1): the more samples an augmentation method emits, the more back-propagation passes are required to compute the gradients for updating the adversarial examples in each iteration, thus increasing the AE-generation time. The empirical measurements corroborate this intuition (Table 16). Overall, we can see that DST augments MI-FGSM with the fewest samples, leading to the fastest attack (DST-MI-FGSM). GS-DST-MI-FGSM is the second-fastest attack, while ULTIMATECOMBO is slower than Admix-DT-MI-FGSM but substantially faster than DST-VMI-FGSM. We note that no particular effort was invested in making GS-DST-MI-FGSM and ULTIMATECOMBO more time-efficient (e.g., stacking augmented samples for parallel computation, as done for Admix-DT-MI-FGSM). Moreover, since transferability-based attacks generate AEs offline, and only once per surrogate model, attack run-time is a marginal consideration when selecting an attack compared to transferability rates, as long as an attack is not prohibitively slow.
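To illustrate why run-time scales with the number of emitted samples, consider one MI-FGSM iteration with an augmentation pipeline D(•) (a NumPy stand-in; `grad_fn` and `augment` are placeholders for the surrogate's back-propagation and the composed augmentations, and the hyper-parameter defaults are illustrative):

```python
import numpy as np

def mi_fgsm_step(x_adv, grad_fn, augment, momentum, mu=1.0, alpha=1.0 / 255):
    """One MI-FGSM iteration. grad_fn(x) returns the loss gradient w.r.t. x on
    the surrogate (one back-propagation per call, hence cost grows linearly
    with the number of emitted samples). augment(x) emits the original image
    plus its transformed copies."""
    samples = augment(x_adv)
    # Equal weights for the original and transformed images, as in our attacks.
    g = np.mean([grad_fn(s) for s in samples], axis=0)
    # Momentum accumulation with L1-normalized gradients, then a signed step.
    momentum = mu * momentum + g / (np.abs(g).sum() + 1e-12)
    x_adv = x_adv + alpha * np.sign(momentum)
    return x_adv, momentum
```

The list comprehension makes the cost explicit: each sample emitted by D(•) triggers one `grad_fn` call, i.e., one back-propagation pass per iteration.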

J DEFENSE PARAMETERS

We used standard parameters when attacking defenses. For RS, we used a normally trained ResNet-50 and set σ to 0.25, following Cohen et al. (2019). For ARS, the target model was a ResNet-50 trained with isotropic Gaussian-noise augmentations (sampled from N(0, 0.25)), and σ was set to 0.25 during prediction, per Salman et al. (2019). In both cases, we used 10,000 noisified samples during inference. For Bit-Red, we used a squeezer with a bit-depth of one, in accordance with Xu et al. (2018). Finally, we used the default NRP parameters and pre-trained model from the official GitHub repository (Naseer et al., 2020).

Table 16: The number of augmented samples and the average time to craft an AE (seconds per image) for different attacks. Times were measured on ImageNet, while attacking an Inc-v3 surrogate, and averaged over 1,000 samples. Experiments were executed on an Nvidia A5000 GPU.

Figure 2: Best viewed after zooming in. Each node represents a composition of augmentation methods. The binary string within a node encodes the composition: each bit, from the most to the least significant, denotes whether Admix (most significant bit), GS, CutOut, AutoAugment, DST-MI-FGSM, NeuTrans, or Sharp (least significant bit), respectively, is included (1) or excluded (0) in the composition. An edge from node u to node v is included if v contains exactly one more augmentation method than u. As explained in Section 5.2, we computed the average transferability rate per composition on ImageNet, from an Inc-v3 surrogate DNN to the remaining nine models as victims. An edge (u, v) is colored black (resp. red) if v's composition achieves higher-or-equal (resp. lower) average transferability rates than u's when integrated into MI-FGSM. We faded nodes containing NeuTrans or Sharp, along with their corresponding edges. Notice how all the remaining (unfaded) edges are black, showing that the relationship between the (average) transferability and the remaining augmentations is monotonic (i.e., more augmentations composed → ≥transferability). Said differently, only NeuTrans and Sharp harm transferability in some cases.
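The (A)RS inference procedure described above reduces to a majority vote over Gaussian-noised copies of the input. A toy sketch (with `classify` standing in for the actual base model, and a far smaller default n than the 10,000 samples we used):

```python
import numpy as np

def smoothed_predict(classify, x, sigma=0.25, n=1000, rng=None):
    """Randomized-smoothing prediction (Cohen et al., 2019): classify n
    Gaussian-noised copies of x and return the majority label. `classify`
    maps a batch of inputs to integer labels."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=(n,) + x.shape)
    labels = classify(x[None, ...] + noise)
    return np.bincount(labels).argmax()  # majority vote
```

For ARS, the only change is that the base model behind `classify` was itself trained with the same Gaussian-noise augmentations.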



Footnote URLs:
https://bit.ly/3fq4pN6
https://github.com/ylhz/tf_to_pytorch_model
https://github.com/huyvnphan/PyTorch_CIFAR10
https://bit.ly/3ynCyUD
https://github.com/CUAI/Intermediate-Level-Attack



Figure 1: An illustration of serial and parallel compositions. When serially composing augmentations, each augmentation method operates on the output of the previous one. By contrast, in parallel composition, each augmentation method operates independently on the input (or set of inputs). The number of augmented samples grows multiplicatively in serial composition, whereas it grows linearly in parallel composition. We use serial composition when composing diverse inputs (DI), scaling (Sc.), and translations (Tr.). Other augmentation methods are composed in parallel.
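The two composition modes in Figure 1 can be sketched as follows (a minimal sketch; `augs` is a list of functions, each mapping one input to a list of transformed outputs):

```python
def compose_serial(x, augs):
    """Serial: each augmentation transforms every output of the previous one,
    so the number of emitted samples grows multiplicatively."""
    outputs = [x]
    for aug in augs:
        outputs = [y for o in outputs for y in aug(o)]
    return outputs

def compose_parallel(x, augs):
    """Parallel: each augmentation acts on the original input independently,
    so the number of emitted samples grows linearly."""
    return [y for aug in augs for y in aug(x)]
```

For example, with three augmentations that each emit two variants, serial composition yields 2^3 = 8 samples, whereas parallel composition yields 3 × 2 = 6.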


Transferability rates (%) on ImageNet, from normally trained surrogates (rows) to normally trained targets (columns). All attacks are black-box, except when the surrogate and target models are the same. MAXBASELINE is the best-performing of the three baselines.

Transferability rates (%) on ImageNet, from normally trained surrogates (rows) to adversarially trained targets (columns). MAXBASELINE is the best performing of the three baselines.



Randomized smoothing (RS) (Cohen et al., 2019) and randomized smoothing with adversarial training (ARS) (Salman et al., 2019) offer provable robustness guarantees. Finally, TRS leverages an ensemble of smooth DNNs trained to have misaligned gradients to defend against attacks (Yang et al., 2021). We evaluated all defenses except TRS on ImageNet. We used the defenses with their default parameters (see Appendix J) and transferred AEs crafted against an ensemble of normally trained models. The results are shown in Table 4. Similarly to other settings, here too ULTIMATECOMBO outperformed the baselines (66.8% vs. ≤63.9% avg. transferability). Following Yang et al. (2021), we tested TRS on CIFAR-10 with adversarial perturbation norms ε ∈ {0.02, 0.04}. ULTIMATECOMBO did best against this defense as well (Table 5).



Benign accuracy (%) after applying data augmentation methods. Rows are sorted in descending order of average transferability.



Transferability rates (%) on ImageNet from a normally trained Inc-v3 surrogate to normally trained target models (columns) when integrating individual augmentation methods into MI-FGSM-based attacks.

Transferability rates (%) on ImageNet from ten surrogates (rows) to normally trained target models (columns). All attacks are black-box, except when the surrogate and target models are the same.

Transferability rates (%) on ImageNet, from an Inc-v3 surrogate to other normally trained models, with perturbation norms ε ∈ {8/255, 24/255} other than the default ε = 16/255.

Transferability rates (%) on CIFAR-10, from normally trained surrogates (rows) to normally trained target models (columns), with a perturbation norm ε = 0.02.

Transferability rates (%) on CIFAR-10, from normally trained surrogates (rows) to normally trained target models (columns), with a perturbation norm ε = 0.04.

REPRODUCIBILITY STATEMENT

In the interest of reproducibility, we make our code publicly available at the following repository: https://tinyurl.com/UltimateComboICLR.

Xiaosen Wang, Xuanran He, Jingdong Wang, and Kun He. Admix: Enhancing the transferability of adversarial attacks. In Proc. ICCV, 2021a.

A PARALLEL AND SERIAL COMPOSITIONS OF AUGMENTATIONS

Figure 1 illustrates how parallel and serial compositions work.

B DNNS USED IN THE EXPERIMENTS

We tested transferability using ten ImageNet DNNs and six CIFAR-10 DNNs. Of the ten ImageNet models, six were normally trained, while the other four were adversarially trained. Specifically, for the normally trained models, we selected Inception-v3 (Inc-v3) (Szegedy et al., 2016); Inception-v4 (Inc-v4) and Inception-ResNet-v2 (IncRes-v2) (Szegedy et al., 2017); and ResNet-v2-50 (Res-50), ResNet-v2-101 (Res-101), and ResNet-v2-152 (Res-152) (He et al., 2016). For the adversarially trained models, we selected Inception-v3-adv (Inc-v3_adv) (Kurakin et al., 2017a); and ens3-Inception-v3 (Inc-v3_ens3), ens4-Inception-v3 (Inc-v3_ens4), and ens-adv-Inception-ResNet-v2 (IncRes-v2_ens) (Tramèr et al., 2018). We obtained the models' PyTorch implementations and weights from a public GitHub repository.² All six CIFAR-10 DNNs were normally trained. For this dataset, we used pretrained VGG-11 (VGG) (Simonyan & Zisserman, 2015), ResNet-50 (Res) (He et al., 2016), DenseNet-121 (DenseNet) (Huang et al., 2017), MobileNet-v2 (MobileNet) (Sandler et al., 2018), GoogleNet (Szegedy et al., 2015), and Inception-v3 (Inc) (Szegedy et al., 2016) DNNs, also implemented in PyTorch.³

