TESTING ROBUSTNESS AGAINST UNFORESEEN ADVERSARIES

Abstract

Most existing adversarial defenses only measure robustness to L_p adversarial attacks. Not only are adversaries unlikely to exclusively create small L_p perturbations, adversaries are unlikely to remain fixed. Adversaries adapt and evolve their attacks; hence adversarial defenses must be robust to a broad range of unforeseen attacks. We address this discrepancy between research and reality by proposing a new evaluation framework called ImageNet-UA. Our framework enables the research community to test ImageNet model robustness against attacks not encountered during training. To create ImageNet-UA's diverse attack suite, we introduce a total of four novel adversarial attacks. We also demonstrate that, in comparison to ImageNet-UA, prevailing L∞ robustness assessments give a narrow account of adversarial robustness. By evaluating current defenses with ImageNet-UA, we find they provide little robustness to unforeseen attacks. We hope the greater variety and realism of ImageNet-UA enables the development of more robust defenses which can generalize beyond attacks seen during training.

Adversarial robustness is notoriously difficult to evaluate correctly (Papernot et al., 2017; Athalye et al., 2018a). To that end, Carlini et al. (2019a) provide extensive guidance for sound adversarial robustness evaluation. By measuring attack success rates across several distortion sizes and using a broader threat model with diverse differentiable attacks, ImageNet-UA has several of their recommendations built in, while greatly expanding the set of attacks over previous work on evaluation.

Under review as a conference paper at ICLR 2021

Our attacks create an adversarial image x′ from a clean image x with true label y. Let model f map images to a softmax distribution, and let ℓ(f(x), y) be the cross-entropy loss.
Given a target class y′ ≠ y, our attacks attempt to find a valid image x′ such that (1) the attacked image x′ is obtained by applying a distortion (of size controlled by a parameter ε) to x, and (2) the loss ℓ(f(x′), y′) is minimized. An unforeseen adversarial attack is a white- or black-box adversarial attack unknown to the defense designer which does not change the true label of x according to an oracle or human.

JPEG. JPEG applies perturbations in a JPEG-encoded space of compressed images rather than raw pixel space. More precisely, JPEG compression is a linear transform JPEG which applies colorspace conversion, the discrete cosine transform, and then quantization. Our JPEG attack imposes the L∞ constraint ‖JPEG(x) − JPEG(x′)‖_∞ ≤ ε on the attacked image x′. We optimize z′ = JPEG(x′) under this constraint to find an adversarial perturbation in the resulting frequency space. The perturbed frequency coefficients are quantized, and we then apply a right-inverse of JPEG to obtain the attacked image x′ in pixel space. We use ideas from Shin & Song (2017) to make this differentiable. The resulting attack is conspicuously distinct from L_p attacks.

Fog. Fog simulates worst-case weather conditions. Robustness to adverse weather is a safety-critical priority for autonomous vehicles, and Figure 2 shows Fog provides a more rigorous stress test than stochastic fog (Hendrycks & Dietterich, 2019). Fog creates adversarial fog-like occlusions by adversarially optimizing parameters in the diamond-square algorithm (Fournier et al., 1982) typically used to render stochastic fog effects.
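As a concrete illustration of the JPEG attack's frequency-space constraint, here is a minimal sketch. It substitutes a block-wise orthonormal DCT for the full JPEG transform (no colorspace conversion or quantization tables), so it is an illustration of the idea rather than the paper's implementation; all function names are ours.

```python
import numpy as np
from scipy.fft import dctn, idctn

def blockwise_dct(img, block=8):
    """Apply an orthonormal 2-D DCT to each non-overlapping block."""
    h, w = img.shape
    out = np.empty_like(img)
    for i in range(0, h, block):
        for j in range(0, w, block):
            out[i:i+block, j:j+block] = dctn(img[i:i+block, j:j+block], norm="ortho")
    return out

def blockwise_idct(coef, block=8):
    """Invert blockwise_dct (the transform is orthonormal, so this is exact)."""
    h, w = coef.shape
    out = np.empty_like(coef)
    for i in range(0, h, block):
        for j in range(0, w, block):
            out[i:i+block, j:j+block] = idctn(coef[i:i+block, j:j+block], norm="ortho")
    return out

def jpeg_space_perturb(img, delta, eps):
    """Perturb DCT coefficients under an L-inf constraint of size eps,
    then map back to pixel space (a right-inverse of the transform)."""
    z = blockwise_dct(img)
    z_adv = z + np.clip(delta, -eps, eps)   # enforce ||z_adv - z||_inf <= eps
    return np.clip(blockwise_idct(z_adv), 0.0, 1.0)
```

In the actual attack, `delta` would be optimized by gradient descent on the targeted loss rather than fixed.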

1. INTRODUCTION

Neural networks perform well on many datasets (He et al., 2016) yet can be consistently fooled by minor adversarial distortions (Goodfellow et al., 2014). The research community has responded by developing adversarial defenses and quantifying robustness against such attacks (Madry et al., 2017), but these defenses and metrics have two key limitations.

First, the vast majority of existing defenses exclusively defend against and quantify robustness to L_p-constrained attacks (Madry et al., 2017; Cohen et al., 2019; Raff et al., 2019; Xie et al., 2018). Though real-world adversaries are not L_p-constrained (Gilmer et al., 2018) and can attack with diverse distortions (Brown et al., 2017; Sharif et al., 2019), the literature largely ignores this and evaluates against the L_p adversaries already seen during training (Madry et al., 2017; Xie et al., 2018), resulting in optimistic robustness assessments. The attacks outside the L_p threat model that have been proposed (Song et al., 2018; Qiu et al., 2019; Engstrom et al., 2017; Evtimov et al., 2017; Sharif et al., 2016) are not intended for general defense evaluation and suffer from narrow dataset applicability, difficulty of optimization, or fragility of auxiliary generative models.

Second, existing defenses assume that attacks are known in advance (Goodfellow, 2019) and use knowledge of their explicit form during training (Madry et al., 2017). In practice, adversaries can deploy unforeseen attacks not known to the defense creator. For example, online advertisers use attacks such as perturbed pixels in ads to defeat ad blockers trained only on the previous generation of ads in an ever-escalating arms race (Tramèr et al., 2018). However, current evaluation setups implicitly assume that attacks encountered at test time are the same as those seen at train time, which is unrealistic.
The reality that future attacks are unlike those encountered during training is akin to a train-test distribution mismatch, a problem studied outside of adversarial robustness (Recht et al., 2019; Hendrycks & Dietterich, 2019), but now brought to the adversarial setting. The present work addresses these limitations by proposing an evaluation framework, ImageNet-UA, to measure robustness against unforeseen attacks. ImageNet-UA assesses a defense, which may have been created with knowledge of the commonly used L∞ or L2 attacks, with six diverse attacks (four of which are novel) distinct from L∞ or L2. We intend these attacks to be used at test time only and not during training. Performing well on ImageNet-UA thus demonstrates generalization to a diverse set of distortions not seen during defense creation. While ImageNet-UA does not provide an exhaustive guarantee over all conceivable attacks, it evaluates over a diverse unforeseen test distribution similar to those used successfully in other studies of distributional shift (Rajpurkar et al., 2018; Hendrycks & Dietterich, 2019; Recht et al., 2019). ImageNet-UA works for ImageNet models and can be easily used with our code available at https://github.com/anon-submission-2020/anon-submission-2020. Designing ImageNet-UA requires new attacks that are strong and varied, since real-world attacks are diverse in structure. To meet this challenge, we contribute four novel and diverse adversarial attacks which are easily optimized.

Figure 1: Adversarially distorted chow chow dog images created with previous attacks (L∞, L2, L1, Elastic) and our new attacks (JPEG, Fog, Snow, Gabor). The JPEG, Fog, Snow, and Gabor adversarial attacks are visually distinct from previous attacks, result in distortions which do not obey a small L_p norm constraint, and serve as unforeseen attacks for the ImageNet-UA attack suite.
Our new attacks produce distortions with occlusions, spatial similarity, and simulated weather, all of which are absent from previous attacks. Performing well on ImageNet-UA thus demonstrates that a defense generalizes to a diverse set of distortions distinct from the commonly used L∞ or L2. With ImageNet-UA, we show weaknesses in existing evaluation practices and defenses through a study of 8 attacks against 48 models adversarially trained on ImageNet-100, a 100-class subset of ImageNet. While most adversarial robustness evaluations use only L∞ attacks, ImageNet-UA reveals that models with high L∞ attack robustness can remain susceptible to other attacks. Thus, L∞ evaluations are a narrow measure of robustness, even though much of the literature treats this evaluation as comprehensive (Madry et al., 2017; Qian & Wegman, 2019; Schott et al., 2019; Zhang et al., 2019). We address this deficiency by using the novel attacks in ImageNet-UA to evaluate robustness to a more diverse set of unforeseen attacks. Our results demonstrate that L∞ adversarial training, the current state-of-the-art defense, has limited generalization to unforeseen adversaries and is not easily improved by training against more attacks. This adds to the evidence that achieving robustness against a few train-time attacks is insufficient to impart robustness to unforeseen test-time attacks (Jacobsen et al., 2019; Jordan et al., 2019; Tramèr & Boneh, 2019). In summary, we propose the framework ImageNet-UA to measure robustness to a diverse set of attacks, made possible by our four new adversarial attacks. Since existing defenses scale poorly to multiple attacks (Jordan et al., 2019; Tramèr & Boneh, 2019), finding defense techniques which generalize to unforeseen attacks is crucial to creating robust models. We suggest ImageNet-UA as a way to measure progress towards this goal.
Figure 2: Randomly sampled distortions and adversarially optimized distortions from our new attacks, targeted to the target class in red. Stochastic average-case versions of our attacks affect classifiers minimally, while adversarial versions are optimized to reveal high-confidence errors. The snowflakes in Snow decrease in intensity after optimization, demonstrating that lighter adversarial snowflakes are more effective than heavy random snowfall at uncovering model weaknesses.

We are only aware of a few prior works which evaluate on unforeseen attacks in specific limited circumstances. Wu et al. (2020) evaluate against physically realizable attacks from Evtimov et al. (2017) and Sharif et al. (2016), though this limits the threat model to occlusion attacks on narrow datasets. Outside of vision, Pierazzi et al. (2020) propose constraining attacks by a more diverse set of problem-space constraints in domains such as text, malware, and source code generation; however, even in this framework, analytically enumerating all such constraints is impossible. Within vision, prior attacks outside the L_p threat model exist, but they lack the general applicability and fast optimization of ours.

3. NEW ATTACKS FOR A BROADER THREAT MODEL

There are few diverse, easily optimizable, plug-and-play adversarial attacks in the current literature; outside of Elastic (Xiao et al., 2018), most are L_p attacks such as L∞ (Goodfellow et al., 2014), L2 (Szegedy et al., 2013; Carlini & Wagner, 2017), and L1 (Chen et al., 2018). We rectify this deficiency with four novel adversarial attacks: JPEG, Fog, Snow, and Gabor. Our attacks are differentiable and fast, while optimizing over enough parameters to be strong. We show example adversarial images in Figure 1 and compare stochastic and adversarial distortions in Figure 2. Our novel attacks provide a range of test-time adversaries visually and semantically distinct from L∞ and L2 attacks. Namely, they cause distortions with large L∞ and L2 norm, but result in images that are perceptually close to the original. These attacks are intended as unforeseen attacks not used during training, allowing them to evaluate whether a defense can generalize from L∞ or L2 to a more varied set of distortions than current evaluations. Though our attacks are not exhaustive, performing well against them already demonstrates robustness to occlusion, spatial similarity, and simulated weather, which are absent from previous evaluations.

The diamond-square algorithm used by Fog starts with random perturbations to the four corner pixels of the image. At step t, it iteratively perturbs pixels at the centers of the squares and diamonds formed by the pixels perturbed at step t − 1. The perturbation of a step-t pixel is the average of the neighboring step-(t − 1) perturbations plus a parameter value which we adversarially optimize. We continue this process until all pixels have been perturbed; the outcome is a fog-like distortion of the original image.
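The diamond-square recursion above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `params` stands in for the per-pixel offset values the attack would optimize, the grid size is assumed to be 2^n + 1, and the per-level scale decay is our assumption.

```python
import numpy as np

def diamond_square(params, size=17):
    """Generate a fog-like height map of shape (size, size), size = 2^n + 1.
    `params` supplies the per-pixel offsets that the attack would
    adversarially optimize; here they are simply given values."""
    grid = np.zeros((size, size))
    # Seed the four corner pixels.
    for (i, j) in [(0, 0), (0, size-1), (size-1, 0), (size-1, size-1)]:
        grid[i, j] = params[i, j]
    step, scale = size - 1, 1.0
    while step > 1:
        half = step // 2
        # Diamond step: centers of squares get the average of the four
        # corners plus an adversarially chosen offset.
        for i in range(half, size, step):
            for j in range(half, size, step):
                avg = (grid[i-half, j-half] + grid[i-half, j+half] +
                       grid[i+half, j-half] + grid[i+half, j+half]) / 4
                grid[i, j] = avg + scale * params[i, j]
        # Square step: centers of diamonds get the average of their
        # (in-bounds) axis-aligned neighbors plus an offset.
        for i in range(0, size, half):
            for j in range((i + half) % step, size, step):
                nbrs = [grid[i+di, j+dj]
                        for di, dj in [(-half, 0), (half, 0), (0, -half), (0, half)]
                        if 0 <= i+di < size and 0 <= j+dj < size]
                grid[i, j] = sum(nbrs) / len(nbrs) + scale * params[i, j]
        step, scale = half, scale / 2  # smaller offsets at finer scales
    return grid
```

Because the map is a differentiable function of `params`, the fog layer can be optimized by gradient descent against the classifier.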

Snow.

Snow simulates snowfall with occlusions of randomly located small image regions representing snowflakes. Because the distortions caused by snowflakes are not differentiable in their locations, we instead place occlusions representing snowflakes at randomly chosen locations and orientations and adversarially optimize their intensities. This choice results in a fast, differentiable, and strong attack. Compared to synthetic stochastic snow (Hendrycks & Dietterich, 2019), our adversarial snow is faster and includes snowflakes at differing angles. Figure 2 shows adversarial snow exposes model weaknesses more effectively than the easier stochastic, average-case snow.

Gabor. Gabor spatially occludes the image with visually diverse Gabor noise (Lagae et al., 2009). Gabor noise is a form of band-limited anisotropic procedural noise which convolves a parameter mask with a Gabor kernel, the product of a Gaussian kernel and a harmonic kernel. We choose the Gabor kernel randomly and adversarially optimize the parameters of the mask starting from a sparse initialization. We apply spectral variance normalization (Co et al., 2019) to the resulting distortion and add it to the input image to create the attack.
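The Gabor kernel and the mask convolution can be sketched as follows. This is an illustrative sketch under our own parameterization (kernel size, `sigma`, `freq`, `theta` defaults are assumptions), omitting the spectral variance normalization step.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size=23, sigma=4.0, freq=0.25, theta=0.0):
    """Gabor kernel: a Gaussian envelope times a 2-D harmonic (cosine)
    at orientation `theta` and spatial frequency `freq`."""
    half = size // 2
    y, x = np.mgrid[-half:half+1, -half:half+1]
    gaussian = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    harmonic = np.cos(2 * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)))
    return gaussian * harmonic

def gabor_noise(mask, kernel):
    """Convolve a sparse parameter mask (the values the attack optimizes)
    with a fixed, randomly chosen Gabor kernel."""
    return fftconvolve(mask, kernel, mode="same")
```

Since the noise is linear in the mask, gradients flow to the mask parameters directly, which is what makes the attack fast to optimize.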

3.2. IMPROVING EXISTING ATTACKS

Elastic modifies the attack of Xiao et al. (2018); it warps the image by distortions x′ = Flow(x, V), where V : {1, …, 224}² → R² is a vector field on pixel space, and Flow sets the value of pixel (i, j) to the bilinearly interpolated original value at (i, j) + V(i, j). We construct V by smoothing a vector field W with a Gaussian kernel (size 25 × 25, σ ≈ 3 for a 224 × 224 image) and optimize W under ‖W(i, j)‖_∞ ≤ ε for all (i, j). The resulting attack is suitable for large-scale images.

The other three attacks are L1, L2, and L∞ attacks, but we improve the L1 attack. For L∞ and L2 constraints, we use randomly initialized projected gradient descent (PGD), which applies gradient descent and projection to the L∞ and L2 balls (Madry et al., 2017). Projection is difficult for L1, and previous L1 attacks rely on computationally intensive methods for it (Chen et al., 2018); we instead use the Frank-Wolfe algorithm, detailed in Appendix D.

4. THE ImageNet-UA EVALUATION FRAMEWORK

We propose the framework ImageNet-UA and its CIFAR-10 analogue CIFAR-10-UA to measure and summarize model robustness while fulfilling the following desiderata: (1) defenses should be evaluated against a broad threat model through a diverse set of attacks, (2) defenses should exhibit generalization to attacks not exactly identical to train-time attacks, and (3) the range of distortion sizes used for an attack must be wide enough to avoid misleading conclusions caused by overly weak or strong versions of that attack (Figure 3). The ImageNet-UA evaluation framework aggregates robustness information into a single measure, the mean Unforeseen Adversarial Robustness (mUAR). The mUAR is an average over six different attacks of the Unforeseen Adversarial Robustness (UAR), a metric which assesses the robustness of a defense against a specific attack by using a wide range of distortion sizes. UAR is normalized using a measure of attack strength, the ATA, which we now define.

Adversarial Training Accuracy (ATA).
The Adversarial Training Accuracy ATA(A, ε) estimates the strength of an attack A against adversarial training (Madry et al., 2017), one of the strongest known defense methods. For a distortion size ε, it is the best adversarial test accuracy against A achieved by adversarial training against A. We allow a possibly different distortion size ε′ during training, since this can improve accuracy, and we choose a fixed architecture for each dataset. For ImageNet-100, we choose ResNet-50 for the architecture, and for CIFAR-10 we choose ResNet-56. When evaluating a defense with an architecture other than ResNet-50 or ResNet-56, we recommend using ATA values computed with these architectures to enable consistent comparison. To estimate ATA(A, ε) in practice, we evaluate models adversarially trained against distortion size ε′ for ε′ in a large range (we describe this range at this section's end).

UAR: Robustness Against a Single Attack. The UAR, a building block for the mUAR, averages a model's robustness to a single attack over six distortion sizes ε_1, …, ε_6 chosen for each attack (we describe the selection procedure at the end of this section). It is defined as

UAR(A) := 100 × ( Σ_{k=1}^{6} Acc(A, ε_k, M) ) / ( Σ_{k=1}^{6} ATA(A, ε_k) ),   (1)

where Acc(A, ε_k, M) is the accuracy of a model M after attack A at distortion size ε_k. The normalization in (1) makes attacks of different strengths more commensurable in a stable way. We give values of ATA(A, ε_k) and ε_k for our attacks on ImageNet-100 and CIFAR-10 in Tables 4 and 5 (Appendix B), allowing computation of the UAR of a defense against a single attack with six adversarial evaluations and no adversarial training.

mUAR: Mean Unforeseen Attack Robustness.
We summarize a defense's performance on ImageNet-UA with the mean Unforeseen Attack Robustness (mUAR), an average of UAR scores for the L1, Elastic, JPEG, Fog, Snow, and Gabor attacks:

mUAR := (1/6) [ UAR(L1) + UAR(Elastic) + UAR(JPEG) + UAR(Fog) + UAR(Snow) + UAR(Gabor) ].

Our measure mUAR estimates robustness to a broad threat model containing six unforeseen attacks at six distortion sizes each, meaning high mUAR requires generalization to several held-out attacks. In particular, it cannot be achieved by the common practice of engineering defenses to a single attack, which Figure 4 shows does not necessarily provide robustness to different attacks. Our four novel attacks play a crucial role in mUAR by allowing us to estimate robustness to a sufficiently large set of adversarial attacks. As is customary when studying train-test mismatches and distributional shift, we advise against adversarially training with these six attacks when evaluating on ImageNet-UA to preserve the validity of mUAR, though we encourage training with other attacks.

Distortion Sizes. We explain the ε values used to estimate ATA and the choice of ε_1, …, ε_6 used to define UAR. This calibration of distortion sizes adjusts for the fact (Figure 3) that adversarial robustness against an attack may vary drastically with distortion size. Further, the relation between distortion size and attack strength varies between attacks, so too many or too few ε_k values in a certain range may cause an attack to appear artificially strong or weak according to UAR. We choose distortion sizes between ε_min and ε_max as follows. The minimum distortion size ε_min is the largest ε for which the adversarial accuracy of an adversarially trained model at distortion size ε is comparable to that of a model trained and evaluated on unattacked data (for ImageNet-100, within 3 of 87).
The maximum distortion size ε_max is the smallest ε which either reduces the adversarial accuracy of an adversarially trained model at distortion size ε below 25 or yields images that confuse humans (adversarial accuracy can remain non-zero in this case). As is typical in recent work on adversarial examples (Athalye et al., 2018b; Evtimov et al., 2017; Dong et al., 2019; Qin et al., 2019), our attacks can be perceptible at large distortion sizes. We make this choice to reflect the perceptibility of attacks in real-world threat models per Gilmer et al. (2018). For ATA, we evaluate against models adversarially trained with ε increasing geometrically from ε_min to ε_max by factors of 2. We then choose the ε_k as follows: we compute ATA at ε increasing geometrically from ε_min to ε_max by factors of 2 and take the size-6 subset whose ATA values have minimum ℓ1-distance to the ATA values of the L∞ attack in Table 4 (Appendix B.1). For example, for Gabor, (ε_min, ε_max) = (6.25, 3200), so we compute ATAs at the 10 values ε = 6.25, …, 3200. Viewing size-6 subsets of the ATAs as vectors with decreasing coordinates, we select the ε_k for Gabor corresponding to the vector with minimum ℓ1-distance to the ATA vector for L∞.
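The UAR normalization of Equation (1) and the ℓ1-distance subset selection above can be sketched as follows. A minimal sketch with our own function names; the ATA values in the usage example are illustrative, not the paper's Table 4 numbers.

```python
from itertools import combinations

def uar(accs, atas):
    """UAR against one attack: 100 * (sum of model accuracies over the
    six distortion sizes) / (sum of the corresponding ATA values)."""
    assert len(accs) == len(atas) == 6
    return 100.0 * sum(accs) / sum(atas)

def choose_distortion_sizes(eps_grid, ata_values, ata_linf_ref):
    """Pick the six distortion sizes whose ATA vector has minimum
    l1-distance to the reference ATA vector of the L-inf attack."""
    best = None
    for idx in combinations(range(len(eps_grid)), 6):
        vec = [ata_values[i] for i in idx]
        dist = sum(abs(a - b) for a, b in zip(vec, ata_linf_ref))
        if best is None or dist < best[0]:
            best = (dist, idx)
    return [eps_grid[i] for i in best[1]]
```

For a 10-value ε grid, the exhaustive scan over size-6 subsets is only C(10, 6) = 210 candidates, so brute force suffices.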

5. NEW INSIGHTS FROM ImageNet-UA

We use ImageNet-UA to assess existing methods for adversarial defense and evaluation. First, ImageNet-UA reveals that L∞-trained defenses fail to generalize to different attacks, indicating a substantial weakness in current L∞ adversarial robustness evaluation. Second, we establish a baseline for ImageNet-UA using L2 adversarial training which is difficult to improve upon by adversarial training alone. Finally, we show non-adversarially trained models can still improve robustness on ImageNet-UA over standard models and suggest this as a direction for further inquiry.

5.1. EXPERIMENTAL SETUP

We adversarially train 48 models against the 8 attacks from Section 3 and evaluate against targeted attacks. We use the CIFAR-10 and ImageNet-100 datasets for CIFAR-10-UA and ImageNet-UA, respectively. ImageNet-100 is a 100-class subset of ImageNet-1K (Deng et al., 2009) containing every tenth class by WordNet ID order; we use a subset of ImageNet-1K due to the high compute cost of adversarial training. We use ResNet-56 for CIFAR-10 and ResNet-50 from torchvision for ImageNet-100 (He et al., 2016). We provide training hyperparameters in Appendix A.

To adversarially train against an attack A, at each mini-batch we select a uniform random (incorrect) target class for each training image. For maximum distortion size ε, we apply the targeted attack A to the current model with distortion size ε′ ∼ Uniform(0, ε) and take an SGD step using only the attacked images. Randomly scaling the distortion size improves performance against smaller distortions. We train on 10-step attacks for all attacks other than Elastic, where we use 30 steps due to its harder optimization problem. For L_p, JPEG, and Elastic, we use step size ε/√steps; for Fog, Gabor, and Snow, we use step size 0.001/steps because the latent space is independent of ε. These choices have optimal rates for non-smooth convex functions (Nemirovski & Yudin, 1978; 1983). We evaluate on 200-step targeted attacks with a uniform random (incorrect) target, using more steps for evaluation than training per best practices (Carlini et al., 2019b). Figure 4 summarizes ImageNet-100 results. Full results for ImageNet-100 and CIFAR-10 are in Appendix E, and robustness checks with respect to random seed and attack iterations are in Appendix F.
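The training-time attack just described can be sketched for an L∞ adversary as follows. This is a simplified illustration, not the paper's code: `grad_fn` (the gradient of the targeted loss with respect to the input) is a placeholder, and images are assumed to lie in [0, 1].

```python
import numpy as np

rng = np.random.default_rng(0)

def targeted_pgd_linf(x, grad_fn, eps_max, steps=10):
    """Training-time attack: draw eps ~ Uniform(0, eps_max), then run
    randomly initialized PGD with step size eps / sqrt(steps),
    projecting onto the L-inf ball of radius eps around x each step."""
    eps = rng.uniform(0.0, eps_max)          # random scaling of the budget
    step = eps / np.sqrt(steps)              # eps / sqrt(steps) step size
    x_adv = x + rng.uniform(-eps, eps, size=x.shape)   # random init
    for _ in range(steps):
        g = grad_fn(x_adv)                   # gradient of targeted loss wrt input
        x_adv = x_adv - step * np.sign(g)    # descend: minimize loss on target
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto L-inf ball
    return np.clip(x_adv, 0.0, 1.0)          # keep a valid image
```

The sign-gradient descent direction minimizes the targeted loss ℓ(f(x′), y′), matching the attack objective defined earlier in the paper.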

5.2. ImageNet-UA REVEALS WEAKNESSESS IN L ∞ TRAINING AND TESTING

We use ImageNet-UA to reveal weaknesses in the common practices of L∞ robustness evaluation and L∞ adversarial training. We compute the mUAR and UAR(L∞) for models trained against the L∞ attack with distortion size ε and show results in Figure 5. For small ε ≤ 4, mUAR and UAR(L∞) increase together with ε. For larger ε ≥ 8, UAR(L∞) continues to increase with ε, but the mUAR decreases, a fact which is not apparent from L∞ evaluation. The decrease in mUAR while UAR(L∞) increases suggests that L∞ adversarial training begins to heavily fit L∞ distortions at the expense of generalization at larger distortion sizes. Thus, while it is the most commonly used defense procedure, L∞ training may not lead to improvements on other attacks or to real-world robustness. Worse, L∞ evaluation against L∞ adversarial training at higher distortions indicates higher robustness. In contrast, mUAR reveals that L∞ adversarial training at higher distortions in fact hurts robustness against a more diverse set of attacks. Thus, L∞ evaluation gives a misleading picture of robustness. This is particularly important because L∞ evaluation is the most ubiquitous measure of robustness in deep learning (Goodfellow et al., 2014; Madry et al., 2017; Xie et al., 2018).

Table 3: Non-adversarial defenses can noticeably improve ImageNet-UA performance. ResNeXt-101 (32×8d) + WSL is trained on approximately 1 billion images (Mahajan et al., 2018). Stylized ImageNet is trained on a modification of ImageNet using style transfer (Geirhos et al., 2019). Patch Gaussian augments using Gaussian distortions on small portions of the image (Lopes et al., 2019). AugMix mixes simple random augmentations of the image (Hendrycks et al., 2020). These results suggest that ImageNet-UA performance may be achieved through non-adversarial defenses.

5.3. LIMITS OF ADVERSARIAL TRAINING FOR ImageNet-UA

We establish a baseline on ImageNet-UA using L2 adversarial training but show a significant performance gap remains even for more sophisticated existing adversarial training methods. To do so, we evaluate several adversarial training methods on ImageNet-UA and show results in Table 1. Our results show that L2-trained models outperform L∞-trained models and have significantly improved absolute performance, increasing mUAR from 14.0 to 50.7 compared to an undefended model. The individual UAR values in Figure 7 (Appendix E.1) improve substantially against all attacks other than Fog, including several (Elastic, Gabor, Snow) of extremely different nature from L2. This result suggests pushing adversarial training further by training against multiple attacks simultaneously via joint adversarial training (Jordan et al., 2019; Tramèr & Boneh, 2019), detailed in Appendix C. Table 2 shows that, despite using twice the compute of L2 training, (L∞, L2) joint training only improves the mUAR from 50.7 to 50.9. We thus recommend L2 training as a baseline for ImageNet-UA, though there is substantial room for improvement compared to the highest UARs against individual attacks in Figure 4, which are all above 80 and often above 90.

5.4. ImageNet-UA ROBUSTNESS THROUGH NON-ADVERSARIAL DEFENSES

We find that methods can improve robustness to unforeseen attacks without adversarial training. Table 3 shows the mUAR for SqueezeNet (Iandola et al., 2017), ResNeXts (Xie et al., 2016), and ResNets. For ImageNet-1K models, we mask 900 logits to predict ImageNet-100 classes. A popular defense against average-case distortions (Hendrycks & Dietterich, 2019) is Stylized ImageNet (Geirhos et al., 2019), which modifies training images using image style transfer in hopes of making networks rely less on textural features. Training on substantially more data, as with ResNeXt-101 + WSL, also helps; this parallels results on corruption robustness (Hendrycks & Dietterich, 2019; Hendrycks et al., 2019), where more data helps tremendously (Orhan, 2019). While models with lower clean accuracy (e.g., SqueezeNet and ResNet-18) have higher UAR(L∞) and UAR(L2) than many other models, there is no clear difference in mUAR. Last, these non-adversarial defenses have minimal cost to accuracy on clean examples, unlike adversarial defenses. Much remains to be explored, and we hope non-adversarial defenses will be a promising avenue toward adversarial robustness.

6. CONCLUSION

This work proposes a framework, ImageNet-UA, to evaluate the robustness of a defense against unforeseen attacks. Because existing adversarial defense techniques do not scale to multiple attacks, developing models which can defend against attacks not seen at train time is essential for robustness. Our results using ImageNet-UA show that the common practice of L∞ training and evaluation fails to achieve or measure this broader form of robustness. As a result, it can provide a misleading sense of robustness. By incorporating our 4 novel and strong adversarial attacks, ImageNet-UA enables evaluation on the diverse held-out attacks necessary to measure progress towards robustness more broadly.

A TRAINING HYPERPARAMETERS

For ImageNet-100, we trained on machines with 8 NVIDIA V100 GPUs using standard data augmentation (He et al., 2016). Following best practices for multi-GPU training (Goyal et al., 2017), we ran synchronized SGD for 90 epochs with batch size 32×8 and a learning rate schedule with 5 "warm-up" epochs and a decay at epochs 30, 60, and 80 by a factor of 10. Initial learning rate after warm-up was 0.1, momentum was 0.9, and weight decay was 10^-4. For CIFAR-10, we trained on a single NVIDIA V100 GPU for 200 epochs with batch size 32, initial learning rate 0.1, momentum 0.9, and weight decay 10^-4. We decayed the learning rate at epochs 100 and 150.

B CALIBRATION OF ImageNet-UA AND CIFAR-10-UA

The ε calibration procedure for CIFAR-10 was similar to that used for ImageNet-100. We started with small ε_min values and increased ε geometrically with ratio 2 until the adversarial accuracy of an adversarially trained model dropped below 40. Note that this threshold is higher for CIFAR-10 than for ImageNet-100 because there are fewer classes. The resulting ATA values for CIFAR-10 are shown in Table 5.

C JOINT ADVERSARIAL TRAINING

Our joint adversarial training procedure for two attacks A and A′ is as follows. At each training step, we compute the attacked image under both A and A′ and backpropagate with respect to gradients induced by the image with greater loss. This corresponds to the "max" loss of Tramèr & Boneh (2019). We train ResNet-50 models for (L∞, L2), (L∞, L1), and (L∞, Elastic) on ImageNet-100. Table 6 shows training against (L∞, L1) is worse than training against L1 at the same distortion size and performs particularly poorly at large distortion sizes.

Table 6: UAR scores for L1-trained and (L∞, L1)-jointly trained models.

  Training                       UAR(L∞)   UAR(L1)
  L∞ ε = 2,  L1 ε = 76500          48        66
  L∞ ε = 4,  L1 ε = 153000         51        72
  L∞ ε = 8,  L1 ε = 306000         44        62
  L1 ε = 76500                     50        70
  L1 ε = 153000                    54        81
  L1 ε = 306000                    59        87

Table 7: UAR scores for L∞- and Elastic-trained models and (L∞, Elastic)-jointly trained models. No jointly trained model matches an Elastic-trained model on UAR vs. Elastic.

  Training                       UAR(L∞)   UAR(Elastic)
  L∞ ε = 4,  Elastic ε = 2         68        63
  L∞ ε = 8,  Elastic ε = 4         35        65
  L∞ ε = 16, Elastic ε = 8         69        43
  Elastic ε = 2                    37        68
  Elastic ε = 4                    36        81
  Elastic ε = 8                    31        91

(L∞, Elastic) joint training also performs poorly, never matching the UAR score of training against Elastic at moderate distortion size (ε = 2).
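The "max" loss selection at the heart of this joint training procedure can be sketched as follows. A minimal sketch: `loss_fn`, `attack_a`, and `attack_b` are placeholders for the model loss and the two attack procedures.

```python
import numpy as np

def max_loss_step(x, y, loss_fn, attack_a, attack_b):
    """Joint adversarial training ("max" loss of Tramer & Boneh, 2019):
    attack the batch with both adversaries, then train only on the
    attacked image that achieves the higher loss."""
    xa, xb = attack_a(x, y), attack_b(x, y)
    la, lb = loss_fn(xa, y), loss_fn(xb, y)
    # The returned image is the one to backpropagate through.
    return (xa, la) if la >= lb else (xb, lb)
```

Note that running both attacks at every step is what doubles the compute cost relative to single-attack training, as reported for (L∞, L2) in Table 2.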

D THE FRANK-WOLFE ALGORITHM

We chose the Frank-Wolfe algorithm for optimizing the L1 attack, as projected gradient descent would require projecting onto a truncated L1 ball, which is a complicated operation. In contrast, Frank-Wolfe only requires optimizing linear functions x ↦ ⟨g, x⟩ over a truncated L1 ball; this can be done by sorting coordinates by the magnitude of g and moving the top k coordinates to the boundary of their range (with k chosen by binary search). This is detailed in Algorithm 1.
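This linear subproblem can be sketched as follows for a flattened image with box constraints [lo, hi]. A minimal sketch, not Algorithm 1 itself: it uses a greedy scan over the sorted coordinates in place of the binary search over k, and all names are ours.

```python
import numpy as np

def fw_linear_oracle(x, g, eps, lo=0.0, hi=1.0):
    """Maximize <g, s - x> over the truncated L1 ball
    {s : ||s - x||_1 <= eps, lo <= s <= hi}: sort coordinates by |g|
    and move the best ones to their box boundary until the L1 budget
    is spent. For a targeted attack, g would be the negative gradient
    of the loss, so this step decreases the targeted loss."""
    s = x.copy()
    # Max movement toward the boundary in the direction of g, per coordinate.
    room = np.where(g >= 0, hi - x, x - lo)
    order = np.argsort(-np.abs(g))    # largest |g| first
    budget = eps
    for i in order:
        if budget <= 0:
            break
        move = min(room[i], budget)   # never leave the box or the L1 ball
        s[i] = x[i] + np.sign(g[i]) * move
        budget -= move
    return s
```

Each Frank-Wolfe iteration then moves the current iterate a step toward the oracle's output `s`, so no L1 projection is ever needed.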

E.1 FULL EVALUATION RESULTS AND ANALYSIS FOR IMAGENET-100

We show the full results of all adversarial attacks against all adversarial defenses for ImageNet-100 in Figure 6. These results also include L1-JPEG and L2-JPEG attacks, which are modifications of the JPEG attack applying L_p constraints in the compressed JPEG space instead of L∞ constraints. Full UAR scores are provided for ImageNet-100 in Figure 7.

E.2 FULL EVALUATION RESULTS AND ANALYSIS FOR CIFAR-10

We show the results of adversarial attacks and defenses for CIFAR-10 in Figure 8. We experienced difficulty training against the L2 and L1 attacks at distortion sizes greater than those shown and have omitted those runs; we believe this may be related to the small size of CIFAR-10 images. Full UAR values for CIFAR-10 are shown in Figure 9.

F ROBUSTNESS OF OUR RESULTS

F.1 REPLICATION

We replicated our results for the first three rows of Figure 6 with different random seeds to see the variation in our results. As shown in Figure 10, deviations in results are minor.
0 77 35 5 0 0 0 0 56 15 1 0 0 0 36 12 2 0 0 0 0 21 1 0 0 0 0 0 59 20 1 0 0 0 0 0 31 10 2 0 0 0 0 0 0 0 0 75 68 47 13 1 0 0 69 50 21 4 1 0 0 0 0 0 74 69 61 53 46 51 66 45 15 1 74 73 67 50 23 8 2 0 0 0 82 46 5 0 0 0 0 68 24 1 0 0 0 58 29 7 1 0 0 0 27 1 0 0 0 0 0 68 27 1 0 0 0 0 0 41 15 2 0 0 0 0 0 0 0 0 75 52 13 1 0 0 0 74 51 13 0 0 0 0 0 0 0 32 7 2 0 0 0 0 0 0 0 85 76 33 4 0 0 0 0 0 0 81 47 6 0 0 0 0 67 26 1 0 0 0 58 30 7 0 0 0 0 26 1 0 0 0 0 0 65 24 1 0 0 0 0 0 46 18 3 0 0 0 0 0 0 0 0 74 55 16 1 0 0 0 76 59 22 2 0 0 0 0 0 0 51 14 3 1 0 0 0 0 0 0 86 83 63 16 1 0 0 0 0 0 79 43 7 0 0 0 0 62 23 2 0 0 0 53 26 6 0 0 0 0 20 1 0 0 0 0 0 54 16 1 0 0 0 0 0 33 11 1 0 0 0 0 0 0 0 0 73 57 23 2 0 0 0 75 60 28 4 0 0 0 0 0 0 65 30 6 1 0 0 0 0 0 0 85 84 79 51 10 1 0 0 0 0 78 31 4 0 0 0 0 51 14 1 0 0 0 44 18 4 1 0 0 0 14 1 0 0 0 0 0 39 8 1 0 0 0 0 0 22 6 1 0 0 0 0 0 0 0 0 74 61 28 3 0 0 0 75 63 36 10 1 0 0 0 0 0 73 53 17 3 0 0 0 0 0 0 85 84 81 74 40 9 1 0 0 0 76 27 4 0 0 0 0 43 11 1 0 0 0 36 15 4 1 0 0 0 13 1 0 0 0 0 0 36 8 1 0 0 0 0 0 19 6 1 0 0 0 0 0 0 0 0 72 63 32 4 0 0 0 74 64 40 14 2 0 0 0 0 0 74 65 35 8 2 0 0 0 0 0 82 82 81 78 69 39 13 2 0 0 74 22 3 0 0 0 0 34 7 1 0 0 0 26 9 2 0 0 0 0 7 0 0 0 0 0 0 25 5 0 0 0 0 0 0 19 6 1 0 0 0 0 0 0 0 0 69 58 31 4 0 0 0 72 60 35 8 1 0 0 0 0 0 69 58 29 6 1 0 0 0 0 0 80 80 77 69 49 26 10 2 0 0 71 25 4 0 0 0 0 28 6 0 0 0 0 12 3 1 0 0 0 0 9 1 0 0 0 0 0 24 4 0 0 0 0 0 0 16 5 1 0 0 0 0 0 0 0 0 68 60 38 9 1 0 0 70 60 36 12 2 0 0 0 0 0 70 67 53 24 5 1 1 1 0 0 79 78 76 74 65 51 40 20 4 0 67 24 5 1 0 0 0 25 6 1 0 0 0 11 4 1 0 0 0 0 7 1 0 0 0 0 0 20 4 1 0 0 0 0 0 15 6 1 0 0 0 0 0 0 0 0 63 59 42 14 1 0 0 66 55 34 13 4 1 0 0 0 0 64 62 53 29 9 2 2 2 1 0 75 75 74 72 67 61 48 34 16 3 66 34 10 2 0 0 0 41 14 2 0 0 0 25 11 3 1 0 0 0 15 3 0 0 0 0 0 29 8 2 0 0 0 0 0 20 8 3 1 0 0 0 0 0 0 0 64 62 53 27 3 0 0 65 56 37 Algorithm 1 Pseudocode for the Frank-Wolfe algorithm for the L 1 attack. 
1: Input: function f, initial input x ∈ [0, 1]^d, L1 radius ρ, number of steps T.
2: Output: approximate maximizer x̂ of f over the truncated L1 ball B1(ρ; x) ∩ [0, 1]^d centered at x.
3:
4: x^(0) ← RandomInit(x)    {Random initialization}
5: for t = 1, ..., T do
6:     g ← ∇f(x^(t-1))    {Obtain gradient}
7:     for k = 1, ..., d do
8:         s_k ← index of the coordinate of g with the k-th largest magnitude
9:     end for
10:    S_k ← {s_1, ..., s_k}
11:
12:    {Compute move to boundary of [0, 1] for each coordinate.}
13:    for i = 1, ..., d do
14:        if g_i > 0 then
15:            b_i ← 1 - x_i
16:        else
17:            b_i ← -x_i
18:        end if
19:    end for
20:    M_k ← Σ_{i ∈ S_k} |b_i|    {Compute L1-perturbation of moving k largest coordinates.}
21:    k* ← max{k | M_k ≤ ρ}    {Choose largest k satisfying L1 constraint.}
22:
23:    {Compute x̂ maximizing g⊤x̂ over the L1 ball.}
24:    for i = 1, ..., d do
25:        if i ∈ S_{k*} then
26:            x̂_i ← x_i + b_i
27:        else if i = s_{k*+1} then
28:            x̂_i ← x_i + (ρ - M_{k*}) sign(g_i)
29:        else
30:            x̂_i ← x_i
31:        end if
32:    end for
33:    x^(t) ← (1 - 1/t) x^(t-1) + (1/t) x̂    {Average x̂ with previous iterates}
34: end for
35: x̂ ← x^(T)
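For concreteness, the per-iteration update of Algorithm 1 (move the coordinates with the largest gradient magnitudes fully to the [0, 1] boundary until the L1 budget is spent, move the next coordinate partially, then average with the previous iterate) can be sketched in NumPy. This is a sketch, not the authors' implementation; the function name fw_l1_step and its signature are our own.

```python
import numpy as np

def fw_l1_step(x_prev, g, x0, rho, t):
    """One Frank-Wolfe step maximizing a linearized objective over the
    truncated L1 ball B1(rho; x0) ∩ [0, 1]^d (sketch of Algorithm 1).
    x_prev: previous iterate x^(t-1); g: gradient at x_prev;
    x0: clean input (ball center); rho: L1 radius; t: step index (1-based)."""
    # b_i: signed distance from x0 to the [0, 1] boundary in g's ascent direction.
    b = np.where(g > 0, 1.0 - x0, -x0)
    # Rank coordinates by gradient magnitude, largest first.
    order = np.argsort(-np.abs(g))
    cum = np.cumsum(np.abs(b[order]))
    # Largest k whose cumulative L1 cost M_k stays within the budget rho.
    k_star = int(np.searchsorted(cum, rho, side="right"))
    x_hat = x0.copy()
    full = order[:k_star]
    x_hat[full] = x0[full] + b[full]           # move fully to the boundary
    if k_star < len(order):                    # spend any leftover budget
        i = order[k_star]
        spent = cum[k_star - 1] if k_star > 0 else 0.0
        x_hat[i] = x0[i] + (rho - spent) * np.sign(g[i])
    # Average the vertex with the previous iterate (line 33 of Algorithm 1).
    return (1.0 - 1.0 / t) * x_prev + (1.0 / t) * x_hat
```

The averaging step keeps all iterates inside the feasible set, since each x̂ lies in the (convex) truncated L1 ball.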

F.2 CONVERGENCE

We replicated the results in Figure 6 with 50 instead of 200 steps to see how the results changed based on the number of steps in the attack. As shown in Figure 11, the deviations are minor.



& Boneh, 2019). We replace PGD with the Frank-Wolfe algorithm (Frank & Wolfe, 1956), which



Figure 3: Accuracies of L2 and Elastic attacks at different distortion sizes against a ResNet-50 model adversarially trained against L2 at ε = 9600 on ImageNet-100. At small distortion sizes, the model appears to defend well against Elastic, but large distortion sizes reveal that robustness does not transfer from L2 to Elastic.

Figure 4: UAR for adversarially trained defenses (rows) against attacks (columns) on ImageNet-100. Defenses from L∞ to Gabor were trained with ε = 32, 4.8k, 612k, 2, 16, 8192, 8, and 1.6k.


Figure 6: Accuracy of adversarial attack (column) against adversarially trained model (row) on ImageNet-100.



Figure 7: UAR scores for adv. trained defenses (rows) against distortion types (columns) for ImageNet-100.

Figure 8: Accuracy of adversarial attack (column) against adversarially trained model (row) on CIFAR-10.



Figure 9: UAR scores on CIFAR-10. Displayed UAR scores are multiplied by 100 for clarity.

Figure 10: Replica of the first three block rows of Figure 6 with different random seeds. Deviations in results are minor.

Clean Accuracy, UAR, and mUAR scores for models adversarially trained against L∞ and L2 attacks. L∞ training, the most popular defense, provides less robustness than L2 training. Comparing the highest mUAR achieved to individual UAR values in Figure 4 indicates a large robustness gap.

Clean Accuracy, UAR, and mUAR scores for models jointly trained against (L∞, L2). Joint training does not provide much additional robustness.

vanilla ResNeXt baseline. Finally, Hendrycks et al. (2020) create AugMix, which randomly mixes stochastically generated augmentations. Although AugMix uses neither random noise nor adversarial noise, it improves robustness to unforeseen attacks by 10%. These results imply that defenses not relying on adversarial examples can improve ImageNet-UA performance. They also indicate that training on more data only somewhat increases robustness on ImageNet-UA, unlike on many other robustness benchmarks.
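AugMix-style mixing can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name augmix and the augmentations argument are our assumptions, while the Dirichlet weights over augmentation chains and the Beta-weighted skip connection follow the published description of AugMix.

```python
import numpy as np

def augmix(image, augmentations, width=3, depth=2, alpha=1.0, rng=None):
    """AugMix-style augmentation (sketch): convex-combine `width` randomly
    composed augmentation chains, then mix with the original image.
    `augmentations` is a list of functions mapping an image array to an
    image array of the same shape (a hypothetical interface)."""
    if rng is None:
        rng = np.random.default_rng()
    w = rng.dirichlet([alpha] * width)   # weights over augmentation chains
    m = rng.beta(alpha, alpha)           # skip-connection weight to the original
    mixed = np.zeros_like(image, dtype=float)
    for i in range(width):
        aug = image.astype(float)
        for _ in range(depth):           # compose a random chain of ops
            op = augmentations[rng.integers(len(augmentations))]
            aug = op(aug)
        mixed += w[i] * aug
    return m * image.astype(float) + (1.0 - m) * mixed
```

The full method additionally trains with a Jensen-Shannon consistency loss between predictions on the clean and mixed images; the sketch above covers only the data-augmentation step.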

Table 7 shows joint training against Calibrated distortion sizes and ATA values for different distortion types on ImageNet-100.

Calibrated distortion sizes and ATA values for ResNet-56 on CIFAR-10.

UAR scores for L1-trained models and (L∞, L1)-jointly trained models. At each distortion size, L1 training performs better than joint training.



