EXPLOITING VERIFIED NEURAL NETWORKS VIA FLOATING POINT NUMERICAL ERROR

Abstract

Motivated by the need to reliably characterize the robustness of deep neural networks, researchers have developed verification algorithms for deep neural networks. Given a neural network, a verifier aims to answer whether certain properties are guaranteed with respect to all inputs in a space. However, little attention has been paid to floating point numerical error in neural network verification. We exploit floating point errors in the inference and verification implementations to construct adversarial examples for neural networks that a verifier claims to be robust with respect to certain inputs. We argue that, to produce sound verification results, any verification system must accurately (or conservatively) model the effects of any floating point computations in the network inference and verification systems.

1. INTRODUCTION

Deep neural networks (DNNs) are known to be vulnerable to adversarial inputs (Szegedy et al., 2014): images, audio, or text indistinguishable from natural inputs to human perception that cause a DNN to give substantially different results. This situation has motivated the development of network verification algorithms that claim to prove the robustness of a network (Bunel et al., 2020; Tjeng et al., 2019; Salman et al., 2019), specifically that the network produces identical classifications for all inputs in a perturbation space around a given input. Verification algorithms typically reason about the behavior of the network assuming real-valued arithmetic. In practice, however, the computation of both the verifier and the neural network is performed on physical computers that use floating point numbers and floating point arithmetic to approximate the underlying real-valued computations. This use of floating point introduces numerical error that can potentially invalidate the guarantees that the verifiers claim to provide. Moreover, the existence of multiple software and hardware systems for DNN inference further complicates the situation, because different implementations exhibit different numerical error characteristics. We present concrete instances where numerical error leads to unsound verification of real-valued networks. Specifically, we train robust networks on the MNIST and CIFAR10 datasets. We work with the MIPVerify complete verifier (Tjeng et al., 2019) and several inference implementations included in the PyTorch (Paszke et al., 2019) framework. For each implementation, we construct image pairs (x_0, x_adv), where x_0 is a brightness-modified natural image, such that the implementation classifies x_adv differently from x_0, x_adv falls in an ℓ∞-bounded perturbation space around x_0, and the verifier incorrectly claims that no such adversarial image x_adv exists for x_0 within the perturbation space.
Moreover, we show that the incomplete verifier CROWN is also vulnerable to floating point error. Our method of constructing adversarial images is not limited to our setting, and it is applicable to other verifiers that do not soundly model floating point arithmetic.

2. BACKGROUND AND RELATED WORK

Training robust networks: Researchers have developed various techniques to train robust networks (Madry et al., 2018; Mirman et al., 2018; Tramer & Boneh, 2019; Wong et al., 2020) . Madry et al. formulate the robust training problem as minimizing the worst loss within the input perturbation and propose to train robust networks on the data generated by the Projected Gradient Descent (PGD) adversary (Madry et al., 2018) . In this work we consider robust networks trained with the PGD adversary.
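The PGD adversary referenced above can be sketched in a few lines. This is a minimal illustration on a flattened input in [0, 1], not the training code used in this work; `grad_loss` is a hypothetical stand-in for the gradient a framework such as PyTorch would compute via automatic differentiation.

```python
# Minimal sketch of the Projected Gradient Descent (PGD) adversary of
# Madry et al. (2018). `grad_loss` is a hypothetical callable returning the
# gradient of the loss with respect to the input.

def pgd_attack(x0, grad_loss, eps=0.1, step=0.02, iters=10):
    """Return a candidate adversarial input inside the l-inf ball around x0."""
    x = list(x0)
    for _ in range(iters):
        g = grad_loss(x)
        # Ascend the loss along the sign of the gradient.
        x = [xi + step * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, g)]
        # Project back onto the l-inf ball around x0 ...
        x = [max(min(xi, x0i + eps), x0i - eps) for xi, x0i in zip(x, x0)]
        # ... and onto the valid image range [0, 1].
        x = [max(min(xi, 1.0), 0.0) for xi in x]
    return x
```

Training on such worst-case perturbed inputs is what approximately minimizes the worst loss within the perturbation set.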

Complete verification:

The goal of complete verification (a.k.a. exact verification) methods is to either prove the property being verified or provide a counterexample to disprove it. Complete verification approaches have formulated the verification problem as a Satisfiability Modulo Theories (SMT) problem (Scheibler et al., 2015; Huang et al., 2017; Katz et al., 2017; Ehlers, 2017; Bunel et al., 2020) or as a Mixed Integer Linear Programming (MILP) problem (Lomuscio & Maganti, 2017; Cheng et al., 2017; Fischetti & Jo, 2018; Dutta et al., 2018; Tjeng et al., 2019). While SMT solvers are able to model exact floating point arithmetic (Rümmer & Wahl, 2010) or exact real arithmetic (Corzilius et al., 2012), deployed SMT solvers for verifying neural networks all use inexact floating point arithmetic to reason about the neural network inference, for efficiency reasons. MILP solvers work directly with floating point, do not attempt to exactly model real arithmetic, and therefore exhibit numerical error. Since floating point arithmetic is not associative, different neural network implementations may produce different results for the same neural network, implying that any sound verifier for this class of networks must reason about the specific floating point error characteristics of the neural network implementation at hand. To the best of our knowledge, no prior work formally recognizes the problem of floating point error in neural network complete verification or exploits floating point error to invalidate verification results.

Incomplete verification: On the spectrum of the tradeoff between completeness and scalability, incomplete methods (a.k.a. certification methods) aspire to deliver more scalable verification by adopting over-approximation, while admitting the inability to either prove or disprove the properties in certain cases.
There is a large body of related research (Wong & Kolter, 2017; Weng et al., 2018; Gehr et al., 2018; Zhang et al., 2018; Raghunathan et al., 2018; Dvijotham et al., 2018; Mirman et al., 2018; Singh et al., 2019). Salman et al. (2019) have unified most of the relaxation methods under a common convex relaxation framework. Their results suggest that there is an inherent barrier to tight verification via the layer-wise convex relaxations captured by their framework. We highlight that the floating point error of implementations that use a direct dot product formulation has been accounted for in some certification frameworks (Singh et al., 2018; 2019) by maintaining upper and lower rounding bounds for sound floating point arithmetic (Miné, 2004). Such frameworks should be extensible to model numerical error in more sophisticated implementations like the Winograd convolution (Lavin & Gray, 2016), but the effectiveness of this extension remains to be studied. Most certification algorithms, however, have not considered floating point error and may be vulnerable to attacks that exploit this deficiency.

Floating point arithmetic: Floating point is widely adopted as an approximate representation of real numbers in digital computers. After each calculation, the result is rounded to the nearest representable value, which induces roundoff error. In the field of neural networks, the SMT-based verifier Reluplex (Katz et al., 2017) has been observed to produce false adversarial examples due to floating point error (Wang et al., 2018), and the MILP-based verifier MIPVerify (Tjeng et al., 2019) has been observed to give NaN results when verifying pruned neural networks (Guidotti et al., 2020). Such floating point unsoundness has so far surfaced only incidentally while running large scale benchmarks; no prior work systematically invalidates neural network verification results by exploiting floating point error.
The IEEE-754 (IEEE, 2008) standard defines the semantics of floating point operations and their correct rounding behavior. Even on an IEEE-754 compliant implementation, computing floating point expressions consisting of multiple steps that are equivalent in the real domain may result in different final roundoff error, because rounding is performed after each step, which complicates the error analysis. Research on estimating floating point roundoff error and verifying floating point programs has a long history and is actively growing (Boldo & Melquiond, 2017), but we are unaware of any attempt to apply these tools to obtain a sound verifier for a neural network inference implementation. Any such verifier must reason soundly about floating point error in both the verifier and the neural network inference algorithm. The failure to account for floating point error in software systems has caused real-world disasters. For example, in 1991, a Patriot missile missed its target, leading to casualties, due to floating point roundoff error in a time calculation (Skeel, 1992).
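As a concrete instance of the step-by-step rounding described above, floating point addition is not associative even in double precision: two evaluation orders that are equal over the reals round differently.

```python
# Rounding after every operation makes floating point addition non-associative.
left = (0.1 + 0.2) + 0.3   # 0.1 + 0.2 rounds to 0.30000000000000004 first
right = 0.1 + (0.2 + 0.3)  # 0.2 + 0.3 rounds to exactly 0.5 first
assert left != right       # 0.6000000000000001 vs. 0.6
```

A verifier that assumes one summation order while the inference implementation uses another is already reasoning about a slightly different function.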

3.1. ADVERSARIAL ROBUSTNESS OF NEURAL NETWORKS

We consider 2D image classification problems. Let y = NN(x; W) denote the classification confidence given by a neural network with weight parameters W for an input x, where x ∈ [0, 1]^(m×n×c) is an image with m rows and n columns of pixels, each containing c color channels represented by floating point values in the range [0, 1], and y ∈ R^k is a logits vector containing the classification scores for each of the k classes. The class with the highest score is the classification result of the neural network. For a logits vector y and a target class number t, we define the Carlini-Wagner (CW) loss (Carlini & Wagner, 2017) as the score of the target class minus the maximal score of the other classes:

    L_CW(y, t) = y_t − max_{i≠t} y_i    (1)

Note that x is classified as an instance of class t if and only if L_CW(NN(x; W), t) > 0, assuming no two classes have equal scores. Adversarial robustness of a neural network is defined for an input x_0 and a perturbation bound ε, such that the classification result is stable within allowed perturbations:

    ∀x ∈ Adv_ε(x_0): L_CW(NN(x; W), t_0) > 0    (2)

where t_0 = argmax NN(x_0; W). In this work we focus on ℓ∞-norm bounded perturbations:

    Adv_ε(x_0) = {x : ||x − x_0||_∞ ≤ ε ∧ min x ≥ 0 ∧ max x ≤ 1}    (3)

3.2. FINDING ADVERSARIAL EXAMPLES FOR VERIFIED NETWORKS VIA EXPLOITING NUMERICAL ERROR

Due to the inevitable presence of numerical error in both the network inference system and the verifier, the exact specification of NN(·; W) (i.e., a bit-level accurate description of the underlying computation) is not clearly defined in (2). We consider five implementations of convolutional layers included in the PyTorch framework, NN_{C,M}, NN_{C,C}, NN_{G,M}, NN_{G,C}, and NN_{G,CWG} (described in the list accompanying Figure 1), to serve as our candidate definitions of the convolutional layers in NN(·; W); all other layers use the default PyTorch implementation. Among these, the Winograd-based implementation (Lavin & Gray, 2016) has much higher numerical error than the others.
For a given implementation NN_impl(·; W), our method finds pairs (x_0, x_adv) represented as single precision floating point numbers such that:

1. x_0 and x_adv are in the dynamic range of images: min x_0 ≥ 0, min x_adv ≥ 0, max x_0 ≤ 1, and max x_adv ≤ 1.

2. x_adv falls in the perturbation space of x_0: ||x_adv − x_0||_∞ ≤ ε.

3. The verifier claims that (2) holds for x_0.

4. x_adv is an adversarial image for the implementation: L_CW(NN_impl(x_adv; W), t_0) < 0.

Note that the first two conditions are accurately defined for any implementation compliant with the IEEE-754 standard, because the computation involves only element-wise subtraction and max-reduction, which incur no accumulated error. The Gurobi (Gurobi Optimization, 2020) solver used by MIPVerify operates internally in double precision. Therefore, to ensure that our adversarial examples satisfy the constraints considered by the solver, we also require that the first two conditions hold for x'_adv = float64(x_adv) and x'_0 = float64(x_0), the double precision representations of x_adv and x_0.
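The CW loss of Eq. (1) and the first two conditions above can be written out directly; the sketch below operates on flattened Python lists for illustration only.

```python
# Sketch of the Carlini-Wagner loss (Eq. 1) and the membership test for the
# perturbation space Adv_eps(x0) (Eq. 3), on inputs flattened to lists.

def cw_loss(logits, t):
    """Score of target class t minus the best score among the other classes."""
    others = [s for i, s in enumerate(logits) if i != t]
    return logits[t] - max(others)

def in_perturbation_space(x_adv, x0, eps):
    """Check x_adv in Adv_eps(x0): l-inf distance and the [0, 1] dynamic range."""
    if any(xi < 0.0 or xi > 1.0 for xi in x0 + x_adv):
        return False
    return max(abs(a - b) for a, b in zip(x_adv, x0)) <= eps
```

An input is classified as class t exactly when `cw_loss` is positive, which is why condition 4 above flips its sign to certify a misclassification.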

3.3. MILP FORMULATION FOR COMPLETE VERIFICATION

We adopt the small CNN architecture from Xiao et al. (2019) and the MIPVerify complete verifier of Tjeng et al. (2019) to demonstrate our attack method. Our method can also be deployed against other complete verifiers, as long as the property being verified involves thresholding continuous variables whose floating point arithmetic is not exactly modeled in the verification process. The MIPVerify verifier formulates the verification problem as an MILP problem for networks composed of linear transformations and piecewise-linear functions (Tjeng et al., 2019). An MILP problem optimizes a linear objective function subject to linear equality and inequality constraints over a set of variables, where some variables take real values while others are restricted to be integers. The MILP formulation of the robustness of a neural network involves three parts: introducing a free variable x for the adversarial input subject to the constraint x ∈ Adv_ε(x_0), encoding the computation y = NN(x; W), and encoding the attack goal L_CW(NN(x; W), t_0) ≤ 0. The network is robust with respect to x_0 if the MILP problem is infeasible; otherwise x serves as an adversarial image. The MILP problem typically optimizes one of two objective functions: (i) min ||x − x_0||_∞, to find an adversarial image closest to x_0, or (ii) min L_CW(NN(x; W), t_0), to find an adversarial image that causes the network to produce a different prediction with the highest confidence. Note that although the above constraints and objective functions are nonlinear, most modern MILP solvers can handle them by automatically introducing the auxiliary decision variables needed to convert them into linear forms.
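The way a ReLU is encoded with linear constraints over a binary indicator in MIPVerify-style formulations (Tjeng et al., 2019) can be illustrated without invoking a solver, by checking feasibility directly. The bounds and sample values below are made up for illustration.

```python
def relu_milp_feasible(x, y, a, l, u, tol=1e-9):
    """Check the four linear constraints encoding y = max(x, 0), given
    pre-activation bounds l <= x <= u with l < 0 < u and binary indicator a."""
    assert a in (0, 1) and l < 0 < u
    return (y >= -tol and                    # y >= 0
            y >= x - tol and                 # y >= x
            y <= u * a + tol and             # y = 0 when the unit is inactive
            y <= x - l * (1 - a) + tol)      # y = x when the unit is active

# For any x in [l, u], choosing a = (x >= 0) makes y = relu(x) feasible.
l, u = -2.0, 3.0
for x in [-1.5, -0.1, 0.0, 0.2, 2.9]:
    y = max(x, 0.0)
    a = 1 if x >= 0 else 0
    assert relu_milp_feasible(x, y, a, l, u)
```

The pre-activation bounds l and u come from a preliminary bound computation; tightening them is what makes these formulations tractable in practice.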

4.1. EMPIRICAL CHARACTERIZATION OF IMPLEMENTATION NUMERICAL ERROR

To guide the design of our attack algorithm, we present statistics about the numerical error of different implementations. To investigate end-to-end error behavior, we select an image x and plot in Figure 1a ||NN(x + δ; W) − NN(x; W)||_∞ against −10^−6 ≤ δ ≤ 10^−6, where the addition x + δ is applied only to the single input element that has the largest gradient magnitude. To minimize the effect of numerical instability due to nonlinearity in the network and focus on fluctuations caused by numerical error, the image x is chosen to be the first MNIST test image on which the network produces a verified robust prediction. We have also checked that the pre-activation values of all the ReLU units do not switch sign. We observe that the change of the logits vector is highly nonlinear with respect to the change of the input, and a small perturbation can result in a large fluctuation. The WINOGRAD_NONFUSED algorithm on NVIDIA GPUs is much more unstable, and its variation is two orders of magnitude larger than that of the others. We also evaluate all of the implementations on the whole MNIST test set, compare the outputs of the first layer (i.e., with only one linear transformation applied to the input) against those of NN_{C,M}, and present the histogram in Figure 1b. It is clear that different implementations usually manifest different error behavior, and again NN_{G,CWG} induces much higher numerical error than the others. These observations inspire us to construct adversarial images for each implementation independently by applying small random perturbations to an image close to the robustness decision boundary. We present the details of our method in Section 4.2.

(a) Change of logits vector due to small single-element input perturbations for different implementations. The dashed lines are y = |δ|. This plot shows that the change of output is nonlinear with respect to input changes, and the magnitude of output changes is usually larger than that of input changes. The changes are due to floating point error rather than network nonlinearity, because none of the pre-activation values of the ReLU units switch sign.
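A toy version of the kind of measurement behind Figure 1a can be built in pure Python by forcing every intermediate value to single precision with `ctypes.c_float`. The weights and inputs below are invented for illustration; the real experiment runs the full network on actual implementations.

```python
import ctypes

def f32(x):
    """Round a Python float to the nearest IEEE-754 single precision value."""
    return ctypes.c_float(x).value

def dot_f32(w, x):
    """Dot product with float32 rounding after every multiply and add,
    mimicking a single-precision accumulation order."""
    acc = 0.0
    for wi, xi in zip(w, x):
        acc = f32(acc + f32(wi * xi))
    return acc

# Two accumulation orders that are equal over the reals; in single precision
# their results can differ at the level of the last few bits.
w = [0.1, -0.7, 1e-4, 0.3]
x = [0.9, 0.2, 0.8, 0.5]
fwd = dot_f32(w, x)
rev = dot_f32(list(reversed(w)), list(reversed(x)))
```

Comparing `fwd` and `rev` across orders (and across implementations) is, in miniature, what the first-layer histogram in Figure 1b measures.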

4.2. CONSTRUCTING ADVERSARIAL EXAMPLES

Given a network with weights NN(·; W), there exist image pairs (x_0, x_1) such that the network is verifiably robust with respect to x_0, while x_1 ∈ Adv_ε(x_0) and L_CW(NN(x_1; W), t_0) is smaller than the numerical fluctuation introduced by tiny input perturbations. We call x_0 a quasi-safe image and x_1 the corresponding quasi-adversarial image. We then apply small random perturbations to the quasi-adversarial image to obtain an adversarial image. The process is illustrated in Figure 2. We give a more formal and detailed description in the following proposition:

Proposition 1. Let E > 0 be an arbitrarily small positive number. If a continuous neural network NN(·; W) produces a verifiably robust classification for some input of class t, and it does not constantly classify all inputs as class t, then there exists an input x_0 such that

    0 < min_{x ∈ Adv_ε(x_0)} L_CW(NN(x; W), t) < E

Let x_1 = argmin_{x ∈ Adv_ε(x_0)} L_CW(NN(x; W), t) be the minimizer of the above function. We call x_0 a quasi-safe image and x_1 a quasi-adversarial image.

Proof. Let f(x) := min_{x' ∈ Adv_ε(x)} L_CW(NN(x'; W), t). Since f(·) is composed of continuous functions, f(·) is continuous. Suppose NN(·; W) is verifiably robust with respect to some x_+ that belongs to class t. Let x_− be any input such that L_CW(NN(x_−; W), t) < 0, which exists because NN(·; W) does not constantly classify all inputs as class t. We have f(x_+) > 0 and, since x_− ∈ Adv_ε(x_−), f(x_−) ≤ L_CW(NN(x_−; W), t) < 0; therefore an x_0 with 0 < f(x_0) < E exists by continuity.

Our method works by choosing E to be a number smaller than the average fluctuation of the logits vector introduced by tiny input perturbations, as indicated in Figure 1a, and finding a quasi-safe image by adjusting the brightness of a natural image. An adversarial image is then likely to be obtained by applying random perturbations to the corresponding quasi-adversarial image.
Given a particular implementation NN_impl(·; W) and a natural image x_seed that the network robustly classifies as class t_0 according to the verifier, we construct an adversarial input pair (x_0, x_adv) that meets the constraints described in Section 3.2 in three steps:

1. We search for a coefficient α ∈ [0, 1] such that x_0 = αx_seed serves as the quasi-safe image. Specifically, we require the verifier to claim that the network is robust for αx_seed but not for (α − δ)x_seed, with δ a small positive value. Although the function is not guaranteed to be monotone, we can still use a binary search to find α while minimizing δ, because we only need one such value. However, we observe that in many cases the MILP solver becomes extremely slow for small δ values, so we start with a binary search and switch to a grid search if the solver exceeds a time limit. We set the target of δ to be 1e-7 in our experiments and divide the best known δ into 16 intervals when a grid search is needed.

2. We search for the quasi-adversarial image x_1 corresponding to x_0. We define a loss function with a tolerance of τ as L(x, τ; W, t_0) := L_CW(NN(x; W), t_0) − τ, which can be incorporated in any verifier by modifying the bias of the Softmax layer. We aim to find τ_0, the minimal confidence over all images in the perturbation space of x_0, and τ_1, slightly larger than τ_0, with x_1 being the corresponding adversarial image:

    ∀x ∈ Adv_ε(x_0): L(x, τ_0; W, t_0) > 0
    x_1 ∈ Adv_ε(x_0)
    L(x_1, τ_1; W, t_0) < 0
    τ_1 − τ_0 < 1e-7

Note that x_1 is produced by the complete verifier as a proof of nonrobustness given the tolerance τ_1. The above values are found via a binary search initialized with τ_0 ← 0 and τ_1 ← τ_max, where τ_max := L_CW(NN(x_0; W), t_0).
If the verifier is able to compute the worst-case objective τ_w = min_{x ∈ Adv_ε(x_0)} L_CW(NN(x; W), t_0), the binary search can be accelerated by initializing τ_0 ← τ_w − δ_s and τ_1 ← τ_w + δ_s. We empirically set δ_s = 3e-6 to accommodate the numerical error of the verifier, so that L(x_0, τ_w − δ_s; W, t_0) > 0 and L(x_0, τ_w + δ_s; W, t_0) < 0. The binary search is aborted if the solver times out.

3. We minimize L_CW(NN_impl(x_1; W), t_0) with hill climbing, applying small random perturbations to the quasi-adversarial image x_1 while projecting back into Adv_ε(x_0), to find an adversarial example. The perturbations are applied to patches of x_1, as described in Appendix A. The random perturbations are on the scale of 2e-7, corresponding to the scale of input perturbations that cause output fluctuations in Figure 1a.
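Step 3 above can be sketched as follows. For simplicity, this version perturbs one element at a time rather than using the patch scheme of Appendix A, and `loss` is a hypothetical callable standing for L_CW(NN_impl(·; W), t_0).

```python
import random

def hill_climb(x1, x0, eps, loss, iters=1000, u=2e-7):
    """Random-perturbation hill climbing from the quasi-adversarial image x1,
    staying inside Adv_eps(x0); returns an adversarial input or None."""
    best, best_loss = list(x1), loss(x1)
    for _ in range(iters):
        cand = list(best)
        i = random.randrange(len(cand))
        cand[i] += random.uniform(-u, u)
        # Project to the l-inf ball around x0 and the [0, 1] image range.
        cand[i] = min(max(cand[i], x0[i] - eps, 0.0), x0[i] + eps, 1.0)
        cand_loss = loss(cand)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
        if best_loss < 0:        # CW loss below zero: misclassification found
            return best
    return None
```

Because the quasi-adversarial image sits within a numerical-noise distance of the decision boundary, perturbations at the scale of the observed floating point fluctuations suffice to cross it.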

4.3. EXPERIMENTS

We conduct our experiments on a workstation equipped with two GPUs (an NVIDIA Titan RTX and an NVIDIA GeForce RTX 2070 SUPER), 128 GiB of RAM, and an AMD Ryzen Threadripper 2970WX 24-core processor. We train the small architecture from Xiao et al. (2019) with the PGD adversary and the RS Loss on the MNIST and CIFAR10 datasets. The trained networks achieve 94.63% and 44.73% provable robustness with perturbations of ℓ∞ norm bounded by 0.1 and 2/255 on the two datasets respectively, similar to the results reported in Xiao et al. (2019). Our code will be made publicly available after the review process. Although our method needs only a number of verifier invocations logarithmic in the binary search gap, the verifier is too slow to run a large benchmark in a reasonable time. Therefore, for each dataset we test our method on 32 images randomly sampled from the verifiably robustly classified test images, with a time limit of 360 seconds for MILP solving. Out of these 32 images, we successfully find quasi-adversarial images (x_1 from Step 2 of Section 4.2; the failed cases are solver timeouts) for 18 images on MNIST and 26 images on CIFAR10. We apply random perturbations to these quasi-adversarial images to obtain adversarial images within the perturbation range of the quasi-safe image (x_0 = αx_seed from Step 1 of Section 4.2). All the implementations that we consider are successfully attacked. We present the detailed numbers in Table 1. We also present in Figure 3 the quasi-safe images on which our attack method succeeds for all implementations, together with the corresponding adversarial images.

Figure 3: The quasi-safe images with respect to which all implementations are successfully attacked, and the corresponding adversarial images.

5. EXPLOITING AN INCOMPLETE VERIFIER

The relaxation adopted in certification methods renders them incomplete but also makes their verification claims more robust to floating point error than those of complete verifiers. In particular, we evaluate the CROWN framework (Zhang et al., 2018) on our randomly selected test images and the corresponding quasi-safe images from Section 4.3. CROWN is able to verify the robustness of the network on 29 of the 32 original test images, but it is unable to prove robustness for any of the quasi-safe images. Note that MIPVerify claims that the network is robust with respect to all the original test images and the corresponding quasi-safe images. Nevertheless, we now demonstrate that incomplete verifiers are still prone to floating point error. We build a neural network that takes a 13 × 13 single-channel input image, followed by a 5 × 5 convolutional layer with a single output channel, two fully connected layers with 16 output neurons each, a fully connected layer with one output neuron denoted u = max(W_u h_u + b_u, 0), and a final linear layer that computes y = [u, 1e-7] as the logits vector. All the hidden layers have ReLU activations. The input x_0 is drawn from a Gaussian distribution. The hidden layers have random Gaussian coefficients, and the biases are chosen so that (i) the ReLU neurons before u are always activated for inputs in the perturbation space of x_0, (ii) u = 0 always holds for these inputs, and (iii) b_u is maximized with all other parameters fixed. CROWN is able to prove that all ReLU neurons before u are always activated but u is never activated, and it therefore claims that the network is robust with respect to perturbations around x_0.
However, by initializing the quasi-adversarial input as x_1 ← x_0 + ε · sign(W_equiv), where W_equiv is the product of the coefficient matrices of all the layers up to u, we successfully find adversarial inputs for all five implementations considered in this work by randomly perturbing x_1 in a way similar to Step 3 of Section 4.2.
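The real-valued bound reasoning our construction targets can be illustrated with plain interval arithmetic (a coarser relaxation than CROWN's linear bounds). The weights below are made up; the bias is shifted so that the verifier's upper bound on the pre-activation is exactly 0, mirroring conditions (ii) and (iii) above, so a bound-propagation verifier concludes the neuron is never activated even though floating point evaluation can land just above 0.

```python
def affine_bounds(W, b, lb, ub):
    """Interval propagation through y = W x + b, given elementwise lb <= x <= ub."""
    lo, hi = [], []
    for wi, bi in zip(W, b):
        lo.append(bi + sum(w * (lb[j] if w > 0 else ub[j]) for j, w in enumerate(wi)))
        hi.append(bi + sum(w * (ub[j] if w > 0 else lb[j]) for j, w in enumerate(wi)))
    return lo, hi

# A single pre-activation w.x + b over an l-inf ball around x0.
W, x0, eps = [[0.25, -0.5]], [0.4, 0.6], 0.01
lb = [xi - eps for xi in x0]
ub = [xi + eps for xi in x0]
_, hi0 = affine_bounds(W, [0.0], lb, ub)   # upper bound with zero bias
b = [-hi0[0]]                              # shift the bias so the bound is exactly 0
lo, hi = affine_bounds(W, b, lb, ub)       # now hi[0] == 0: "never activated"
```

In exact real arithmetic the conclusion is sound; in the deployed implementations, rounding in the convolution can push the computed pre-activation marginally positive, flipping the logits comparison against the tiny constant logit.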

6. DISCUSSION

We agree with the security expert Window Snyder: "One single vulnerability is all an attacker needs." Unfortunately, most previous work on neural network verification abstains from discussing possible vulnerabilities in its methods. We have demonstrated that neural network verifiers, although meant to provide security guarantees, are systematically exploitable. The underlying tradeoff between soundness and scalability in the verification of floating point programs is fundamental but has not received enough attention in the neural network verification literature. One appealing remedy is to introduce floating point error relaxations into complete verifiers, such as verifying for a larger perturbation bound ε or setting a threshold on the accepted confidence score. However, a tight and sound relaxation is extremely challenging to find, and we are unaware of any prior attempt to formally prove error bounds for practical, accelerated neural network implementations or verifiers. Some incomplete verifiers have incorporated floating point error by maintaining upper and lower rounding bounds on internal computations (Singh et al., 2018; 2019), an approach that is also potentially applicable to complete verifiers. However, it relies on the specific implementation details of the inference algorithm: optimizations such as Winograd (Lavin & Gray, 2016) or FFT (Abtahi et al., 2018) convolution would either invalidate the robustness guarantees or require changes to the analysis algorithm. Another approach is to quantize the computation to align the inference implementation with the verifier. For example, if we require all activations to be multiples of s_0 and all weights to be multiples of s_1, where s_0·s_1 > 2E and E is a very loose bound on the possible implementation error, then the output can be rounded to multiples of s_0·s_1 to completely eliminate numerical error.
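The quantization remedy sketched above can be checked numerically; s_0, s_1, and the error magnitude below are illustrative, not values from our experiments.

```python
# If activations are multiples of s0 and weights are multiples of s1, every
# exact pre-activation is a multiple of s0*s1. Snapping a noisy computed value
# to the nearest grid point recovers the exact result whenever the
# implementation error E is below s0*s1/2.

def snap(y, grid):
    """Round y to the nearest multiple of grid."""
    return round(y / grid) * grid

s0, s1 = 1.0 / 16, 1.0 / 16
grid = s0 * s1                   # 1/256
exact = 37 * grid                # some exact multiple of the grid
noisy = exact + 1e-4             # simulated implementation error, E < grid/2
assert snap(noisy, grid) == exact
```

Choosing power-of-two grids keeps the snapped values exactly representable in floating point, so the verifier and every compliant implementation agree bit for bit.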
Binarized neural networks (Hubara et al., 2016) are a family of extremely quantized networks, and their verification (Narodytska et al., 2018; Shih et al., 2019) is sound and complete. However, the problem of robust training and verification of quantized neural networks (Jia & Rinard, 2020) is relatively under-examined compared to that of real-valued neural networks (Madry et al., 2018; Mirman et al., 2018; Tjeng et al., 2019; Xiao et al., 2019) .

7. CONCLUSION

Floating point error should not be overlooked in the verification of real-valued neural networks, as we have presented techniques that construct adversarial examples for neural networks claimed to be robust by a verifier. We hope our results will help to guide future neural network verification research by providing another perspective for the tradeoff between soundness, completeness, and scalability.



• NN_{C,M}(·; W): A matrix multiplication based implementation on x86/64 CPUs. The convolution kernel is copied into a matrix that describes the dot product to be applied to the flattened input for each output value.
• NN_{C,C}(·; W): The default convolution implementation on x86/64 CPUs.
• NN_{G,M}(·; W): A matrix multiplication based implementation on NVIDIA GPUs.
• NN_{G,C}(·; W): A convolution implementation using the IMPLICIT_GEMM algorithm from the cuDNN library (Chetlur et al., 2014) on NVIDIA GPUs.
• NN_{G,CWG}(·; W): A convolution implementation using the WINOGRAD_NONFUSED algorithm from the cuDNN library (Chetlur et al., 2014) on NVIDIA GPUs. It is based on the Winograd fast convolution algorithm (Lavin & Gray, 2016).

(b) Distribution of the difference of first-layer outputs relative to NN_{C,M}, evaluated on MNIST test images. This plot shows that different implementations usually exhibit different floating point error characteristics.

Figure 1: Empirical characterization of numerical error of different implementations

Figure 2: Illustration of our method. Since the verifier does not model the floating point arithmetic details of the implementation, their decision boundaries for the classification problem diverge, which allows us to find adversarial inputs by crossing the boundary via numerical error fluctuations. Note that the verifier usually does not comply with a well defined specification of NN (•; W ), and therefore it does not define a decision boundary. The dashed boundary in the diagram is just for illustrative purposes.

Table 1: Number of successful adversarial attacks for different neural network implementations. The number of quasi-adversarial images in the first column corresponds to the cases where the solver does not time out at the initialization step. For each implementation, we try to find adversarial images by applying random perturbations to each quasi-adversarial image and report the number of successfully found adversarial images here. The columns are: #quasi-adv / #tested, NN_{C,M}, NN_{C,C}, NN_{G,M}, NN_{G,C}, NN_{G,CWG}.

A RANDOM PERTURBATION ALGORITHM

We present the details of our random perturbation algorithm below. Note that the Winograd convolution computes a whole output patch in one iteration, so we handle it separately in the algorithm.

Input: quasi-safe image x_0
Input: target class number t
Input: quasi-adversarial image x_1
Input: input perturbation bound ε
Input: a neural network inference implementation NN_impl(·; W)
Input: number of iterations N (default value 1000)
Input: perturbation scale u (default value 2e-7)
Output: an adversarial image x_adv, or FAILED

for each index i of x_0 do
    Find the weakest bounds x_l and x_u for the allowed perturbations
end for
if NN_impl uses the Winograd algorithm then
    (offset, stride) ← (…)    ▷ The Winograd algorithm in cuDNN produces 9 × 9 output tiles for 13 × 13 input tiles and 5 × 5 kernels; the offset and stride ensure that perturbed tiles contribute independently to the output.
else
    (offset, stride) ← (0, 4)    ▷ Work on small tiles so that random errors do not cancel each other.
end if
for i ← 1 to N do
    for (h, w) ← (0, 0) to (height(x_1), width(x_1)) step (stride, stride) do
        δ ← uniform(−u, u, (stride − offset, stride − offset))
        …
    end for
end for

