TOWARDS NATURAL ROBUSTNESS AGAINST ADVERSARIAL EXAMPLES

Abstract

Recent studies have shown that deep neural networks are vulnerable to adversarial examples, but most of the methods proposed to defend against adversarial examples cannot solve this problem fundamentally. In this paper, we theoretically prove that there is an upper bound for neural networks with identity mappings that constrains the error caused by adversarial noise. However, in actual computations, this kind of neural network no longer holds any upper bound and is therefore susceptible to adversarial examples. Following similar procedures, we explain why adversarial examples can fool other deep neural networks with skip connections. Furthermore, we demonstrate that a new family of deep neural networks called Neural ODEs (Chen et al., 2018) holds a weaker upper bound. This weaker upper bound prevents the amount of change in the result from becoming too large. Thus, Neural ODEs have natural robustness against adversarial examples. We evaluate the performance of Neural ODEs compared with ResNet under three white-box adversarial attacks (FGSM, PGD, DI²-FGSM) and one black-box adversarial attack (Boundary Attack). Finally, we show that the natural robustness of Neural ODEs is even better than the robustness of neural networks trained with adversarial training methods such as TRADES and YOPO.

1. INTRODUCTION

Deep neural networks have made great progress in numerous domains of machine learning, especially in computer vision. But Szegedy et al. (2013) found that most existing state-of-the-art neural networks are easily fooled by adversarial examples generated by adding only very small perturbations to the input images. Since realizing the instability of deep neural networks, researchers have proposed various methods to defend against adversarial examples, such as adversarial training (Goodfellow et al., 2014), data compression (Dziugaite et al., 2016), and distillation defense (Papernot et al., 2016). But each of these methods is a remedy for the original problem, and none of them solves it fundamentally. For example, Moosavi-Dezfooli et al. (2016) showed that no matter how many adversarial examples are added to the training set, there are new adversarial examples that can successfully attack the adversarially trained deep neural network. So, technical tricks for avoiding adversarial examples cannot answer the most essential question: why can such subtle changes in adversarial examples beat deep neural networks? Meanwhile, this leads to a more important question: how can deep neural networks attain natural robustness so that they are free of malicious adversarial examples? Early explanations for adversarial examples held that a smoothness prior is typically valid for kernel methods, i.e., imperceptibly tiny perturbations of a given image do not normally change the underlying class, while the smoothness assumption does not hold for deep neural networks due to their high non-linearity (Szegedy et al., 2013). This analysis applies to plain deep neural networks like AlexNet (Krizhevsky et al., 2012). Later, however, Goodfellow et al. (2014) claimed that adversarial examples are a result of models being too linear rather than too non-linear, and that they can be explained as a property of high-dimensional dot products.
Unfortunately, both of these explanations seem to imply that adversarial examples are inevitable for deep neural networks. On the other hand, we notice that skip connections have been widely used in deep neural networks since the appearance of Highway Networks (Srivastava et al., 2015) and ResNet (He et al., 2016). It turns out that the identity mapping in ResNet is formally equivalent to one step of Euler's method for solving ordinary differential equations (Weinan, 2017). Moreover, other kinds of skip connections used by different network architectures can be considered as different numerical methods for solving ordinary differential equations. The link between numerical ordinary differential equations and deep neural networks offers a whole new perspective for explaining adversarial examples through numerical stability analysis. In this paper, we attempt to utilize this natural property of neural networks to defend against adversarial examples. We first analyze how adversarial examples affect the output of neural networks with identity mappings, obtain an upper bound for this kind of neural network, and find that this upper bound is impractical in actual computations. In the same way, we figure out why adversarial examples can fool commonly used deep neural networks with skip connections. Then, we demonstrate that Neural ODEs hold a weaker upper bound and verify the natural robustness of Neural ODEs under four types of perturbations. Finally, we compare Neural ODEs with three types of adversarial training methods to show that the natural robustness of Neural ODEs is better than the robustness of neural networks trained with adversarial training.
The main contributions of our work are as follows:

• We introduce and formalize numerical stability analysis for deep neural networks with identity mappings, and prove that there is an upper bound for neural networks with identity mappings that constrains the error caused by adversarial noise.

2. RELATED WORK

Modifying neural networks by using autoencoders, input gradient regularization, and distillation can result in robustness against adversarial attacks (Bai et al., 2017; Ross & Doshi-Velez, 2017; Papernot et al., 2016). Besides, there are some biologically inspired deep learning models designed to have natural robustness against adversarial examples. Nayebi & Ganguli (2017) developed a scheme similar to nonlinear dendritic computation to train deep neural networks to make them robust to adversarial attacks. Krotov & Hopfield (2018) proposed Dense Associative Memory (DAM) models and suggested that DAMs with higher-order energy functions are closer to human visual perception than deep neural networks with ReLUs. In addition to augmenting datasets or modifying original neural networks, there exist adversarial defense methods that rely on external models and on detecting adversarial examples. Akhtar et al. (2018) presented the Perturbation Rectifying Network (PRN) as 'pre-input' layers to a targeted model; if a perturbation is detected, the output of the PRN is used for label prediction instead of the actual image. Xu et al. (2017) proposed a strategy called feature squeezing to reduce the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample.

3.1. RESNET AND EULER'S METHOD

The building block of ResNet is defined as

y_{n+1} = y_n + f(y_n; θ_n)    (1)

where n ∈ {0, ..., N(h) − 1}, and y_n and y_{n+1} are the input and output vectors of the layers considered. The function f(y_n; θ_n) represents the residual mapping and θ_n are the weights to be learned. In numerical methods for solving ordinary differential equations, Euler's method is defined as

y_{n+1} = y_n + h f(t_n, y_n; θ_n)    (2)

It can easily be seen that Eqn. (1) is a special case of Euler's method with step size h = 1. The iterative updates of ResNet can thus be seen as an Euler discretization of a continuous transformation (Lu et al., 2018; Haber & Ruthotto, 2017; Ruthotto & Haber, 2019). When the input data is perturbed by adversarial noise ε, we denote the adversarial example by

z_0 = y_0 + ε    (3)

To perform stability analysis for ResNet with the adversarial example z_0 as input, we define a numerical solution z_n by

z_{n+1} = z_n + h f(t_n, z_n; θ_n)    (4)

and provide the following theorem to show that the amount of change between y_n and z_n admits an upper bound under some assumptions.

Theorem 3.1. Let f(t, y; θ) be a continuous function for t_0 ≤ t ≤ b and −∞ < y < ∞, and further assume that f(t, y; θ) satisfies the Lipschitz condition. Then, comparing the two numerical solutions y_n and z_n as h → 0, there is a constant c ≥ 0 such that the amount of change between y_n and z_n satisfies

max_{0 ≤ n ≤ N(h)} |z_n − y_n| ≤ c|ε|    (5)

Proof. Let e_n = z_n − y_n, n ≥ 0. Then e_0 = ε, and subtracting Eqn. (2) from Eqn. (4), we obtain

e_{n+1} = e_n + h[f(t_n, z_n; θ_n) − f(t_n, y_n; θ_n)]    (6)

Assume that the derivative function f(t, y; θ) satisfies the following Lipschitz condition: there exists K ≥ 0 such that

|f(t, y_1; θ) − f(t, y_2; θ)| ≤ K|y_1 − y_2|    (7)

Taking bounds using Eqn. (7), we obtain

|e_{n+1}| ≤ |e_n| + hK|z_n − y_n|    (8)

|e_{n+1}| ≤ (1 + hK)|e_n|    (9)

where K is the largest Lipschitz constant.
Applying this recursively, we obtain

|e_n| ≤ (1 + hK)^n |e_0|    (10)

Using the inequality (1 + t)^m ≤ e^{mt} for any t ≥ −1 and any m ≥ 0, together with nh ≤ b − t_0, we obtain

(1 + hK)^n ≤ e^{nhK} ≤ e^{(b − t_0)K}    (11)

which implies the main result, Eqn. (5).

Roughly speaking, Theorem 3.1 means that a small adversarial perturbation of the initial value of the problem leads to a small change in the solution, provided that the function f(t, y; θ) is continuous and the step size h is sufficiently small. Obviously, ResNet satisfies neither assumption: it does not satisfy the continuity assumption, and its step size h is always equal to 1. In particular, the functions f(t, y; θ) in ResNet are composite functions containing ReLU activations. Although the ReLU activation function g(x) = max(0, x) breaks continuity, it is contractive, i.e., |g(x) − g(x + ε)| ≤ |ε| for all x and ε; it follows that

|g_n(x; θ_n) − g_n(x + ε; θ_n)| = |max(0, θ_n x + b_n) − max(0, θ_n(x + ε) + b_n)| ≤ |θ_n ε| ≤ ||θ_n|| |ε|    (12)

This provides a weaker upper bound for ReLU that mitigates the loss of continuity. The step size h in ResNet can be adjusted by multiplying the outputs of the convolution layers by a chosen constant, but unfortunately it cannot be made too small, since a very small step size decreases efficiency in actual computations. We show experimentally that when the step size h is small (such as 10^{-1}, 10^{-2}, and 10^{-3}), ResNet has no obvious robustness against adversarial examples, and when the step size is very small (such as 10^{-8}, 10^{-9}, and 10^{-10}), ResNet is difficult to train, so not only the classification accuracy on adversarial examples but also the accuracy on clean examples is quite low. To summarize, the main reason for ResNet's failure on adversarial examples is that the step size h = 1 destroys the upper bound given by Eqn. (5), and we cannot find a proper way to deal with this.
Thus, when the input of ResNet is perturbed by adversarial noise, the amount of change in the result can become unpredictably large, so that ResNet can no longer correctly classify the input.
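The bound of Theorem 3.1 can be checked numerically. The sketch below runs Euler's method on a clean and a perturbed initial value and verifies that the final discrepancy stays below e^{(b − t_0)K}|ε|; the dynamics function f is an illustrative Lipschitz example, not one of the paper's networks.

```python
import math

def euler(f, y0, t0, b, h):
    """Integrate y' = f(t, y) from t0 to b with Euler's method (Eqn. 2)."""
    y, t = y0, t0
    while t < b - 1e-12:
        y = y + h * f(t, y)
        t += h
    return y

# Illustrative dynamics with Lipschitz constant K = 2: |d/dy 2 sin(y)| <= 2.
K = 2.0
f = lambda t, y: K * math.sin(y)

t0, b, h, eps = 0.0, 1.0, 1e-3, 1e-4
y_clean = euler(f, 0.5, t0, b, h)        # solution from y_0
y_pert = euler(f, 0.5 + eps, t0, b, h)   # solution from z_0 = y_0 + eps

# Theorem 3.1 / Eqn. (11): |z_N - y_N| <= e^{(b - t0) K} * |eps|
bound = math.exp((b - t0) * K) * eps
assert 0 < abs(y_pert - y_clean) <= bound
```

With h = 1 instead of h = 10^{-3}, the factor (1 + hK)^n in Eqn. (10) grows far faster, which is exactly the failure mode the text attributes to ResNet.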

3.2. NEURAL ODES

The residual block can be described as y_{n+1} = y_n + h f(y_n; θ_n) with step size h = 1. Taking smaller steps and adding more layers, in the limit, Neural ODEs parameterize the continuous dynamics of hidden units using an ordinary differential equation specified by a neural network:

lim_{h→0} (y_{n+h} − y_n)/h = dy/dt = f(t, y; θ)    (13)

The solution of Eqn. (13) can be computed using modern ODE solvers such as Runge-Kutta methods (Runge, 1895; Kutta, 1901). Euler's method only uses f(t_n, y_n; θ_n), which means that from any point on a curve we can find an approximation of a nearby point by moving a short distance along the tangent line to the curve. Instead of using a single tangent line, modern numerical ODE methods often evaluate several different tangent lines and take a weighted sum of them to generate a combined tangent direction, which we denote by F(t_n, y_n; θ_n, f). We can show that modern ODE solvers based on Runge-Kutta methods are stable under some assumptions: the amount of change between y_n and z_n admits an upper bound similar to Theorem 3.1.

Theorem 3.2. Let f(t, y; θ) be a continuous function for t_0 ≤ t ≤ b and −∞ < y < ∞, and further assume that f(t, y; θ) and F(t, y; θ, f) satisfy the Lipschitz condition. Then, comparing the two numerical solutions y_n and z_n as h → 0, there is a constant c ≥ 0 such that the amount of change between y_n and z_n satisfies

max_{0 ≤ n ≤ N(h)} |z_n − y_n| ≤ c|ε|    (14)

Proof. This result can be obtained in analogy with Eqn. (5) in Theorem 3.1.

The neural network in Neural ODEs also uses ReLU activation functions, and we have already shown that Eqn. (12) provides a weaker upper bound for ReLU. Besides, modern ODE solvers, such as the Fehlberg method (Fehlberg, 1969) and the DOPRI method (Dormand & Prince, 1980), can adaptively adjust the step size until a desired tolerance is reached. At each step, two different approximations of the solution (for example, a fourth-order and a fifth-order Runge-Kutta step) are made and compared.
If the two answers are in close agreement, the approximation is accepted. If the two answers do not agree to a specified accuracy, the step size is reduced. If the answers agree to more significant digits than required, the step size is increased. Thus, even though h → 0 does not hold exactly, modern ODE solvers still provide a weaker upper bound by controlling the step size. To summarize, although Neural ODEs cannot hold the strong upper bound given by Eqn. (14), they still hold a weaker upper bound that constrains the error caused by adversarial noise. We will show experimentally that Neural ODEs are more robust to adversarial examples than ResNet.
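The accept/shrink/grow logic described above can be sketched with the simplest error estimator, step doubling: compare one Euler step of size h with two steps of size h/2. This is a minimal illustration of adaptive step-size control, not the embedded Runge-Kutta pairs (Fehlberg, DOPRI) that production solvers use.

```python
import math

def adaptive_euler(f, y0, t0, b, tol=1e-6, h=0.1):
    """Euler integration with step-doubling error control: compare one
    step of size h against two steps of size h/2 and adapt h."""
    y, t = y0, t0
    while b - t > 1e-12:
        h = min(h, b - t)
        full = y + h * f(t, y)                     # one full step
        half = y + (h / 2) * f(t, y)               # two half steps
        two_half = half + (h / 2) * f(t + h / 2, half)
        err = abs(two_half - full)
        if err > tol:        # answers disagree: shrink the step and retry
            h /= 2
            continue
        y, t = two_half, t + h
        if err < tol / 4:    # answers agree to spare accuracy: grow the step
            h *= 2
    return y

# Sanity check on y' = y, y(0) = 1, whose exact solution at t = 1 is e.
approx = adaptive_euler(lambda t, y: y, 1.0, 0.0, 1.0)
assert abs(approx - math.e) < 1e-2
```

The tolerance check plays the role of the weaker bound in the text: the accepted per-step error is capped at `tol` regardless of how the dynamics behave.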

3.3. OTHER DEEP NEURAL NETWORKS WITH SKIP CONNECTIONS

Following similar procedures, we can figure out the reason why other deep neural networks with skip connections cannot resist adversarial examples.

3.3.1. POLYNET

PolyNet (Zhang et al., 2017) introduced a family of modules called PolyInception. Each PolyInception module is a polynomial combination of Inception units that can be described as

(I + F + F²) · y = y + F(y) + F(F(y))    (15)

where y denotes the input, I the identity operator, and F the nonlinear transformation carried out by the residual block, which can also be considered as an operator. It has been shown that Eqn. (15) can be interpreted as an approximation to one step of the backward (or implicit) Euler method (Lu et al., 2018):

y_{n+1} = y_n + h f(t_{n+1}, y_{n+1})    (16)

From Eqn. (16) we get

y_{n+1} = (I − hf)^{-1} y_n    (17)

where (I − hf)^{-1} can be formally rewritten as

I + hf + (hf)² + ··· + (hf)^n + ···

So, there are two differences between the polynomial combination and the backward Euler method:

• The polynomial combination is only a second-order approximation to the backward Euler method; a truncation error is introduced when the higher-order terms are dropped.
• PolyNet faces the same problem as ResNet in that the step size is h = 1.

Thus, PolyNet cannot hold the upper bound given by the backward Euler method, and the step size h = 1 is the main reason that PolyNet fails to resist adversarial examples.
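Both differences can be seen on a scalar linear system, an illustrative choice where (I − hf)^{-1} has the closed form 1/(1 − ha): the truncated series matches backward Euler to O((ha)³) for small h, but diverges noticeably at PolyNet's h = 1.

```python
a, y = 0.5, 1.0   # illustrative linear dynamics f(y) = a * y

def backward_euler_step(y, h):
    # y_{n+1} = y_n + h*f(y_{n+1})  =>  y_{n+1} = (1 - h*a)^{-1} * y_n  (Eqn. 17)
    return y / (1 - h * a)

def polyinception_step(y, h):
    # (I + hF + (hF)^2) y: the second-order truncation of the series (h = 1 in PolyNet)
    return y + h * a * y + (h * a) ** 2 * y

# For small h the truncation error behaves like O((h*a)^3) ...
for h in (0.1, 0.01):
    err = abs(backward_euler_step(y, h) - polyinception_step(y, h))
    assert err < 2 * (h * a) ** 3

# ... but at PolyNet's step size h = 1 the two updates differ substantially.
assert abs(backward_euler_step(y, 1.0) - polyinception_step(y, 1.0)) > 0.2
```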

3.3.2. FRACTALNET

FractalNet (Larsson et al., 2016) introduced a design strategy for neural network macro-architecture based on self-similarity. The expansion rule that generates a fractal architecture with C intertwined columns can be described as

f_{C+1}(y) = ½ (f_C ∘ f_C)(y) + ½ f_1(y)

where f_1(y) = conv(y) and ∘ denotes composition. This expansion rule resembles the Runge-Kutta method of order 2, also known as Heun's method (Lu et al., 2018):

y_{n+1} = y_n + ½ h (f(t_n, y_n) + f(t_{n+1}, y_n + h f(t_n, y_n)))

So, there are two differences between the expansion rule and Heun's method:

• The expansion rule loses two identity mappings, namely the y_n terms, compared to Heun's method.
• FractalNet faces the same problem as ResNet in that the step size is h = 1.

Thus, FractalNet cannot hold the upper bound given by Heun's method, and the step size h = 1 is the main reason that FractalNet fails to resist adversarial examples.

3.3.3. REVNET

RevNet (Gomez et al., 2017) introduced a variant of ResNet in which each layer's activations can be reconstructed exactly from the next layer's. Each reversible block takes inputs and produces outputs according to the following additive coupling rules:

X_{n+1} = X_n + f_1(Y_n)
Y_{n+1} = Y_n + f_2(X_{n+1})

These additive coupling rules can be interpreted as Euler's method applied to the following system of differential equations (Lu et al., 2018):

Ẋ = f_1(Y, t)
Ẏ = f_2(X, t)

RevNet faces the same problem as ResNet in that the step size is h = 1. Thus, RevNet cannot hold the upper bound given by Euler's method for this dynamical system.
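The additive coupling rules are exactly invertible regardless of what f_1 and f_2 are, which is what makes activation reconstruction possible. A minimal scalar sketch (f_1 and f_2 are illustrative residual functions):

```python
def rev_forward(x, y, f1, f2):
    """Additive coupling: X_{n+1} = X_n + f1(Y_n), Y_{n+1} = Y_n + f2(X_{n+1})."""
    x_next = x + f1(y)
    y_next = y + f2(x_next)
    return x_next, y_next

def rev_inverse(x_next, y_next, f1, f2):
    """Reconstruct the inputs exactly by undoing the two additions in reverse order."""
    y = y_next - f2(x_next)
    x = x_next - f1(y)
    return x, y

# Illustrative (even non-invertible) residual functions still give an invertible block.
f1 = lambda v: 0.5 * v ** 2
f2 = lambda v: v - 1.0

x1, y1 = rev_forward(2.0, 3.0, f1, f2)
assert rev_inverse(x1, y1, f1, f2) == (2.0, 3.0)
```

Reversibility is orthogonal to stability: the block above still takes an effective Euler step of size h = 1, which is the point made in the text.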

4.1. RESNET WITH STEP SIZE h = 1

We first evaluate the performance of ResNets with different step sizes on adversarial examples. Adopting the same hyperparameter settings for ResNet56 as presented in He et al. (2016), we train 11 ResNet56s on the CIFAR10 dataset with the step size decreasing from 10^0 to 10^{-10}. From the numerical stability analysis of ResNet, we know that the classification result should be well behaved under small adversarial noise, provided that the step size h is sufficiently small. In actual computations, however, when we decrease the step size from 10^{-1} to 10^{-10}, ResNet shows no obvious robustness against adversarial examples. And when the step size is very small, ResNet is difficult to train; the accuracy of ResNet56 is then considerably low on both clean and adversarial inputs. We therefore conclude that ResNet cannot attain natural robustness against adversarial examples in this way.

4.2. THE NATURAL ROBUSTNESS OF NEURAL ODES

As mentioned in Section 3.2, Neural ODEs define a continuous dynamical system, which is equivalent to ResNet with step size h → 0, and they hold a weaker upper bound that constrains the error caused by adversarial noise. In this section, we evaluate the performance of Neural ODEs under three white-box attacks, FGSM (Goodfellow et al., 2014), PGD (Madry et al., 2017), and DI²-FGSM (Xie et al., 2019), and one black-box adversarial attack, Boundary Attack (Brendel et al., 2017). Without data augmentation, the accuracy of ResNet on the clean MNIST and CIFAR10 test sets is 99.33% and 89.60%, while the accuracy of Neural ODEs is 99.05% and 69.94%. Experiment details can be found in Appendix A.

4.2.1. THE EVALUATION ON WHITE-BOX ADVERSARIAL ATTACKS

For both MNIST and CIFAR10, the size ε of the PGD perturbation is measured in the ℓ∞ norm, the PGD step size is set to 0.01, the number of PGD steps is set to 40, and a uniform random perturbation is added before performing PGD. As shown in Figure 2, when facing adversarial noise generated by DI²-FGSM on MNIST, Neural ODEs remain strongly resistant to the perturbations while the accuracy of ResNet drops sharply with larger ε. For CIFAR10, the reduction in accuracy of Neural ODEs is significantly smaller than that of ResNet50 as ε increases. Overall, Neural ODEs are more stable than ResNet when facing white-box adversarial noise.

4.2.2. THE EVALUATION ON BLACK-BOX ADVERSARIAL ATTACKS

Boundary Attack is a decision-based attack that starts from a large adversarial perturbation and then seeks to reduce the perturbation while staying adversarial (Brendel et al., 2017). For MNIST, Boundary Attack has almost no effect on Neural ODEs. For CIFAR10, however, Neural ODEs also fail against this gradient-free attack, although their performance is still better than that of ResNet50 when ε is larger than 33.
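The decision-based loop can be sketched with a toy classifier: propose a small random step, contract toward the clean input, and keep the proposal only if the model's decision stays adversarial. This is a simplified caricature of Boundary Attack (the real attack uses orthogonal steps on a hypersphere); the `is_adversarial` oracle is a hypothetical stand-in for model queries.

```python
import numpy as np

rng = np.random.default_rng(0)

def is_adversarial(x):
    """Toy decision oracle: the (hypothetical) model flips its label
    whenever the mean pixel value exceeds 0.5."""
    return x.mean() > 0.5

x_clean = np.full(16, 0.2)   # clean input, classified normally
x_start = np.full(16, 0.9)   # large initial adversarial perturbation
x_adv = x_start.copy()

# Decision-based loop: random proposal, contraction toward the clean input,
# accepted only while the proposal remains adversarial.
for _ in range(2000):
    proposal = x_adv + 0.01 * rng.normal(size=16)        # random exploration step
    proposal = proposal + 0.01 * (x_clean - proposal)    # step toward x_clean
    proposal = np.clip(proposal, 0.0, 1.0)
    if is_adversarial(proposal):
        x_adv = proposal

# The perturbation shrank while the input stayed adversarial.
assert is_adversarial(x_adv)
assert np.linalg.norm(x_adv - x_clean) < np.linalg.norm(x_start - x_clean)
```

The final x_adv hovers just above the decision boundary, which is why the attack is reported in terms of the residual perturbation size ε.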

5. CONCLUSION AND DISCUSSION

In this paper, we highlighted and analyzed the natural robustness of Neural ODEs. We proved two similar upper bounds for ResNet and Neural ODEs under the assumptions of continuity and step size h → 0. We showed that it is the step size h = 1 that causes ResNet and the other three deep neural networks with skip connections to fail on adversarial examples. Neural ODEs define a continuous dynamical system with step size h → 0. In actual computations, modern ODE solvers adaptively adjust the step size to solve this continuous dynamical system, and this yields a weaker upper bound for Neural ODEs. We showed experimentally that Neural ODEs are more robust to adversarial examples than ResNet. According to Theorems 3.1 and 3.2, the upper bounds c|ε| are related to the Lipschitz constant and independent of the step size h. Cisse et al. (2017) showed that the robustness of DNNs can be improved by constraining the Lipschitz constant. In this paper, however, we are more concerned with the step size h, because the upper bound cannot be guaranteed if h → 0 is not satisfied. We demonstrated experimentally that the difference in step size h between ResNet and Neural ODEs is sufficient to make a significant difference in robustness, even without constraining the Lipschitz constant. Nevertheless, constraining the Lipschitz constant is expected to make neural networks even more robust, which we would like to consider as future work. Given the similarity between DNNs and numerical methods for solving ODEs, we believe that numerical stability analysis can guide the design of more robust deep neural networks.

A. TRAINING CONFIGURATION OF NEURAL ODES



Figure 1: The classification accuracy of 11 ResNet56s on clean CIFAR10 images and on three kinds of adversarial CIFAR10 images (on the log scale). The adversarial CIFAR10 images are generated by the Fast Gradient Sign Method with ε = 0.1, 0.3, 0.5 on another neural network which has three residual blocks.

Figure 2: The evaluation of ResNet and Neural ODEs under DI²-FGSM. ε = 0 represents the accuracy of ResNet and Neural ODEs without the adversarial attack applied. For MNIST, the maximum perturbation is set to 0.5 in all experiments, with pixel values in [0, 1]. For CIFAR10, the maximum perturbation is set to 50 in all experiments, with pixel values in [0, 255].

Comparing with ResNet and three types of adversarial training methods, we show the natural robustness of Neural ODEs.

The results show that Neural ODEs have natural robustness under FGSM and PGD attacks. The classification accuracy on adversarial MNIST images generated by FGSM drops by less than 2% for all ε. Even PGD, the strongest first-order attack method, can hardly beat Neural ODEs on MNIST. Although the performance of Neural ODEs on clean CIFAR10 images is worse than that of state-of-the-art networks like ResNet, they still work well on adversarial CIFAR10 images generated by PGD.

The classification accuracy of Neural ODEs and ResNet50 under FGSM and PGD attacks.

The evaluation of ResNet and Neural ODEs under Boundary Attack. For Boundary Attack, the number of queries is fixed at 1,000. The initial step size and orthogonal step size are set to 0.1.

4.2.3. COMPARING WITH ADVERSARIAL TRAINING METHODS

As one can see, Neural ODEs remain robust and outperform PGD, TRADES, and YOPO without any adversarial training. The comparison of robustness between Neural ODEs and PreAct-Res18 trained by three types of adversarial training methods on CIFAR10.

Our experiment settings for training Neural ODEs are summarized below. We only change some hyperparameters of Neural ODEs, without the help of any adversarial defense methods.

The training configurations of Neural ODEs on MNIST and CIFAR10. *Every 50 epochs, the learning rate is halved. It is worth mentioning that there are no common training configurations for Neural ODEs on CIFAR10; Dupont et al. (2019) reported that the accuracy of Neural ODEs on the CIFAR10 test set is 53.7% ± 0.2%, and our experimental result is better than that.

