EXPLORING AND EXPLOITING DECISION BOUNDARY DYNAMICS FOR ADVERSARIAL ROBUSTNESS

Abstract

The robustness of a deep classifier can be characterized by its margins: the distances from the decision boundary to natural data points. However, it is unclear whether existing robust training methods effectively increase the margin for each vulnerable point during training. To understand this, we propose a continuous-time framework for quantifying the relative speed of the decision boundary with respect to each individual point. Through visualizing the moving speed of the decision boundary under Adversarial Training, one of the most effective robust training algorithms, surprising moving behavior is revealed: the decision boundary moves away from some vulnerable points but simultaneously moves closer to others, decreasing their margins. To alleviate these conflicting dynamics of the decision boundary, we propose Dynamics-Aware Robust Training (DyART), which encourages the decision boundary to engage in movement that prioritizes increasing smaller margins. In contrast to prior works, DyART directly operates on the margins rather than their indirect approximations, allowing for more targeted and effective robustness improvement. Experiments on the CIFAR-10 and Tiny-ImageNet datasets verify that DyART alleviates the conflicting dynamics of the decision boundary and obtains improved robustness under various perturbation sizes compared to state-of-the-art defenses. Our code is available at https://github.com/Yuancheng-Xu/Dynamics-Aware-Robust-Training.

To answer the above question, we propose a continuous-time framework that quantifies the instantaneous movement of the decision boundary, as shown in Figure 1c. Specifically, we define the relative speed of the decision boundary w.r.t. a point to be the time derivative of its margin, which can be interpreted as the speed at which its closest adversarial example moves away from it. We show that the speed can be derived from the training algorithm using a closed-form expression.
Using the proposed framework, we empirically compute the speed of the decision boundary w.r.t. data points for AT. As will be shown in Figure 3, the aforementioned conflicting dynamics of the decision boundary (Figures 1b and 1c) are revealed: the decision boundary moves towards many vulnerable points during training and decreases their margins, directly counteracting the objective of robust training. The desirable dynamics of the decision boundary, on the other hand, would increase the margins of all vulnerable points. This leads to another question:

Question 2. How can we design algorithms that encourage the decision boundary to engage in movements that increase margins for vulnerable points, rather than decrease them?

To this end, we propose Dynamics-Aware Robust Training (DyART), which prioritizes moving the decision boundary away from more vulnerable points, thereby increasing their margins. Specifically, DyART directly operates on the margins of training data and carefully designs its cost function on margins for more desirable dynamics. Note that directly optimizing margins in the input space is technically challenging, since it was previously unclear how to compute the gradient of the margin. In this work, we derive a closed-form expression for the gradient of the margin and present an efficient algorithm to compute it, making gradient descent viable for DyART. In addition, since DyART directly operates on margins instead of using a pre-defined uniform perturbation bound for training as in AT, DyART is naturally robust for a wide range of perturbation sizes ϵ. Experimentally, we demonstrate that DyART mitigates the conflicting dynamics of the decision boundary and achieves improved robustness under diverse attack budgets.

Our contributions are as follows: (1) We propose a continuous-time framework to study the relative speed of the decision boundary w.r.t. each individual data point and provide a closed-form expression for the speed.
(2) We visualize the speed of the decision boundary for AT and identify the conflicting dynamics of the decision boundary. (3) We present a closed-form expression for the gradient of the margin, allowing for direct manipulation of the margin. (4) We introduce an efficient alternative for computing the margin gradient by replacing the margin with our proposed soft margin, a lower bound of the margin whose approximation gap is controllable. (5) We propose Dynamics-Aware Robust Training (DyART), which alleviates the conflicting dynamics by carefully designing a cost function on soft margins that prioritizes increasing smaller margins. Experiments show that DyART obtains improved robustness over state-of-the-art defenses under various perturbation sizes.

Decision boundary analysis. Prior works on the decision boundary of deep classifiers have studied the small margins in adversarial directions (Karimi et al., 2019) and the topology of classification regions

Continuous-time formulation. To study the instantaneous movement of the decision boundary in Section 4, we use a continuous-time formulation of the optimization over the parameters θ, denoted θ(t). Let θ′(t) be the continuous-time description of the update rule of the model parameters. When using gradient descent on a loss function L, we have θ′(t) = −∇θL(θ(t)).

In this section, we study the dynamics of the decision boundary in continuous time. We first define its speed w.r.t. each data point, and then provide a closed-form expression for it. Finally, we visualize the speed of the decision boundary under Adversarial Training and analyze its dynamics.
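As a sanity check on this continuous-time formulation, the following toy sketch (our illustration, not code from the paper) treats gradient descent as the forward-Euler discretization of θ′(t) = −∇θL(θ(t)) and compares it against the exact gradient-flow solution of a one-dimensional quadratic loss:

```python
import numpy as np

# Continuous-time view of training: theta'(t) = -grad L(theta(t)).
# For the toy loss L(theta) = 0.5 * theta^2 the gradient flow has the
# closed-form solution theta(t) = theta(0) * exp(-t); gradient descent
# with step size eta is the forward-Euler discretization of this ODE.

def gradient_descent(theta0, eta, steps):
    theta = theta0
    for _ in range(steps):
        theta = theta - eta * theta   # grad L(theta) = theta
    return theta

theta0, t_final = 2.0, 1.0
exact = theta0 * np.exp(-t_final)

# Finer discretizations track the continuous-time trajectory more closely.
coarse = gradient_descent(theta0, eta=0.1, steps=10)     # 10 steps to t = 1
fine = gradient_descent(theta0, eta=0.001, steps=1000)   # 1000 steps to t = 1

assert abs(fine - exact) < abs(coarse - exact)
assert abs(fine - exact) < 1e-3
```

The same limiting argument is what licenses analyzing discrete optimizers through the ODE θ′(t) = −∇θL(θ(t)).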

1. INTRODUCTION

Deep neural networks have exhibited impressive performance in a wide range of applications (Krizhevsky et al., 2012; Goodfellow et al., 2014; He et al., 2016a) . However, they have also been shown to be susceptible to adversarial examples, leading to issues in security-critical applications such as autonomous driving and medicine (Szegedy et al., 2013; Nguyen et al., 2015) . To alleviate this problem, adversarial training (AT) (Madry et al., 2017; Shafahi et al., 2019; Zhang et al., 2019; Gowal et al., 2020) was proposed and is one of the most prevalent methods against adversarial attacks. Specifically, AT aims to find the worst-case adversarial examples based on some surrogate loss and adds them to the training dataset in order to improve robustness. Despite the success of AT, it has been shown that over-parameterized neural networks still have insufficient model capacity for fitting adversarial training data, partly because AT does not consider the vulnerability difference among data points (Zhang et al., 2021) . The vulnerability of a data point can be measured by its margin: its distance to the decision boundary. As depicted in Figure 1a , some data points have smaller margins and are thus more vulnerable to attacks. Since AT does not directly operate on the margins and it uses a pre-defined perturbation bound for all data points regardless of their vulnerability difference, it is unclear whether the learning algorithm can effectively increase the margin for each vulnerable point. Geometrically, we would like to know if the decision boundary moves away from the data points, especially the vulnerable ones. As illustrated in Figure 1b , there can exist conflicting dynamics of the decision boundary: it moves away from some vulnerable points but simultaneously moves closer to other vulnerable ones during training. 
This motivates us to ask:

Question 1. Given a training algorithm, how can we analyze the dynamics of the decision boundary with respect to the data points?

(a) The decision boundary, vulnerable (solid) and robust (hollow) points.


(b) An update with conflicting impacts on robustness.

(c) Continuous movement of the decision boundary.

Figure 1: The movement of the decision boundary. Red triangles and green circles are data points from two classes. Figure 1a shows the vulnerability difference among the data points: some are closer to the decision boundary, whereas others are farther from it. In Figure 1b, the decision boundary after an update moves away from some vulnerable points (made more robust) but simultaneously moves closer to other vulnerable ones (made less robust). Figure 1c depicts the continuous movement of the decision boundary in Figure 1b.

(Fawzi et al., 2018), the relationship between dataset features and margins (Ortiz-Jimenez et al., 2020), and improved robust training by decreasing the unwarranted increase in the margin along adversarial directions (Rade & Moosavi-Dezfooli, 2022). While these works study the static decision boundary of trained models, our work focuses on the dynamics of the decision boundary during training.

Weighted adversarial training. Adversarial training and its variants (Madry et al., 2017; Zhang et al., 2019; Wang et al., 2019; Zhang et al., 2020b) have been proposed to alleviate the adversarial vulnerability of deep learning models. To better utilize the model capacity, weighted adversarial training methods (Zeng et al., 2020; Liu et al., 2021; Zhang et al., 2021) have been proposed, aiming to assign larger weights to more vulnerable points closer to the decision boundary. However, these methods rely on indirect approximations of margins that are not optimal. For example, GAIRAT (Zhang et al., 2021) uses the least number of iterations needed to flip the label of a clean example as a surrogate for its margin, which is likely to make wrong judgements about robustness (Liu et al., 2021). As another approximation, the logit margin (Liu et al., 2021; Zeng et al., 2020) is used, but larger logit margin values do not necessarily correspond to larger margins.
In contrast, our proposed DyART directly uses margins to characterize the vulnerability of data points.

Margin maximization. Increasing the distance between the decision boundary and data points has been discussed in prior works. Elsayed et al. (2018) propose to maximize a first-order Taylor approximation of the margin at the clean data point, which is inaccurate and computationally prohibitive since it requires computing the Hessian of the classifier. Atzmon et al. (2019) propose to maximize the distance between each data point and some point on the decision boundary, which is not the closest one and thus does not increase the margin directly. MMA (Ding et al., 2020) uses the uniform average of the cross-entropy loss on the closest adversarial examples as its objective, indirectly increasing the average margin. All of these methods maximize the average margin indirectly and do not consider the vulnerability differences among points. In contrast, our proposed DyART utilizes our derived closed-form expression for the margin gradient to operate directly on margins and, moreover, prioritizes increasing smaller margins.

4.1. SPEED OF THE DECISION BOUNDARY

Consider a correctly classified clean example (xi, yi). Our goal is to capture the movement of the decision boundary Γyi(t) = {x : ϕyi(x, θ(t)) = 0} w.r.t. xi as t varies continuously. To this end, we consider the curve of the closest boundary point x̄i(t) on Γyi(t) to xi:

Definition 1 (Curve of the closest boundary point x̄i(•)). Suppose that (xi, yi) is correctly classified by fθ(t) in some time interval I. Define the curve of the closest boundary point x̄i(•) : I → X as

x̄i(t) = argmin_x ∥x − xi∥p s.t. ϕyi(x, θ(t)) = 0. (3)

Define the margin of xi at time t as R(xi, t) = ∥x̄i(t) − xi∥p.

An example of the curve of the closest boundary point is depicted in Figure 2. To understand how the distance between the decision boundary Γyi(t) and xi changes, it suffices to focus on the curve of the closest boundary point x̄i(t). We define the speed of the decision boundary to be the time derivative of the margin:

Definition 2 (Speed of the decision boundary s(xi, t)). Under the setting of Definition 1, define the speed of the decision boundary w.r.t. xi as s(xi, t) = (d/dt) R(xi, t) = (d/dt) ∥x̄i(t) − xi∥p.

Note that s(xi, t) > 0 means that the robustness of xi is improving at time t, which is desirable during robust training. The following proposition gives a closed-form expression for the speed, given a training algorithm θ′(t).

Proposition 3 (Closed-form expression for the speed s(xi, t)). Let x̄i(t) be the curve of the closest boundary point w.r.t. xi. For 1 ≤ p ≤ ∞, the speed of the decision boundary w.r.t. xi under the ℓp norm is

s(xi, t) = (1 / ∥∇x ϕyi(x̄i(t), θ(t))∥q) ∇θ ϕyi(x̄i(t), θ(t)) • θ′(t), (4)

where q satisfies 1/q + 1/p = 1. In particular, q = 1 when p = ∞.

Remark.
According to equation 4, the speed s(xi, t0) is positive at time t0 when ∇θϕyi(x̄i(t0), θ(t0)) • θ′(t0) > 0, i.e., when ϕyi(x̄i(t), θ(t)) increases at time t0, meaning that the boundary point x̄i(t0) will be correctly classified after the update. Also, the magnitude of the speed tends to be larger when ∥∇xϕyi(x̄i(t), θ(t))∥q is smaller, i.e., when the margin function ϕyi is flatter around x̄i(t). In the remainder of the paper, we denote s(xi, t) and R(xi, t) by s(xi) and R(xi) when the dependence on t is clear from context.

Computing the closest boundary point. We use the Fast Adaptive Boundary attack (FAB) (Croce & Hein, 2020a) to compute x̄i(t) in equation 4. Specifically, FAB iteratively projects onto the linearly approximated decision boundary with a bias towards the original data point, so that the resulting boundary point is close to the original point xi. Note that FAB serves only as an algorithm for finding x̄i(t) and can be decoupled from the rest of the framework. In our experiments we find that, given enough iterations, FAB reliably finds locally closest boundary points, for which the speed expression in equation 4 is still valid. We give more details on checking the local optimality condition of equation 3 and on the performance of FAB in Appendix C.1. Note that, as discussed in Section 5.2, directly using FAB is computationally prohibitive for robust training, and we will propose a more efficient solution. In the next section, we still use FAB to find the closest boundary points for a more accurate visualization of the dynamics during adversarial training.
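Proposition 3 can be checked numerically on a toy model. The following sketch (our illustration, not code from the paper) uses a linear binary classifier ϕ(x, θ) = w·x + b under the ℓ2 norm (p = q = 2), for which the closest boundary point is the orthogonal projection of x onto the decision boundary and the margin has the closed form (w·x + b)/∥w∥2; the closed-form speed is compared against a finite-difference derivative of the margin along an arbitrary parameter update direction:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
w = rng.normal(size=d)
x = rng.normal(size=d)
b = 0.5 - w @ x          # chosen so phi(x) = w.x + b = 0.5 > 0: x correctly classified

def margin(w, b, x):
    # l2 margin of a correctly classified point under the linear classifier phi(x) = w.x + b
    return (w @ x + b) / np.linalg.norm(w)

# an arbitrary parameter update direction theta'(t) = (dw, db)
dw, db = rng.normal(size=d), 0.1

# closest boundary point: orthogonal projection of x onto {phi = 0}
phi = w @ x + b
x_bar = x - phi * w / (w @ w)

# closed-form speed (Proposition 3 with p = q = 2):
# s = (grad_theta phi(x_bar) . theta') / ||grad_x phi(x_bar)||_2,
# where grad_w phi = x_bar, grad_b phi = 1, and grad_x phi = w
speed = (x_bar @ dw + db) / np.linalg.norm(w)

# finite-difference derivative of the margin along the same update direction
eps = 1e-6
fd_speed = (margin(w + eps * dw, b + eps * db, x) - margin(w, b, x)) / eps

assert abs(speed - fd_speed) < 1e-4
```

Note that the formula must be evaluated at the closest boundary point x̄, not at x itself; substituting x for x̄ in the sketch breaks the agreement with the finite difference.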

4.2. DYNAMICS OF ADVERSARIAL TRAINING

In this section, we numerically investigate the dynamics of the decision boundary during adversarial training. We visualize the speed and identify the conflicting dynamics of the decision boundary.

Experiment setting. To study the dynamics of AT at different stages of training, where models have different levels of robustness, we train a ResNet-18 (He et al., 2016a) with group normalization (GN) (Wu & He, 2018) on CIFAR-10 using 10-step PGD under ℓ∞ perturbations with ϵ = 8/255, starting from two pretrained models: (1) a partially trained model obtained with natural training, with 85% clean accuracy and 0% robust accuracy; (2) a partially trained model obtained with AT, with 75% clean accuracy and 42% robust accuracy under a 20-step PGD attack. Note that we replace the batch normalization (BN) layers with GN layers since, when BN is used, the decision boundaries during training and evaluation differ, which can cause confusion when studying the dynamics of the decision boundary. On both pretrained models, we run one iteration of AT on a batch of training data. For the correctly classified points in the batch, we compute the margins as well as the speed of the decision boundary.

Conflicting dynamics. The dynamics of the decision boundary on both pretrained models under AT are shown in Figure 3. The speed values are normalized so that the maximum absolute value is 1, for better visualization of their relative magnitudes. We observe that on both pretrained models, the decision boundary has negative speed w.r.t. a significant proportion of non-robust points with R(xi) < 8/255. That is, the margins of many vulnerable points decrease during adversarial training even though the current update of the model is computed on these points, which counteracts the objective of robust training. In the next section, we design a dynamics-aware robust training method to mitigate this conflicting dynamics issue.

5. DYART: DYNAMICS-AWARE ROBUST TRAINING

In this section, we propose Dynamics-Aware Robust Training (DyART) to mitigate the conflicting dynamics issue. In Section 5.1, we show how to design an objective function to prioritize improving smaller margins and how to compute the gradient of such objective. In Section 5.2, we overcome the expensive cost of finding the closest boundary points and present the full DyART algorithm.

5.1. OBJECTIVE FOR DESIRABLE DYNAMICS

We aim to design a loss function LR(θ) that directly increases the overall margins for effective robustness improvement. We propose the robustness loss LR(θ) := Ex[h(Rθ(x))], where h : R → R is a carefully selected cost function that assigns a cost value h(R) to a margin R. When designing h(•), it is crucial that minimizing LR(θ) = Ex[h(Rθ(x))] encourages the desirable dynamics of the decision boundary: positive speed w.r.t. vulnerable points with small margins.

Dynamics-aware loss function. To design such a loss function, the following two properties of the cost function h are desired. (1) Decreasing (i.e., h′(•) < 0): a point with a smaller margin should be assigned a higher cost value since it is more vulnerable. (2) Convex (i.e., h′′(•) > 0): the convexity condition helps prioritize improving smaller margins. To see this, consider minimizing the loss function LR(θ) on m points {(xi, yi)} with margins {Ri}, where the objective becomes (1/m) Σ_{i=1}^m h(Rθ(xi)). At each iteration, the optimizer updates the model to decrease the objective value; in continuous time, this means (d/dt) Σ_{i=1}^m h(R(xi, t)) < 0. Using the chain rule and the definition of the speed s(xi, t) = (d/dt) R(xi, t), we obtain Σ_{i=1}^m h′(Ri) s(xi, t) < 0. Given that h′(•) < 0, the ideal case is that s(xi, t) > 0 for all xi, so that every term h′(Ri) s(xi, t) is negative and the margins of all data points increase. However, due to the conflicting dynamics described in Section 4.2, some points may have negative speed s(xi, t) < 0 while Σ_{i=1}^m h′(Ri) s(xi, t) stays negative. In the presence of such conflicting dynamics, if |h′(Ri)| is large (i.e., h′(Ri) is very negative), it is more likely that s(xi, t) > 0, since otherwise it is harder to keep Σ_{i=1}^m h′(Ri) s(xi, t) negative.
When h′′(•) > 0, a smaller margin Ri gives a smaller (more negative) h′(Ri), and thus s(xi, t) tends to be positive. Therefore, requiring h′′(•) > 0 incentivizes the decision boundary to have positive speed w.r.t. points with smaller margins. How to design the optimal h(•) is still an open problem. In this paper, we propose to use

h(R) = (1/α) exp(−αR) if R < r0, and h(R) = 0 otherwise, (5)

where the hyperparameters satisfy α > 0 and r0 > 0. Larger values of α prioritize improving smaller margins. The threshold r0 is used to avoid training on points that are already far away from the clean data points.

Difficulties of computing the margin gradient. Directly minimizing Ex[h(Rθ(x))] with gradient-based optimization methods requires computing the gradient ∇θh(Rθ(xi)) w.r.t. the model parameters. However, it was previously unclear how to compute ∇θh(Rθ(xi)), which partly explains why previous works did not directly operate on margins. The difficulty lies in the fact that Rθ(xi), as defined in equation 2, involves a constrained optimization problem, so its gradient ∇θRθ(xi) cannot be computed straightforwardly. An additional challenge is the non-smoothness of the ℓ∞ norm, which is widely used in the robust training literature.

Our solution. We overcome the above challenges and provide the following closed-form expression for the gradient of any smooth function of the margin. The proof is provided in Appendix B.

Theorem 4 (The gradient ∇θh(Rθ(xi)) of any smooth function of the margin). For 1 ≤ p ≤ ∞,

∇θh(Rθ(xi)) = (h′(Rθ(xi)) / ∥∇xϕyi(x̄i, θ)∥q) ∇θϕyi(x̄i, θ), (6)

where q satisfies 1/q + 1/p = 1. In particular, q = 1 when p = ∞.

Note that another expression for the margin gradient (i.e., equation 6 with h the identity function) was derived in MMA (Ding et al., 2020), with the following distinctions from ours: (a) the expression in MMA does not apply to the ℓ∞ norm, while ours does;
(b) the coefficient 1/∥∇xϕyi(x̄i, θ)∥q in our expression is more informative and simpler to compute; (c) MMA treats this coefficient as a constant during training, and therefore does not properly follow the margin gradient.

Computing ∇θh(Rθ(xi)) requires finding the closest boundary point x̄i, which can be computationally prohibitive for robust training. In the next section, we propose to use the closest point x̄i^soft on the soft decision boundary instead, whose approximation quality relative to the exact decision boundary is controllable and whose computational cost is tractable. We will then present the full DyART algorithm.

5.2. EFFICIENT ROBUST TRAINING

Directly finding the closest boundary points is expensive. Since the closest boundary point x̄i can lie on the decision boundary between the true class and any other class, FAB needs to form a linear approximation of the decision boundary between the true class and every other class at each iteration. This requires computing the Jacobian of the classifier, and the computational cost scales linearly with the number of classes K (Croce & Hein, 2020b). Therefore, finding the closest points on the exact decision boundary is computationally prohibitive for robust training in multi-class classification settings, especially when K is large. To remedy this, we propose to instead use the closest points on the soft decision boundary, as elaborated below.

Soft decision boundary. We replace the maximum operator in the logit margin (equation 1) with a smoothed maximum controlled by a temperature β > 0. Specifically, we define the soft logit margin of class y as

Φ_θ^y(x; β) = z_θ^y(x) − (1/β) log Σ_{y′≠y} exp(β z_θ^{y′}(x)).

The soft decision boundary is defined as the zero level set of the soft logit margin: Γ_y^soft = {x : Φ_θ^y(x; β) = 0}. For xi with Φ_θ^{yi}(xi; β) > 0, the closest soft boundary point is defined as x̄i^soft = argmin_x ∥x − xi∥p s.t. Φ_θ^{yi}(x; β) = 0, and the soft margin is defined as R_θ^soft(xi) = ∥x̄i^soft − xi∥p. Note that we do not define R_θ^soft(xi) when Φ_θ^{yi}(xi; β) < 0. The relationship between the exact and soft decision boundaries is characterized by the following proposition:

Proposition 5. If x is on the soft decision boundary Γ_y^soft, i.e., Φ_θ^y(x; β) = 0, then log(K−1)/β ≥ ϕ_θ^y(x) ≥ 0. Moreover, when Φ_θ^{yi}(xi; β) > 0, we have R_θ^soft(xi) ≤ R_θ(xi). In other words, the soft decision boundary is always closer to xi than the exact decision boundary, as shown in Figure 4.
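The gap bounded in Proposition 5 follows from a standard inequality between the hard max and the log-sum-exp smoothed max; the following sketch (our illustration, not code from the paper) implements the soft logit margin with a numerically stable log-sum-exp and checks that Φ ≤ ϕ ≤ Φ + log(K−1)/β on random logits:

```python
import numpy as np

# Logit margin: z_y minus the hard max over the other classes' logits.
def logit_margin(z, y):
    others = np.delete(z, y)
    return z[y] - np.max(others)

# Soft logit margin: the hard max is replaced by a log-sum-exp
# smoothed max with temperature beta.
def soft_logit_margin(z, y, beta):
    others = np.delete(z, y)
    m = np.max(beta * others)   # shift for numerical stability
    return z[y] - (m + np.log(np.sum(np.exp(beta * others - m)))) / beta

rng = np.random.default_rng(0)
K, beta = 10, 5.0
for _ in range(100):
    z = rng.normal(size=K)
    y = int(rng.integers(K))
    phi, Phi = logit_margin(z, y), soft_logit_margin(z, y, beta)
    # smoothed max >= hard max, so the soft margin lower-bounds the logit margin
    assert Phi <= phi + 1e-9
    # the gap is at most log(K-1)/beta and vanishes as beta grows
    assert phi - Phi <= np.log(K - 1) / beta + 1e-9
```

On the soft boundary, where Φ = 0, the second inequality gives exactly the bound 0 ≤ ϕ ≤ log(K−1)/β stated in Proposition 5.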
Moreover, the quality of the approximation to the exact decision boundary is controllable: the gap between the two decreases as β increases and vanishes as β → ∞. Therefore, increasing the soft margins increases the exact margins as well. (2) Effective information usage. Another benefit of the smoothed max operator in the soft logit margin is that, unlike the logit margin ϕ_θ^{yi}(xi), the soft logit margin Φ_θ^{yi}(xi; β) contains information about the logit values of all classes. Therefore, the information of all classes is used at each iteration when finding x̄i^soft.

Loss function and its gradient. The overall objective of DyART is to increase the soft margins while also achieving high clean accuracy. Denote by B a training data batch of size n and by B_θ^+ = {i ∈ B : Φ_θ^{yi}(xi; β) > 0} its subset of size m. Our proposed DyART uses the loss function

L_θ(B) = (1/n) Σ_{i∈B} l(xi, yi) + (λ/n) Σ_{i∈B_θ^+} h(R_θ^soft(xi)),

where the first term is the average cross-entropy loss on natural data points and the second term increases the soft margins. The hyperparameter λ balances the trade-off between clean and robust accuracy. By applying equation 6, the gradient of the objective can be computed as

∇_θ L_θ(B) = (1/n) Σ_{i∈B} ∇_θ l(xi, yi) + (λ/n) Σ_{i∈B_θ^+} (h′(R_θ^soft(xi)) / ∥∇_x Φ_θ^{yi}(x̄i^soft; β)∥q) ∇_θ Φ_θ^{yi}(x̄i^soft; β).

Since the soft margin R_θ^soft(xi) is only defined for xi with Φ_θ^{yi}(xi; β) > 0, DyART requires starting from a pretrained model with a relatively high proportion of points with positive Φ_θ^{yi} values. In practice, we find that a burn-in period of several epochs of natural training suffices to obtain such a model.

Novelty compared with prior works. (1) Direct and efficient manipulation of the margin.
(1a) In contrast to prior works that depend on indirect approximations of margins, DyART directly operates on margins by utilizing the closed-form expression for the margin gradient in equation 6, whose computation was previously unclear. (1b) We significantly reduce the computational cost of computing margins and their gradients by introducing the soft margin, a lower bound of the margin whose approximation gap is controllable. (2) Prioritizing the growth of smaller margins, by carefully designing the cost function h(•) to mitigate the conflicting dynamics. In summary, DyART achieves more targeted and effective robustness improvement by directly and efficiently operating on margins and by prioritizing the growth of smaller margins.
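As a minimal end-to-end illustration of the DyART objective and its gradient (a sketch under simplifying assumptions, not the paper's implementation: a binary linear classifier, for which K = 2 makes the soft and exact margins coincide, the ℓ2 norm, λ = 1, and r0 → ∞ so no point is thresholded away), the analytic gradient assembled from equation 6 can be checked against a finite difference of the full loss:

```python
import numpy as np

# Toy setup: phi(x) = s * (w.x + b) with label sign s in {+1, -1}.
# The l2 closest boundary point is the orthogonal projection of x onto
# the hyperplane {w.x + b = 0}, and the margin is R = phi(x) / ||w||.

alpha, lam = 3.0, 1.0

def h(R):               # cost on margins (equation 5 with r0 -> infinity)
    return np.exp(-alpha * R) / alpha

def loss(w, b, X, s):
    total = 0.0
    for x, si in zip(X, s):
        z = si * (x @ w + b)                      # signed logit margin phi
        total += np.log1p(np.exp(-z))             # cross-entropy term
        total += lam * h(z / np.linalg.norm(w))   # margin term, R = phi / ||w||
    return total / len(X)

def dyart_grad(w, b, X, s):
    gw, gb = np.zeros_like(w), 0.0
    for x, si in zip(X, s):
        z = si * (x @ w + b)
        sig = 1.0 / (1.0 + np.exp(z))             # sigmoid(-z)
        gw += -si * sig * x                       # cross-entropy gradient
        gb += -si * sig
        # margin term via equation 6: h'(R) / ||grad_x phi||_2 * grad_theta phi(x_bar)
        R = z / np.linalg.norm(w)
        x_bar = x - si * z * w / (w @ w)          # closest boundary point
        coef = lam * (-np.exp(-alpha * R)) / np.linalg.norm(w)
        gw += coef * si * x_bar
        gb += coef * si
    return gw / len(X), gb / len(X)

w, b = np.array([1.0, 0.2]), 0.1
X = np.array([[0.5, 0.1], [-2.0, 0.3]])
s = np.array([1.0, -1.0])    # both points correctly classified: phi > 0

gw, gb = dyart_grad(w, b, X, s)
dw, db = np.array([0.3, -0.2]), 0.1
eps = 1e-6
fd = (loss(w + eps * dw, b + eps * db, X, s) - loss(w, b, X, s)) / eps
assert abs(gw @ dw + gb * db - fd) < 1e-4
```

The key point of the sketch is that the margin term's gradient is evaluated at the closest boundary point x̄, weighted by h′(R)/∥∇xϕ∥, exactly as in the DyART update.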

6. EXPERIMENTS

In this section, we empirically evaluate the effectiveness and performance of the proposed DyART on the CIFAR-10 (Krizhevsky et al., 2009) and Tiny-ImageNet (Deng et al., 2009) datasets. In Section 6.1, we evaluate the adversarial robustness of DyART and compare it with several state-of-the-art baselines. In Section 6.2, we visualize the dynamics of the decision boundary under DyART and analyze how it alleviates the conflicting dynamics.

6.1. ROBUSTNESS EVALUATION

Architectures and training parameters. In the experiments on the CIFAR-10 dataset, we use the Wide Residual Network (Zagoruyko & Komodakis, 2016) with depth 28 and width factor 10 (WRN-28-10). On the Tiny-ImageNet dataset, we use pre-activation ResNet-18 (He et al., 2016b). Models are trained using stochastic gradient descent with momentum 0.9, weight decay 0.0005, and batch size 256, for 200 epochs on CIFAR-10 and 100 epochs on Tiny-ImageNet. We use stochastic weight averaging (Izmailov et al., 2018) with a decay rate of 0.995, as in prior work (Gowal et al., 2020). We use a cosine learning rate schedule (Loshchilov & Hutter, 2016) without restarts, with the initial learning rate set to 0.1 for all baselines and DyART. To alleviate robust overfitting (Rice et al., 2020), we perform early stopping on a validation set of size 1024 using projected gradient descent (PGD) attacks with 20 steps.

Baselines. On CIFAR-10, the baselines include: (1) standard adversarial training (AT) (Madry et al., 2017), which trains on worst-case adversarial examples; (2) TRADES (Zhang et al., 2019), which trades off between clean and robust accuracy; (3) MMA (Ding et al., 2020), which uses the cross-entropy loss on the closest boundary points; (4) GAIRAT (Zhang et al., 2021), which reweights adversarial examples based on the least number of perturbation iterations; (5) MAIL (Liu et al., 2021), which reweights adversarial examples based on their logit margins; (6) AWP (Wu et al., 2020), which adversarially perturbs both inputs and model parameters. On Tiny-ImageNet, we compare with AT, TRADES, and MART, whose hyperparameter settings are available for this dataset. The hyperparameters of the baselines and the full experimental settings can be found in Appendix D.1.

Evaluation details. We evaluate DyART and the baselines under ℓ∞ norm-constrained perturbations. The final robust accuracy is reported under AutoAttack (AA) (Croce & Hein, 2020b).
For all methods, we choose the hyperparameters that achieve the best robust accuracy under the commonly used perturbation bound ϵ = 8/255. To compare the robustness of the different methods more fully, we also report the robust accuracy under four additional perturbation bounds: 2/255, 4/255, 12/255, and 16/255.

Hyperparameters of DyART. We use the cost function h(•) in equation 5. On CIFAR-10, we use α = 3, r0 = 16/255, λ = 1000, and gradient clipping with threshold 0.1. On Tiny-ImageNet, we use α = 5, r0 = 32/255, λ = 500, and gradient clipping with threshold 1. The temperature β is set to 5. We use 20 iterations of the adapted version of FAB to find the closest soft boundary points, and 10 epochs of natural training as the burn-in period.

Performance. The evaluation results on CIFAR-10 and Tiny-ImageNet are shown in Table 1 and Table 2, respectively. On CIFAR-10, DyART achieves the best robustness among all baselines under three out of five perturbation bounds. On Tiny-ImageNet, DyART obtains the highest robust accuracy under all perturbation bounds and also achieves the highest clean accuracy. These results indicate the effectiveness of DyART in increasing margins. (1) Specifically, on CIFAR-10, DyART achieves the highest robust accuracy under ϵ = 2/255, 4/255, and 8/255, and the second highest under ϵ = 12/255 and 16/255, where it is below TRADES. (1a) Since DyART prioritizes increasing smaller margins, which are more important, it performs better than TRADES under smaller perturbation bounds and achieves much higher clean accuracy. (1b) Although GAIRAT and AT have higher clean accuracy than DyART, their robustness is lower than DyART's under all perturbation bounds.
(1c) Thanks to directly operating on margins in the input space and encouraging robustness improvement for points with smaller margins, DyART performs better than GAIRAT and MAIL-TRADES, which use indirect approximations of the margins. (2) On Tiny-ImageNet, DyART achieves the best clean accuracy and the best robust accuracy under all perturbation bounds. Further experimental results on both datasets, using various hyperparameter settings and types of normalization layers, are left to Appendix D.2. We also provide results for DyART when trained with additional data from generative models (Rebuffi et al., 2021).

DyART mitigates the conflicting dynamics. We visualize the dynamics on both pretrained models under DyART and AT in Figure 5. Specifically, we divide the range of margins into multiple intervals and compute the proportion of positive and negative speeds within all correctly classified points. On the naturally pretrained model, most points have margins less than 4/255 (the first bin) and are considered more vulnerable. Among these points, DyART reduces the proportion of negative speeds from 29.2% to 15.3% compared with AT. Therefore, a higher percentage of the margins of vulnerable points increase when using DyART. On the adversarially pretrained model, DyART reduces the proportion of negative speed values in the first three margin intervals and therefore yields better dynamics of the decision boundary. We conclude that, compared with AT, DyART leads to better dynamics of the decision boundary, where increasing smaller margins is prioritized.

A ADDITIONAL RELATED WORK

Decision boundary analysis. In this paper, we mathematically characterize the dynamics of the decision boundary and provide methods to directly compute and control these dynamics. Prior to this work, there have also been interesting studies on the dynamics of margins, though from different perspectives; for example, Rade & Moosavi-Dezfooli (2022) study the increase in the margin along adversarial directions during robust training.

Other Approaches to Improve Adversarial Training. Recent works (Najafi et al., 2019; Rebuffi et al., 2021; Gowal et al., 2021; 2020) have shown that the robust accuracy of adversarial training can be improved significantly with additional data from unlabeled datasets, data augmentation techniques, and generative models. These approaches enhance the robustness of models by augmenting the dataset, which is orthogonal to our proposed algorithm, which focuses on how to optimize the model with the original dataset. Wu et al. (2020) show that model robustness is related to the flatness of the weight loss landscape, which is implicitly achieved by commonly used adversarial learning techniques. Based on this insight, the authors propose to explicitly regularize the flatness of the weight loss landscape, which can improve the robust accuracy of existing adversarial training methods. Cui et al. (2021) propose to use logits from a clean model to guide the learning of a robust model, which leads to both high natural accuracy and strong robustness. We note that our method focuses on a different aspect of adversarial training, namely the dynamics of the decision boundary, and can be combined with these techniques to further improve robust accuracy. The investigation of such combinations is out of the scope of this paper and will be addressed in future work.

Certifiable Robustness. There is an important line of work studying guaranteed robustness of neural networks.
For example, convex relaxation of neural networks (Gowal et al., 2019; Zhang et al., 2018; Wong & Kolter, 2018; Zhang et al., 2020a; Gowal et al., 2018) bounds the output of a network while the input data is perturbed within an ℓp norm ball. Randomized smoothing (Cohen et al., 2019) is another certifiable defense, which adds Gaussian noise to the input at test time. Croce et al. (2019) propose a provably robust regularization for ReLU networks that maximizes the linear regions of the classifier and the distance to the decision boundary. Note that the certified robust radius is a strict lower bound on the margin, which is the focus of our work.

B PROOF OF THE CLOSED-FORM EXPRESSION FOR THE SPEED IN EQUATION 4 AND THE MARGIN GRADIENT IN EQUATION 6

In this section, our goal is to prove the closed-form expression in Equation 4 as well as the margin gradient in Equation 6, and to provide further discussion. We first give two preliminary lemmas and state the mathematical assumptions; we then rigorously derive the closed-form expressions; finally, we discuss the expressions and their assumptions.

Lemma 6. For $1 \le p \le \infty$, let $q$ satisfy $1/p + 1/q = 1$ and let $a$ be any fixed vector. Then $\left\| \nabla_x \|x - a\|_p \right\|_q = 1$.

Proof. Without loss of generality, assume $a$ is the zero vector. Write the $k$-th component of $x$ as $x_k$.

Case 1: $1 \le p < \infty$. By direct calculation, $\frac{\partial \|x\|_p}{\partial x_k} = \left(\frac{|x_k|}{\|x\|_p}\right)^{p-1} \operatorname{sign}(x_k)$. Since $q = \frac{p}{p-1}$, we have
$$\sum_k \left|\frac{\partial \|x\|_p}{\partial x_k}\right|^q = \sum_k \left|\left(\frac{|x_k|}{\|x\|_p}\right)^{p-1} \operatorname{sign}(x_k)\right|^{\frac{p}{p-1}} = \frac{\sum_k |x_k|^p}{\|x\|_p^p} = 1.$$
Therefore $\left\|\nabla_x \|x\|_p\right\|_q = \left(\sum_k \left|\partial \|x\|_p / \partial x_k\right|^q\right)^{1/q} = 1$.

Case 2: $p = \infty$. In this case, $\nabla_x \|x\|_\infty$ is a one-hot vector (with the one at the position of the element of $x$ with the largest absolute value). Therefore $\left\|\nabla_x \|x\|_\infty\right\|_1 = 1$.

The following lemma deals with the optimality condition for $p = \infty$; special care is needed since the $\ell_\infty$ norm is not differentiable.

Lemma 7. Let $\hat{x}$ be a local optimum of the constrained optimization problem $\hat{x} = \arg\min_x \|x - a\|_\infty$ s.t. $\phi(x) = 0$, where $a$ is any fixed vector with $\phi(a) > 0$. Assume that $\phi$ is differentiable at the point $\hat{x}$. Denote the coordinate set $\mathcal{J} = \{j : |\hat{x}_j - a_j| = \|\hat{x} - a\|_\infty\}$ and the $k$-th component of $\nabla_x \phi(\hat{x})$ by $\nabla_x \phi(\hat{x})_k$. Then (a) for $j \in \mathcal{J}$, $\nabla_x \phi(\hat{x})_j$ and $\hat{x}_j - a_j$ have opposite signs; (b) for $k \notin \mathcal{J}$, $\nabla_x \phi(\hat{x})_k = 0$.

Remark. If $\phi(a) < 0$, then (a) for $j \in \mathcal{J}$, $\nabla_x \phi(\hat{x})_j$ and $\hat{x}_j - a_j$ have the same sign; (b) for $k \notin \mathcal{J}$, $\nabla_x \phi(\hat{x})_k = 0$.

Proof. (a) Consider the perturbation $x(\epsilon) = \hat{x} + (0, \cdots, \epsilon_{j_1}, \cdots, \epsilon_{j_m}, \cdots, 0)$, where $\mathcal{J} = \{j_1, \cdots, j_m\}$ and $\epsilon$ is an $m$-dimensional vector with $j$-th component $\epsilon_j$. Since $\phi(a) > 0$ and $\hat{x}$ is a local optimum, $\|x - a\|_\infty < \|\hat{x} - a\|_\infty$ implies $\phi(x) > 0$ whenever $x$ is sufficiently close to $\hat{x}$.
Therefore, if every $\epsilon_{j_i}$ is chosen so that $|\hat{x}_{j_i} + \epsilon_{j_i} - a_{j_i}| < |\hat{x}_{j_i} - a_{j_i}|$ (that is, $\epsilon_{j_i}$ has the opposite sign of $\hat{x}_{j_i} - a_{j_i}$) and $\|\epsilon\|$ is sufficiently small, then $\|x(\epsilon) - a\|_\infty < \|\hat{x} - a\|_\infty$ and thus $\phi(x(\epsilon)) > 0$. On the other hand, by Taylor expansion and the fact that $\phi(\hat{x}) = 0$, we have
$$\phi(x(\epsilon)) = \sum_{j \in \mathcal{J}} \nabla \phi(\hat{x})_j \, \epsilon_j + O(\|\epsilon\|^2).$$
Therefore $\sum_{j \in \mathcal{J}} \nabla \phi(\hat{x})_j \, \epsilon_j > 0$ for any such $\epsilon$. By taking the other $\epsilon_k \to 0$ if necessary, we obtain that for all $j \in \mathcal{J}$, $\nabla \phi(\hat{x})_j \, \epsilon_j \ge 0$, where $\epsilon_j$ has the opposite sign of $\hat{x}_j - a_j$. Therefore $\nabla \phi(\hat{x})_j$ and $\hat{x}_j - a_j$ have opposite signs.

(b) Take any $k \notin \mathcal{J}$ and consider the perturbation $x(\epsilon) = \hat{x} + (0, \cdots, \epsilon_{j_1}, \cdots, \epsilon_k, \cdots, \epsilon_{j_m}, \cdots, 0)$, where $\epsilon = (\epsilon_{j_1}, \cdots, \epsilon_k, \cdots, \epsilon_{j_m})$. Choose $\epsilon$ so that $\|\epsilon\|$ is sufficiently small, each $\epsilon_{j_i}$ has the opposite sign of $\hat{x}_{j_i} - a_{j_i}$, and $\epsilon_k$ is small enough (it can be positive or negative); then $\phi(x(\epsilon)) > 0$ since $\|x(\epsilon) - a\|_\infty < \|\hat{x} - a\|_\infty$. By Taylor expansion, $\sum_{j \in \mathcal{J}} \nabla \phi(\hat{x})_j \, \epsilon_j + \epsilon_k \nabla_x \phi(\hat{x})_k > 0$ for any such $\epsilon$. By taking $\epsilon_j \to 0$ and using the fact that $\epsilon_k$ can take either sign, we conclude that $\nabla_x \phi(\hat{x})_k = 0$.

Now we are ready to derive the closed-form expression for the speed. We first state the full assumptions, then derive the expression, and finally discuss the assumptions. We write $\hat{x}_i(t)$ as $\hat{x}_i$ when the meaning is clear.

Assumption 8. Suppose that $(x_i, y_i)$ is correctly classified by $f_{\theta(t)}$ in some time interval $t \in I$ and that $\hat{x}_i(t)$ is a locally closest boundary point, in the sense that for any $t \in I$ it is a local optimum of
$$\hat{x}_i(t) = \arg\min_x \|x - x_i\|_p \quad \text{s.t.} \quad \phi_{y_i}(x, \theta(t)) = 0.$$
Assume that in the time interval $I$: (a) $\hat{x}_i(t)$ is differentiable in $t$; (b) $\phi_{y_i}$ is differentiable at the point $\hat{x}_i(t)$ and at the current parameter $\theta(t)$.

Proposition (Closed-form expression of the speed $s(x_i, t)$).
For $1 \le p \le \infty$ and under Assumption 8, define the (local) speed according to $\hat{x}_i(t)$ in Assumption 8 as $s(x_i, t) = \frac{d}{dt}\|\hat{x}_i(t) - x_i\|_p$. Then
$$s(x_i, t) = \frac{\nabla_\theta \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \theta'(t)}{\|\nabla_x \phi_{y_i}(\hat{x}_i(t), \theta(t))\|_q},$$
where $q$ satisfies $1/p + 1/q = 1$. In particular, $q = 2$ when $p = 2$ and $q = 1$ when $p = \infty$.

Proof. Case 1: $1 \le p < \infty$. To compute $s(x_i, t) = \frac{d}{dt}\|\hat{x}_i(t) - x_i\|_p$, we characterize the curve of the closest boundary point $\hat{x}_i(t)$, where two key facts stand out. First, $\hat{x}_i(t)$ lies on the decision boundary $\Gamma_{y_i}(t)$, so $\phi_{y_i}(\hat{x}_i(t), \theta(t)) = 0$ for all $t \in I$. Taking the time derivative of both sides yields the level set equation (Osher et al., 2004; Aghasi et al., 2011):
$$\nabla_x \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \hat{x}_i'(t) + \nabla_\theta \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \theta'(t) = 0. \tag{10}$$
Second, $\hat{x}_i(t)$ is an optimal solution of the constrained optimization problem in Equation 3, so the following optimality condition holds:
$$\nabla_x \phi_{y_i}(\hat{x}_i(t), \theta(t)) + \lambda(t) \, \nabla_x \|\hat{x}_i(t) - x_i\|_p = 0. \tag{11}$$
Since $x_i$ is correctly classified, $\phi_{y_i}(x_i) > 0$; since $\hat{x}_i(t)$ is the closest point to $x_i$ whose $\phi_{y_i}$ value is zero, $\lambda(t) > 0$. Taking the $\ell_q$ norm in Equation 11 and applying Lemma 6 gives $\lambda(t) = \|\nabla_x \phi_{y_i}(\hat{x}_i(t), \theta(t))\|_q$. We now derive $s(x_i, t)$:
$$s(x_i, t) = \frac{d}{dt}\|\hat{x}_i(t) - x_i\|_p = \nabla_x \|\hat{x}_i(t) - x_i\|_p \cdot \hat{x}_i'(t) = -\frac{1}{\lambda} \nabla_x \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \hat{x}_i'(t)$$
(by the optimality condition, Equation 11)
$$= \frac{1}{\lambda} \nabla_\theta \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \theta'(t)$$
(by the level set equation, Equation 10)
$$= \frac{\nabla_\theta \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \theta'(t)}{\|\nabla_x \phi_{y_i}(\hat{x}_i(t), \theta(t))\|_q}.$$

Case 2: $p = \infty$. Since the $\ell_\infty$ norm is not differentiable, the optimality condition in Equation 11 no longer holds. Denote the $j$-th components of $\hat{x}_i(t)$ and $x_i$ by $\hat{x}_{ij}(t)$ and $x_{ij}$, and let $\mathcal{J} = \{j : |\hat{x}_{ij}(t) - x_{ij}| = \|\hat{x}_i(t) - x_i\|_\infty\}$. By Lemma 7, for all $j \in \mathcal{J}$,
$$s(x_i, t) = \frac{d}{dt}|\hat{x}_{ij}(t) - x_{ij}| = \hat{x}_{ij}'(t)\,\operatorname{sign}(\hat{x}_{ij}(t) - x_{ij}) = -\hat{x}_{ij}'(t)\,\operatorname{sign}(\nabla_x \phi_{y_i}(\hat{x}_i)_j).$$
Therefore, by Equation 10 and Lemma 7,
$$-\nabla_\theta \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \theta'(t) = \nabla_x \phi_{y_i}(\hat{x}_i) \cdot \hat{x}_i'(t) = \sum_{j \in \mathcal{J}} \nabla_x \phi_{y_i}(\hat{x}_i)_j \, \hat{x}_{ij}'(t) = \sum_{j \in \mathcal{J}} -\nabla_x \phi_{y_i}(\hat{x}_i)_j \, s(x_i, t)\,\operatorname{sign}(\nabla_x \phi_{y_i}(\hat{x}_i)_j) = -\sum_{j \in \mathcal{J}} |\nabla_x \phi_{y_i}(\hat{x}_i)_j| \, s(x_i, t).$$
Therefore,
$$s(x_i, t) = \frac{\nabla_\theta \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \theta'(t)}{\sum_{j \in \mathcal{J}} |\nabla_x \phi_{y_i}(\hat{x}_i)_j|} = \frac{\nabla_\theta \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \theta'(t)}{\|\nabla_x \phi_{y_i}(\hat{x}_i)\|_1},$$
where the last equality follows from Lemma 7: the components of $\nabla_x \phi_{y_i}(\hat{x}_i)$ outside $\mathcal{J}$ are zero.

As a corollary of the proposition proved above, we obtain the closed-form expression for the gradient of the margin (or of any smooth function of the margin):

Theorem (Closed-form expression of $\nabla_\theta h(R_\theta(x_i))$). For $1 \le p \le \infty$,
$$\nabla_\theta h(R_\theta(x_i)) = \frac{h'(R_\theta(x_i))}{\|\nabla_x \phi_{y_i}(\hat{x}_i, \theta)\|_q} \nabla_\theta \phi_{y_i}(\hat{x}_i, \theta),$$
where $q$ satisfies $1/p + 1/q = 1$.

Proof. In continuous time we consider $h(R(x_i, t))$ (more rigorously, $h(R(x_i, \theta(t)))$) and its time derivative. We use the following relation between the gradient and the time derivative, where $\theta'(t)$ can be any update rule:
$$\frac{d}{dt} h(R(x_i, t)) = \nabla_\theta h(R(x_i, t)) \cdot \theta'(t).$$
On the other hand,
$$\frac{d}{dt} h(R(x_i, t)) = h'(R(x_i, t)) \frac{d}{dt} R(x_i, t) = h'(R(x_i, t)) \, s(x_i, t) = \frac{h'(R(x_i, t))}{\|\nabla_x \phi_{y_i}(\hat{x}_i(t), \theta(t))\|_q} \nabla_\theta \phi_{y_i}(\hat{x}_i(t), \theta(t)) \cdot \theta'(t),$$
where the last equality uses the closed-form expression for the speed $s(x_i, t)$. Since this holds for any $\theta'(t)$, we conclude that $\nabla_\theta h(R_\theta(x_i)) = \frac{h'(R_\theta(x_i))}{\|\nabla_x \phi_{y_i}(\hat{x}_i, \theta)\|_q} \nabla_\theta \phi_{y_i}(\hat{x}_i, \theta)$.
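As a sanity check of the theorem, the closed-form margin gradient can be verified on a binary linear classifier, where the closest boundary point and the margin gradient are both available analytically. This is a toy sketch (all names are ours; the paper applies the formula to deep classifiers), with φ_y(x, θ) = y(w·x + b) and p = 2.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
w, b = rng.normal(size=d), 0.3
x = rng.normal(size=d)
y = 1.0 if w @ x + b > 0 else -1.0  # label chosen so x is correctly classified

nw = np.linalg.norm(w)
# Closest boundary point for p = 2: orthogonal projection onto {w . x + b = 0}.
x_hat = x - (w @ x + b) * w / nw**2

# Closed-form expression (taking h as the identity):
#   grad_theta R = grad_theta phi(x_hat) / ||grad_x phi(x_hat)||_2
grad_w_formula = y * x_hat / nw
grad_b_formula = y / nw

# Direct differentiation of the analytic margin R = y (w . x + b) / ||w||_2.
grad_w_analytic = y * x / nw - y * (w @ x + b) * w / nw**3
grad_b_analytic = y / nw
```

The two gradients agree exactly, since y·x̂/‖w‖ expands to y·x/‖w‖ − y(w·x + b)w/‖w‖³.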

DISCUSSIONS ON THE ASSUMPTIONS

Assumption 8 has several points that need to be explained further. 

C COMPUTATION OF THE EXACT AND SOFT CLOSEST BOUNDARY POINT

Either computing the speed of the decision boundary or using DyART to directly optimize a function of the margins requires computing the closest boundary point $\hat{x}$ (or the closest soft boundary point $\hat{x}_{\text{soft}}$); we omit the subscript $i$ in this section. As discussed in Appendix B, it suffices to find a locally closest (soft) boundary point for the closed-form expressions 4 and 6 for the speed and the margin gradient to be valid.
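For intuition, a point on the decision boundary (generally not the closest one) can be located by bisecting the logit margin φ along the segment between a correctly classified point and a misclassified one. FAB is far more sophisticated; this toy sketch (helper name is ours) only illustrates the constraint φ(x̂) = 0 that any candidate must satisfy.

```python
import numpy as np

def boundary_point_bisection(phi, x, x_adv, tol=1e-6, max_iter=100):
    """Bisect along the segment from x (phi(x) > 0) to x_adv (phi(x_adv) < 0)
    until a point with phi close to zero is found.  The result lies on the
    decision boundary but is generally NOT the closest boundary point."""
    lo, hi = 0.0, 1.0  # phi positive at t=0, negative at t=1
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        v = phi(x + mid * (x_adv - x))
        if abs(v) < tol:
            break
        if v > 0:
            lo = mid
        else:
            hi = mid
    return x + 0.5 * (lo + hi) * (x_adv - x)
```

With the toy boundary `phi = lambda z: 1.0 - np.linalg.norm(z)` (the unit sphere), bisecting from the origin towards `[2, 0]` recovers a point of norm one.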

C.1 CLOSEST BOUNDARY POINT

In this section, we explain how to check the quality of the found $\hat{x}$ for the constrained optimization problem 3 in practice. We also give a simple analysis of how FAB (Croce & Hein, 2020a), the algorithm used in our implementation, solves problem 3 in practice. We include both $p = 2$ and $p = \infty$, although only $p = \infty$ is used in this work; we discuss both in order to highlight the difference in checking optimality conditions for a smooth ($p = 2$) and a non-smooth ($p = \infty$) norm. The key facts for analyzing $\hat{x}$ are that $\phi_y(\hat{x}) = 0$ and the KKT conditions of problem 3.

Case 1: $p = 2$. In this case, the KKT condition is $\nabla_x \phi_y(\hat{x}) + \lambda(\hat{x} - x) = 0$ for some $\lambda > 0$ (since $\phi_y(x) > 0$); equivalently, $\frac{\nabla_x \phi_y(\hat{x})}{\|\nabla_x \phi_y(\hat{x})\|} \cdot \frac{x - \hat{x}}{\|x - \hat{x}\|} = 1$. In practice, we check the following two conditions: (a) $|\phi(\hat{x})| \le 0.1$; (b) $\frac{\nabla_x \phi_y(\hat{x})}{\|\nabla_x \phi_y(\hat{x})\|} \cdot \frac{x - \hat{x}}{\|x - \hat{x}\|} > 0.8$. We observe in our experiments that FAB can find high-quality closest boundary points for over 90% of the correctly classified data points.

Case 2: $p = \infty$. In this case, we consider the optimality condition given in Lemma 7 of Appendix B. Denote by $\#B$ the number of points in a set $B$. Using the notation $\mathcal{J} = \{j : |\hat{x}_j - x_j| = \|\hat{x} - x\|_\infty\}$ and $\mathcal{J}^C$ for its complement, we check the following conditions in practice: (a) $|\phi(\hat{x})| \le 0.1$; (b) $\frac{\#\{j \in \mathcal{J} : \nabla_x \phi_y(\hat{x})_j (\hat{x}_j - x_j) \le 0\}}{\#\mathcal{J}} > 0.9$; (c) $\frac{\#\{k \notin \mathcal{J} : |\nabla_x \phi_y(\hat{x})_k| < 0.1\}}{\#\mathcal{J}^C} > 0.8$. Note that unlike $p = 2$, the optimality conditions for $p = \infty$ are imposed on each coordinate of $\hat{x}$, which is more difficult to satisfy in practice. We observe in our experiments that FAB with 100 iterations can find high-quality closest boundary points for about 85% of the correctly classified points. However, when only 20 iterations are used, condition (c) is barely satisfied by any of the found boundary points (the first two conditions are still satisfied). In our visualizations of the dynamics of the decision boundary for AT in Section 4.2, we use 100 iterations of FAB and only use high-quality closest boundary points, so that the visualization results are relatively accurate.
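The coordinate-wise checks for p = ∞ can be sketched as follows (hypothetical helper name; the thresholds follow the text):

```python
import numpy as np

def check_linf_optimality(x, x_hat, grad_phi, phi_val,
                          tol_phi=0.1, frac_b=0.9, frac_c=0.8, grad_tol=0.1):
    """Heuristic check of the coordinate-wise optimality conditions (Lemma 7)
    for a candidate closest boundary point x_hat under the l_inf norm."""
    diff = x_hat - x
    r = np.abs(diff).max()
    J = np.abs(diff) >= r - 1e-12       # coordinates attaining the l_inf distance
    # (a) x_hat lies (approximately) on the decision boundary
    cond_a = abs(phi_val) <= tol_phi
    # (b) on J, grad_phi and x_hat - x should have opposite signs
    cond_b = np.mean(grad_phi[J] * diff[J] <= 0) > frac_b
    # (c) off J, grad_phi should (approximately) vanish
    off = ~J
    cond_c = True if off.sum() == 0 else np.mean(np.abs(grad_phi[off]) < grad_tol) > frac_c
    return cond_a and cond_b and cond_c
```

For example, with `x_hat - x = [0.5, -0.5, 0.1]` a candidate passes when the gradient is `[-1, 1, 0]` (opposite signs on the two maximal coordinates, negligible elsewhere) and fails when the first gradient component flips sign.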

C.2 CLOSEST SOFT BOUNDARY POINT

Adapt FAB for the soft decision boundary. In DyART, the closest point $\hat{x}_{\text{soft}}$ on the soft decision boundary is used. To find $\hat{x}_{\text{soft}}$, we adapt the FAB method. The original FAB method aims to find the closest point on the exact decision boundary; in particular, it forms linear approximations of the decision boundary between the ground-truth class and every other class. Our only adaptation is that FAB now forms a single linear approximation of the soft decision boundary of the ground-truth class. This is because we use the smoothed max operator in the soft logit margin, so there is no longer a notion of the 'decision boundary between the ground-truth class and another class'.

Computational efficiency. With the soft decision boundary, every iteration of FAB requires only one linear approximation of the soft decision boundary, which costs one back-propagation. In contrast, the original FAB, which seeks the closest point on the exact decision boundary, costs $K$ back-propagations per iteration, where $K$ is the number of classes. Therefore, using the soft decision boundary is more efficient, and it is what our proposed robust training method DyART uses.

Local optimality condition. The procedure for checking the optimality condition is similar to that of the last section. Denote by $\#B$ the number of points in a set $B$. Using the notation $\mathcal{J} = \{j : |\hat{x}_j - x_j| = \|\hat{x} - x\|_\infty\}$ and $\mathcal{J}^C$ for its complement, we check the following conditions in practice: (a) $|\phi(\hat{x})| \le 0.1$; (b) $\frac{\#\{j \in \mathcal{J} : \nabla_x \phi_y(\hat{x})_j (\hat{x}_j - x_j) \le 0\}}{\#\mathcal{J}} > 0.9$. We find that when the temperature $\beta$ is relatively large (we use $\beta = 5$ in all of our experiments) and 20 iterations are used, 95% of the soft boundary points found for the correctly classified points satisfy these two conditions. During training, we only use these higher-quality points and discard the rest of the boundary points.
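A minimal sketch of a soft logit margin built from a logsumexp smoothed max follows. We assume this particular smoothed-max form for illustration (the exact definition of the soft logit margin is given in the main text); the helper name is ours.

```python
import numpy as np

def soft_logit_margin(logits, y, beta=5.0):
    """Soft logit margin using a logsumexp smoothed max over the other
    classes.  As beta -> inf it recovers z_y - max_{k != y} z_k, whose zero
    level set is the exact decision boundary of class y."""
    z = np.asarray(logits, dtype=float)
    others = np.delete(z, y)
    m = others.max()  # shift for a numerically stable logsumexp
    smooth_max = m + np.log(np.exp(beta * (others - m)).sum()) / beta
    return z[y] - smooth_max
```

Since the smoothed max upper-bounds the true max, the soft margin lower-bounds the exact logit margin, and the gap vanishes as β grows.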
Note that we do not consider the third condition (c) $\frac{\#\{k \notin \mathcal{J} : |\nabla_x \phi_y(\hat{x})_k| < 0.1\}}{\#\mathcal{J}^C} > 0.8$. This is because condition (c) cannot be satisfied unless a very large number of iterations is used, which is computationally prohibitive for robust training. Experimentally, DyART achieves improved robustness over the baseline methods, indicating that the closest soft boundary points used by DyART are indeed useful for robust training. Designing faster and more reliable methods for solving the constrained optimization problem 3 is left for future work.

size 0.007 using PGD-10, and the 'tanh' weight assignment function is used. (5) MAIL (Liu et al., 2021), which reweights adversarial examples using margin values; we choose its combination with TRADES (MAIL-TRADES), which provides better robustness performance than combining with AT (MAIL-AT), and its hyperparameters beta, bias and slope are set to 5.0, -1.5 and 1.0, respectively. (6) AWP (Wu et al., 2020), which adversarially perturbs both inputs and model parameters. (7) FAT (Zhang et al., 2020b), which exploits friendly adversarial data, where the perturbation bound is set to 8/255. (8) MART (Wang et al., 2019), which explicitly differentiates between mis-classified and correctly classified examples. On Tiny-ImageNet, we compare with AT, TRADES and MART, whose hyperparameter settings are available for this dataset. We follow the PyTorch implementation of Gowal et al. (2020); Rebuffi et al. (2021) for AT, TRADES and MART on both datasets.

Evaluation details. We evaluate DyART and the baselines under ℓ∞ norm constrained perturbations. The final robust accuracy is reported under AutoAttack (AA) (Croce & Hein, 2020b), which uses an ensemble of selected strong attacks. For all methods, we choose the hyperparameters achieving the best robust accuracy under the commonly used perturbation bound ϵ = 8/255.
To fully compare the robustness performance of the different methods, we report the robust accuracy under four additional perturbation bounds: 2/255, 4/255, 12/255 and 16/255.

Per-sample gradients. For computing the speed of the decision boundary in Section 4.2 and Section 6.2, we need the per-sample gradient $\nabla_\theta \phi_{y_i}(\hat{x}_i, \theta)$ for every correctly classified point $x_i$. We use the Opacus package (Yousefpour et al., 2021) to compute per-sample gradients in parallel. Another reason we replace BN with GN is that Opacus does not support BN when computing per-sample gradients. Although using this package increases memory usage, it is worth mentioning that during robust training DyART does not need to compute per-sample gradients and thus does not have this memory issue; per-sample gradients are collected only for computing the speed, which serves to interpret the dynamics of the different methods and is not part of robust training.

D.2 HYPERPARAMETER SENSITIVITY EXPERIMENTS

In this section, we present the robustness performance of DyART under different hyperparameter settings. We first show the results for architectures using Group Normalization (note that in Section 6.1 we use the original architectures with Batch Normalization) and analyze the effect of the different hyperparameters. We then present more ablation experiments for the architectures with Batch Normalization used in Section 6.1.

Overall performance of DyART on architectures with GN. In Table 3 and Table 4, the overall comparison is demonstrated. (1a) Since FAT-TRADES prevents the model from learning on highly adversarial data in order to keep the clean accuracy high, it achieves the best clean accuracy and the best robustness under the very small perturbation bound 2/255; however, its performance under larger perturbation bounds is inadequate. (1b) By directly operating on margins in the input space and encouraging robustness improvement on points with smaller margins, DyART performs better than GAIRAT and MAIL-TRADES, which use indirect approximations of the margins. (2) On Tiny-ImageNet, DyART achieves the best clean accuracy and the best robust accuracy under all perturbation bounds except the largest, 16/255. Although MART is the most robust under 16/255, it has much lower clean accuracy (8.93% lower than DyART) and worse robustness under smaller perturbation bounds.

Hyperparameters of DyART

In this paper, we use a cost function of the form $h(R) = \frac{1}{\alpha}\exp(-\alpha R)$ when $R < r_0$ and $h(R) = 0$ otherwise. We present results under different decay strengths $\alpha > 0$, margin thresholds $r_0$, and regularization constants $\lambda$ for the robustness loss.

Performance results. The evaluation results on CIFAR-10 and Tiny-ImageNet with Group Normalization are shown in Table 5 and Table 6, respectively. We analyze the effects of the hyperparameters as follows. (1) Effect of $\alpha$: a larger $\alpha$ corresponds to a cost function $h(\cdot)$ that decays faster and therefore prioritizes improvement on even smaller margins. It is thus expected that a larger $\alpha$ leads to higher clean accuracy and higher robust accuracy under smaller perturbation sizes, and to lower robust accuracy under larger perturbation sizes. For example, on CIFAR-10, when $\alpha = 5$ is increased to $\alpha = 8$ (with $r_0 = 16/255$, $\lambda = 400$), the clean accuracy as well as the robust accuracy under $\epsilon = 2/255$, $4/255$ and $8/255$ improves, while the robust accuracy under larger $\epsilon$ decreases. The same pattern can be observed on Tiny-ImageNet, for example when $\alpha = 8$ is increased to $\alpha = 10$ (with $r_0 = 20/255$, $\lambda = 1000$). (2) Effect of $r_0$: the threshold $r_0$ prevents DyART from training on boundary points that are too far away from the clean data points. It is thus expected that training with a smaller $r_0$ tends to increase the clean accuracy and the robust accuracy under relatively small perturbation sizes. Indeed, on Tiny-ImageNet, when $r_0 = 24/255$ is decreased to $r_0 = 20/255$ (with $\alpha = 10$, $\lambda = 1000$), the clean accuracy as well as the robust accuracy under $\epsilon = 2/255$ increases, but the robust accuracy under the larger perturbation sizes $\epsilon = 12/255$ and $16/255$ decreases.
(3) Effect of the robust loss constant $\lambda$: a larger $\lambda$ tends to increase the robustness of the model (in particular, the robust accuracy under relatively large perturbation sizes) while decreasing the clean accuracy and the robust accuracy under relatively small perturbation sizes. For example, on Tiny-ImageNet, when $\lambda = 800$ is increased to $\lambda = 1000$ (with $\alpha = 10$, $r_0 = 20/255$), the clean accuracy and the robust accuracy under the relatively small $\epsilon = 2/255$ and $4/255$ drop, but the robust accuracy under larger perturbation sizes increases. (4) Effect of the burn-in period: a burn-in period of natural training is necessary for DyART, since its robust loss depends on the closest boundary points, which can only be found for correctly classified points; that is, DyART requires a decent initial clean accuracy. In our experiments, we find that the learning rate of the burn-in period is important: DyART trains successfully if the burn-in learning rate is relatively large (e.g., 0.1 for CIFAR-10 and Tiny-ImageNet). However, when the learning rate is small (such as 0.001), DyART sometimes drives the clean accuracy very low at first and fails to train. Our suggestion is to use a larger learning rate to obtain the naturally pretrained model.
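The cost function discussed above can be sketched as a minimal NumPy implementation (the helper name is ours); note how a larger α makes the cost, and hence the training pressure, concentrate on the smallest margins.

```python
import numpy as np

def h(R, alpha, r0):
    """DyART cost function: (1/alpha) * exp(-alpha * R) for margins R below
    the threshold r0, and 0 otherwise.  Larger alpha decays faster, so points
    with smaller margins receive relatively more weight; r0 excludes boundary
    points that are already far from the clean data."""
    R = np.asarray(R, dtype=float)
    return np.where(R < r0, np.exp(-alpha * R) / alpha, 0.0)
```

For instance, with r0 = 16/255, a margin of 1.0 contributes nothing, while smaller margins contribute monotonically more.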

D.4 FURTHER ANALYSIS ON DECISION BOUNDARY DYNAMICS

In Section 4.2 and Section 6.2 we presented the dynamics of both AT and DyART on the same pretrained models using the same batch of data at one iteration. In this section, we demonstrate the dynamics of the decision boundary throughout the whole training process.

Experiment setting. To study the decision boundary dynamics throughout training, we train a ResNet-18 (He et al., 2016a) with group normalization (GN) (Wu & He, 2018) on CIFAR-10 using (1) Adversarial Training with 10-step PGD under ℓ∞ perturbation with ϵ = 8/255, trained from scratch; and (2) DyART with α = 8, λ = 400, r_0 = 16/255, trained from a naturally pretrained model. The models are trained with an initial learning rate of 0.01, decayed to 0.001 at the 20,000th iteration. At each iteration, we compute the proportion of negative speed among points with margins smaller than 8/255, which are regarded as vulnerable.

Conflicting dynamics throughout training. In Figure 6, the clean and robust accuracy of both methods are shown, along with the proportion of negative speed among vulnerable points. We apply curve smoothing to the negative-speed proportion plot for better visualization. Note that we omit the initial part of training (the first 10 epochs), since at this stage there are not enough correctly classified data points, and speed and margins are only defined for such points. Both methods exhibit some degree of robust overfitting, where the training robust accuracy becomes larger than the test robust accuracy. In addition, the conflicting dynamics persist throughout the whole training process, since the proportion of negative speed is never zero. DyART consistently exhibits less conflicting dynamics than AT. Interestingly, the proportion of negative speed decreases over time for both methods; the connection between this decreasing degree of conflicting dynamics on the training data and the robust overfitting phenomenon is left for future research.

D.5 RUN TIME ANALYSIS

In this section we provide a run time analysis. The main computational bottleneck of DyART is finding the closest boundary points, which uses an iterative algorithm adapted from FAB. Each iteration costs one back-propagation, the same as Projected Gradient Descent (PGD). Once the closest boundary point candidates are found, we check whether the KKT conditions are approximately satisfied and filter out points that do not meet them; this step takes one additional back-propagation. We use the torch.cuda.Event functionality in PyTorch to measure the execution time of one iteration of each method. For DyART, this means measuring the total time of finding the closest boundary points and back-propagating through the full loss function. We use ResNet-18 with GroupNorm and a batch size of 128 on the CIFAR-10 dataset, on one NVIDIA RTX A4000. The results are as follows:
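The measurement protocol (warm up, then average over repeated runs) can be sketched CPU-side as follows; the paper times GPU kernels with torch.cuda.Event, so this time.perf_counter harness is only an illustrative stand-in with a hypothetical helper name.

```python
import time
import statistics

def time_iteration(step_fn, n_warmup=3, n_runs=10):
    """Return (mean, stdev) in milliseconds of one call to step_fn,
    after n_warmup untimed warm-up calls."""
    for _ in range(n_warmup):
        step_fn()
    samples = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        step_fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)
```

For GPU timing, the same structure applies but with CUDA events recorded before and after the step and a device synchronization before reading the elapsed time.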






Figure 2: The curve of the closest boundary point x(t) (in blue) of the data point x.

Figure 3: Margin-speed plot of AT on a training batch. Among points with margins smaller than 8/255, there are 28.8% and 29.4% of points with negative speed on the two pretrained models, respectively.

Figure 4: Exact decision boundary (in blue) for three classes (yellow, green and grey regions) and the soft decision boundary (in red) for the class of x.

in Appendix D.3.

6.2 DYNAMICS OF DYART

In this section we provide further insights into how DyART encourages the desirable dynamics by comparing it with adversarial training.

Experimental setting. To compare the dynamics of the decision boundary during training under DyART and AT, we empirically compute the margins and speed values for both methods. For a fair comparison, we run DyART and AT on the same pretrained models for one iteration on the same batch of training data points. The pretrained models comprise a partially trained model using natural training and a partially trained model using AT, the same as in Section 4.2. For all correctly classified points in this batch, we compute the margins and speed values under both methods. Note that the speed and margins correspond to the exact decision boundary, not the soft decision boundary used by DyART for robust training. Since both methods train on the same model and the same batch of data at this iteration, the margins are identical and only the speed values differ, which corresponds to the difference in the dynamics of the decision boundary.

Figure 5: Proportion of positive and negative speed values in each margin interval for AT and DyART on a naturally pretrained model and a partially robust model. Observe that DyART has a lower proportion of negative speed for points with small margins (< 8/255).

CONCLUSIONS AND DISCUSSIONS

This paper takes one more step towards understanding adversarial training by proposing a framework for studying the dynamics of the decision boundary. The phenomenon of conflicting dynamics is revealed, where the movement of the decision boundary causes the margins of many vulnerable points to decrease, harming their robustness. To alleviate the conflicting dynamics, we propose Dynamics-Aware Robust Training (DyART), which prioritizes moving the decision boundary away from more vulnerable points to increase their margins. Experiments on CIFAR-10 and Tiny-ImageNet demonstrate that DyART achieves improved robustness under various perturbation bounds. Future work includes (a) a theoretical understanding of the dynamics of adversarial training and (b) developing more efficient numerical methods for finding the closest boundary points for robust training.

point out that adversarial training leads to a superfluous increase in the margin along the adversarial directions, which can be a reason behind the trade-off between accuracy and robustness. Ortiz-Jimenez et al. (2020) investigate the relationship between data features and decision boundaries, and reveal several properties of CNNs and adversarial training; their results show that adversarial training exploits the sensitivity and invariance of models to improve robustness. Tramèr et al. (2020) study invariance-based adversarial examples and expose a fundamental trade-off between the commonly used sensitivity-based adversarial examples and the invariance-based ones, where the behaviors of decision boundaries are identified.


Figure 6: The accuracy of AT and DyART, as well as the proportion of negative speed among points whose margins are smaller than 8/255.

Natural training: 46 ± 0.9 ms
• AT (PGD-10 on cross-entropy loss): 531 ± 5.2 ms
• TRADES (PGD-10 on KL divergence loss): 573 ± 2.8 ms
• DyART (10 steps for finding the closest boundary point): 743 ± 10.3 ms
• DyART (20 steps for finding the closest boundary point): 1171 ± 17.8 ms

Developing faster algorithms for finding the closest boundary points is left for future research.

Clean and robust accuracy on CIFAR-10 under AA with different perturbation sizes on WRN-28-10.

Clean and robust accuracy on Tiny-ImageNet under AA with different perturbation sizes on ResNet-18.

First, we only require that $\hat{x}_i(t)$ is a locally closest boundary point. This is important because in practice, when an algorithm for finding the closest boundary point is used (e.g., FAB), a local solution is the best one can hope for, due to the non-convex nature of the optimization problem. When $\hat{x}_i(t)$ is a local solution, the speed should be interpreted as how fast the distance changes around that local solution. In this case, although $\hat{x}_i$ is not the globally closest adversarial example, the local speed around $\hat{x}_i$ still carries much information about the relative movement of the decision boundary w.r.t. $x_i$, especially when the distance $\|\hat{x}_i - x_i\|$ is relatively small and the input space is high-dimensional (e.g., pixel space).

Clean and robust accuracy on CIFAR-10 under AA with different perturbation sizes on WRN-28-10 with Group Normalization. The hyperparameters for DyART are α = 8, r_0 = 16/255, λ = 400.

, the overall comparison between DyART and the baselines is demonstrated. Overall, on both datasets, DyART achieves the best robustness performance under four out of five perturbation bounds. This indicates the superiority of DyART in increasing margins. (1) Specifically, on CIFAR-10, DyART achieves the highest clean accuracy as well as the highest robust accuracy under all perturbation bounds among all baselines except FAT-TRADES. (1a) Since FAT-TRADES prevents the model from learning on

DyART 47.67 ± 0.15 38.19 ± 0.18 29.59 ± 0.14 17.79 ± 0.18 10.24 ± 0.13 5.41 ± 0.11

Clean and robust accuracy on Tiny-ImageNet under AA with different perturbation sizes on ResNet-18 with Group Normalization. The hyperparameters for DyART are α = 3, r_0 = 20/255, λ = 500.

Clean and robust accuracy on CIFAR-10 under AA with different perturbation bounds on WRN-28-10 with Group Normalization. The results for different sets of hyperparameters for DyART start from the eighth row.

Clean and robust accuracy on Tiny-ImageNet under AA with different perturbation bounds on ResNet-18 with Group Normalization. The results for different sets of hyperparameters for DyART start from the fourth row.

More ablation on architectures with BN. In Table 7 and Table 8, we present results for more hyperparameter settings for the experiments with Batch Normalization in Section 6.1. The role of each hyperparameter is similar to the GN case.

Clean and robust accuracy on CIFAR-10 under AA with different perturbation bounds on WRN-28-10 (with its original Batch Normalization layers). The results for different sets of hyperparameters for DyART start from the seventh row.

Clean and robust accuracy on Tiny-ImageNet under AA with different perturbation bounds on ResNet-18 (with its original Batch Normalization layers). The results for different sets of hyperparameters for DyART start from the fourth row.

Performance of DyART with additional data. The experimental results are shown in Table 9. We make two observations: (1) compared with the results in Table 7, the additional data drastically improves the robust accuracy of DyART (about a 6% boost in robust accuracy under ϵ = 8/255); (2) compared with the state-of-the-art results in Rebuffi et al. (2021), DyART achieves higher clean accuracy and comparable robust accuracy under ϵ = 8/255 (Rebuffi et al. (2021) do not report robust accuracy under other perturbation bounds).

Clean and robust accuracy on CIFAR-10 under l∞ AutoAttack with different perturbation sizes when 1M additional generated data from DDPM is used for training.

ACKNOWLEDGMENTS

The authors would like to thank Zhen Zhang, Chen Zhu and Wenxiao Wang for helpful discussions over the ideas. This work is supported by National Science Foundation NSF-IIS-FAI program, DOD-ONR-Office of Naval Research, DOD Air Force Office of Scientific Research, DOD-DARPA-Defense Advanced Research Projects Agency Guaranteeing AI Robustness against Deception (GARD), Adobe, Capital One and JP Morgan faculty fellowships.

D EXPERIMENTS

In this section, we provide the details of experimental settings and further results of DyART using various choices of hyperparameters. In addition, we provide experimental results when using additional data from the generated models. We also provide further analysis on the decision boundary dynamics.

D.1 DETAILED EXPERIMENTAL SETTINGS

Architectures and training settings. In all experiments on the CIFAR-10 dataset, we use a Wide Residual Network (Zagoruyko & Komodakis, 2016) with depth 28 and width factor 10 (WRN-28-10) with the Swish activation function (Ramachandran et al., 2017). On the Tiny-ImageNet dataset, we use a pre-activation ResNet-18 (He et al., 2016b). In all experiments, we use stochastic weight averaging (Izmailov et al., 2018) with a decay rate of 0.995, as in prior work (Gowal et al., 2020; Chen et al., 2020). All models are trained using stochastic gradient descent with momentum 0.9 and weight decay 0.0005. We use a cosine learning rate schedule (Loshchilov & Hutter, 2016) without restarts, where the initial learning rate is set to 0.1 for the baselines. To alleviate robust overfitting (Rice et al., 2020), we compute the robust and clean accuracy at every epoch on a validation set of size 1024 using projected gradient descent (PGD) attacks with 20 steps on the margin loss function. All experiments are run on NVIDIA GeForce RTX 2080 Ti GPUs.
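The cosine learning rate schedule without restarts used above can be sketched as follows (a standard form of the schedule; the step counts and minimum learning rate below are illustrative):

```python
import math

def cosine_lr(step, total_steps, lr_init=0.1, lr_min=0.0):
    """Cosine learning-rate schedule without restarts (Loshchilov & Hutter,
    2016): decays smoothly from lr_init at step 0 to lr_min at total_steps."""
    t = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_init - lr_min) * (1.0 + math.cos(math.pi * t))
```

At the midpoint the learning rate is exactly halfway between its initial and final values, which is why the schedule spends comparatively many steps at moderate rates before flattening out near the end.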

Normalization layers

We consider two types of normalization layers in WRN-28-10 and ResNet-18: Batch Normalization (BN, used in their original architecture design) and Group Normalization (GN). When using GN, the decision boundaries are the same during training and evaluation, which is consistent with our theoretical analysis of the decision boundary dynamics. In the following sections, we report the robustness performance in both cases: WRN-28-10 and ResNet-18 with BN and with GN. We find that when applying DyART to the original WRN-28-10 and ResNet-18 with BN, gradient clipping needs to be applied in order to learn the BN parameters stably. We apply gradient clipping with norm threshold 0.1 for the CIFAR-10 experiments on WRN-28-10 with BN, and with norm threshold 1 for Tiny-ImageNet on ResNet-18 with BN. For experiments on architectures with GN, we do not apply gradient clipping. Note that for all experiments computing speed and margins for interpretation (Section 4 and Section 6.2), we use ResNet-18 with GN.

Additional training settings. For experiments with Group Normalization, models are trained for 100 epochs on both datasets. For DyART on Tiny-ImageNet, we use a cosine learning rate schedule with initial learning rate 0.05; on CIFAR-10, the learning rate begins at 0.1 and is decayed by a factor of 10 at the 50th and 75th epochs. For experiments with Batch Normalization, models are trained for 200 epochs on CIFAR-10 and 100 epochs on Tiny-ImageNet. For DyART on both datasets, we use a cosine learning rate schedule (Loshchilov & Hutter, 2016) without restarts with initial learning rate 0.1, the same as the baselines.

Compared baselines and their hyperparameters. In all experiments we consider the ℓ∞ perturbation setting. On CIFAR-10, the baseline defense methods include: (1) standard adversarial training (AT) (Madry et al., 2017)

