SMOOTHED-SGDMAX: A STABILITY-INSPIRED ALGORITHM TO IMPROVE ADVERSARIAL GENERALIZATION

Abstract

Unlike in standard training, deep neural networks can suffer from serious overfitting in adversarial settings. Recent research (Xing et al., 2021b; Xiao et al., 2022) suggested that adversarial training can have a nonvanishing generalization error even as the sample size n goes to infinity. A natural question arises: can we eliminate the generalization error floor in adversarial training? This paper gives an affirmative answer. First, by adapting the information-theoretic lower bound on the complexity of solving Lipschitz-convex problems with randomized algorithms, we establish a minimax lower bound Ω(s(T)/n) on the generalization gap in non-smooth settings, given a training loss of 1/s(T), where T is the number of iterations and s(T) → +∞ as T → +∞. Next, observing that the nonvanishing generalization error of existing adversarial training algorithms stems from the non-smoothness of the adversarial loss function, we employ a smoothing technique to smooth the adversarial loss. Based on the smoothed loss function, we prove that a smoothed version of the SGDmax algorithm achieves a generalization bound O(s(T)/n), which eliminates the generalization error floor and matches the minimax lower bound. Experimentally, we show that the Smoothed-SGDmax algorithm improves adversarial generalization on common datasets.

1. INTRODUCTION

Deep neural networks (DNNs) (Krizhevsky et al., 2012; Hochreiter & Schmidhuber, 1997) are successful and rarely suffer from overfitting in standard training (Zhang et al., 2021); this phenomenon is also called benign overfitting. A well-trained neural network model can generalize well to the test data. In adversarial machine learning, however, overfitting becomes a serious issue (Rice et al., 2020): before the training algorithm converges, the robust test error starts to increase. This special type of overfitting is called robust overfitting and can be observed in experiments on common datasets; see Fig. 1, orange curve. Mitigating robust overfitting is therefore important for increasing the adversarial robustness of a DNN model. Several recent works tried to identify the causes of robust overfitting and designed methods to mitigate it; see the discussion in Sec. 2.

A recent line of work (Xing et al., 2021b; Xiao et al., 2022) studied the robust overfitting issue of adversarial training from a theoretical perspective, using the notion of uniform algorithmic stability. Uniform algorithmic stability (UAS) (Bousquet & Elisseeff, 2002) was introduced to bound the generalization gap in machine learning problems. It provides algorithm-specific generalization bounds instead of algorithm-free generalization bounds such as the classical results on VC-dimension (Vapnik & Chervonenkis, 2015) and Rademacher complexity (Bartlett & Mendelson, 2002). Such stability-based generalization bounds provide insight into the generalization ability of neural network models trained by different algorithms. Traditional adversarial training performs stochastic gradient descent (SGD) on the max function of the standard loss, which is also called SGDmax (Farnia & Ozdaglar, 2021). We will not distinguish between the two algorithms, "SGDmax" and "adversarial training (AT)", in this paper.
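To make the SGDmax scheme concrete, the sketch below applies it to a toy problem: linear regression under ℓ∞-bounded input perturbations. The inner maximization is approximated with a few steps of projected gradient ascent, and the outer minimization is a plain SGD step on the resulting adversarial loss. All function names, step sizes, and the choice of model here are illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np


def sgdmax(X, y, eps=0.1, lr=0.05, inner_steps=5, inner_lr=0.05,
           epochs=20, seed=0):
    """Illustrative SGDmax: SGD on the inner-maximized (adversarial) loss.

    Toy model (an assumption for this sketch, not the paper's setting):
    linear regression f(x) = w @ x with per-sample loss
    0.5 * (w @ (x + delta) - y)^2, maximized over ||delta||_inf <= eps.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            x, yi = X[i], y[i]
            # Inner maximization: projected gradient ascent on the loss
            # over the l_inf ball of radius eps around x.
            delta = np.zeros(d)
            for _ in range(inner_steps):
                resid = w @ (x + delta) - yi
                delta += inner_lr * resid * w          # ascent step on delta
                delta = np.clip(delta, -eps, eps)      # project onto the ball
            # Outer minimization: one SGD step on the adversarial loss.
            resid = w @ (x + delta) - yi
            w -= lr * resid * (x + delta)
    return w
```

Swapping the closed-form linear loss for a neural network and the ascent loop for PGD attacks recovers standard adversarial training; the outer/inner structure is the same.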
The works of Xing et al. (2021b) and Xiao et al. (2022) both showed that SGDmax incurs a stability-based generalization bound of O(c(T) + s(T)/n). Here T is the number of iterations, n is the number of samples, s(T) is a function satisfying s(T) → +∞ as T → +∞, and c(T) is a sample size-independent

