PROBABILISTIC CATEGORICAL ADVERSARIAL ATTACK & ADVERSARIAL TRAINING

Abstract

The existence of adversarial examples raises serious concerns about deploying Deep Neural Networks (DNNs) in safety-critical tasks. However, generating adversarial examples for categorical data is an important problem that lacks extensive exploration. Previously established methods rely on greedy search, which can be very time-consuming when conducting a successful attack. This also limits the development of adversarial training and potential defenses for categorical data. To tackle this problem, we propose the Probabilistic Categorical Adversarial Attack (PCAA), which transfers the discrete optimization problem into a continuous one that can be solved efficiently by Projected Gradient Descent. We theoretically analyze its optimality and time complexity to demonstrate its significant advantage over current greedy-based attacks. Moreover, based on our attack, we propose an efficient adversarial training framework. Through a comprehensive empirical study, we justify the effectiveness of our proposed attack and defense algorithms.

1. INTRODUCTION

Adversarial attacks (Goodfellow et al., 2015) have raised great concerns about the applications of Deep Neural Networks (DNNs) in many security-critical domains (Cui et al., 2019; Stringhini et al., 2010; Cao & Tay, 2001). The majority of existing methods focus on differentiable models with continuous input spaces, where gradient-based approaches can be applied to generate adversarial examples. However, there are many machine learning tasks where the input data are categorical. For example, data in ML-based intrusion detection systems (Khraisat et al., 2019) contain records of system operations, and in financial transaction systems, data include categorical information such as the types of transactions. Therefore, exploring potential attacks and corresponding defenses for categorical inputs is also desired. Existing methods introduce search-based approaches for categorical adversarial attacks (Yang et al., 2020b; Lei et al., 2019a). For example, the method in (Yang et al., 2020a) first finds the top-K features of a given sample that have the maximal influence on the model output; then, a greedy search is applied to obtain the optimal combination of perturbations on these K features. However, these search-based methods are not guaranteed to find the strongest adversarial examples. Moreover, they can be computationally expensive, especially when the data are high-dimensional and the number of categories for each feature is large. In this paper, we propose a novel Probabilistic Categorical Adversarial Attack (PCAA) algorithm that generates categorical adversarial examples by estimating their probability distribution. In detail, given a clean sample, we assume that (each feature of) the adversarial example follows a categorical distribution which satisfies: (1) samples drawn from this distribution have a high expected loss value, and (2) such samples differ from the original clean sample in only a few features.
(See Section 3 for more details.) In this way, we transfer the categorical adversarial attack from the discrete space to an optimization problem in a continuous probabilistic space. Thus, we are able to apply gradient-based methods such as (Madry et al., 2017) to find adversarial examples. On the one hand, the distribution of adversarial examples in PCAA is searched over the whole space of allowed perturbations. This helps our method find stronger adversarial examples (with higher loss values) than greedy search methods (Yang et al., 2020b). On the other hand, as the dimension of the input data grows, the computational cost of PCAA increases significantly more slowly than that of search-based methods (Section 3.4). Therefore, our method enjoys good attack optimality and computational efficiency simultaneously. For example, in our experiments in Section 5.1, PCAA is the only attack that achieves the highest (or close to highest) attack success rate while maintaining a low computational cost. These advantages allow us to further devise an adversarial training method (PAdvT) that repeatedly (by mini-batches) generates adversarial examples using PCAA. Empirically, PAdvT achieves promising robustness on different datasets over various attacks. For example, on the AG's News dataset, we outperform representative baselines with significant margins, with approximately 10% improvement in model robustness compared to the best defense baseline; on the IMDB dataset, we obtain performance comparable to ASCC-defense (Dong et al., 2021a), a state-of-the-art defense method via embedding-space adversarial training. However, unlike ASCC, our method does not rely on the assumption that similar words have a close distance in the embedding space. Compared to all other defenses, ours achieves better robustness by more than 15%.
Our main contributions can be summarized below:
• We propose a time-efficient probabilistic attack method (PCAA) for models with categorical input.
• Based on PCAA, we devise a probabilistic adversarial training method to defend against categorical adversarial attacks.

2. RELATED WORK

Attacks on categorical data. The robustness of machine learning has risen in importance in recent years. On the one hand, evasion attacks, poisoning attacks, adversarial training, and other robustness problems with continuous input spaces have been well studied, especially in the image domain (Shafahi et al., 2018; Madry et al., 2017; Ilyas et al., 2019). On the other hand, adversarial attacks on discrete input data with categorical features, such as text, are also starting to catch the attention of researchers. Kuleshov et al. (2018) discussed the problems of attacking text data and highlighted the importance of investigating discrete input data. Ebrahimi et al. (2017b) proposed to modify text tokens based on the gradient of the input one-hot vectors. Gao et al. (2018) developed a scoring function to select the most effective attack and a simple character-level transformation to replace projected gradient or multiple linguistic-driven steps on text data. Samanta & Mehta (2017) proposed an algorithm to generate meaningful adversarial samples that are legitimate in the text domain.
Defenses on categorical data. Compared with the discrete domain, there are many more defense methods in the continuous data domain. For example, the most effective one is adversarial training (Madry et al., 2017), which searches for the worst case by PGD during training and trains on the worst adversarial examples to increase the robustness of the model. Several works have been proposed for categorical adversarial defenses. Pruthi et al. (2019) used a word recognition model to preprocess the input data and increase the robustness of downstream tasks. Zhou et al. (2020) proposed to use randomized smoothing to defend against substitution-based attacks. Wang et al. (2021) detected adversarial examples to defend against synonym substitution attacks, based on the fact that word replacement can destroy mutual interaction.
Swenor & Kalita (2022) used random noise to increase model robustness. Xie et al. (2022) used a huge amount of data to detect adversarial examples. Dong et al. (2021b) combined different attack methods, such as Hot-Flip (Ebrahimi et al., 2017a) and the l2-attack (Miyato et al., 2017), with adversarial training as the defense method. It also proposed its own adversarial method, ASCC, which takes the solution space as a convex hull of word vectors and achieves good performance in sentiment analysis and natural language inference. However, these defense methods mainly focus on NLP and may rely on word embeddings.

3. PROBABILISTIC CATEGORICAL ADVERSARIAL ATTACK (PCAA)

3.1. PROBLEM SETUP

Categorical Attack. We first introduce the necessary definitions and notations. We consider a classifier f that predicts labels y ∈ Y based on categorical inputs x ∈ X. Each input sample x contains n categorical features, and each feature x_i takes a value from d categorical values. In this paper, we consider the l0-attack, which restricts the number of perturbed features. In particular, given the budget size ϵ, we aim to find an adversarial example x′ which solves the following optimization problem:

max L(f(x′), y)  s.t.  ∥x′ − x∥_0 ≤ ϵ  (1)

Existing search-based methods. To solve this problem, there exist search-based methods such as Greedy Search (Yang et al., 2020a) and Gradient-Guided Search (Lei et al., 2019a). In general, they consist of two major steps. First, they search for the features in x whose change results in the greatest influence on the model output. This can be estimated either by perturbing a feature x_i or by calculating the scale of the model's gradient with respect to x_i. The second step involves greedy or global searches among all possible combinations of the features identified in the first step. This process is time-consuming if the feature dimension n or the number of categories d is large.
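As a concrete reference point, the two-stage greedy search just described can be sketched as follows. This is only an illustrative sketch: the encoding of an input as a list of category indices, the name `greedy_search_attack`, and the `loss_fn(x, y)` interface are our own assumptions, not the baselines' actual implementation.

```python
import itertools

def greedy_search_attack(x, y, loss_fn, num_categories, budget):
    """Two-stage Greedy Search (GS) sketch: score features by their maximal
    single-feature loss change, then exhaustively try all category
    combinations for the top-`budget` features."""
    n = len(x)
    base_loss = loss_fn(x, y)
    # Stage 1: impact score of feature i = largest loss increase obtained by
    # replacing its category with one of the other d - 1 categories.
    scores = []
    for i in range(n):
        best_change = 0.0
        for c in range(num_categories):
            if c == x[i]:
                continue
            x_pert = list(x)
            x_pert[i] = c
            best_change = max(best_change, loss_fn(x_pert, y) - base_loss)
        scores.append(best_change)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:budget]
    # Stage 2: search all d^budget combinations over the selected features.
    best_x, best_loss = list(x), base_loss
    for combo in itertools.product(range(num_categories), repeat=len(top)):
        x_pert = list(x)
        for i, c in zip(top, combo):
            x_pert[i] = c
        l = loss_fn(x_pert, y)
        if l > best_loss:
            best_x, best_loss = x_pert, l
    return best_x, best_loss
```

Stage 2 is exactly the d^ϵ enumeration that the complexity analysis in Section 3.4 identifies as the bottleneck.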

3.2. OBJECTIVE OF PROBABILISTIC CATEGORICAL ADVERSARIAL ATTACK (PCAA)

Probabilistic Categorical Attack. In this work, we propose an alternative approach to solve the problem in Eq. (1) by transferring it to a continuous probabilistic space. In detail, we assume that each feature x′_i of the (adversarial) categorical data follows a categorical distribution Categorical(π_i), where π_i ∈ Π_i = (0, 1)^d and each π_{i,j} represents the probability that feature i belongs to category j. In the remainder of the paper, we use π_i to denote the categorical distribution Categorical(π_i) without loss of generality. Therefore, the distribution of each input sample x can be represented as an array π = [π_1; π_2; ...; π_n] ∈ Π ⊂ R^{n×d}, where the element π_{i,j} represents the probability that feature x_i belongs to category j. Then, we define the following optimization problem to find a probability distribution π in the space Π:

max_{π∈Π} E_{x′∼π} L(f(x′), y),  s.t.  Pr_{x′∼π}(∥x′ − x∥_0 ≥ ϵ) ≤ δ  (2)

where ϵ denotes the perturbation budget size and δ is the tail probability constraint. By solving the problem in Eq. (2), we aim to find a distribution with parameter π such that: (1) on average, the generated samples x′ following the distribution π have a high loss value; and (2) only with low probability does a sample x′ have an l0 distance to the clean sample x larger than ϵ. In this way, the generated samples x′ are likely to mislead the model prediction while preserving most features of x.
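The tail probability in the constraint of Eq. (2) can be estimated by Monte-Carlo sampling from the factorized categorical distribution. The sketch below is only illustrative (the `pi` array layout, the helper name, and the sample count are our own assumptions); it is not part of the attack itself, but shows what the constraint measures.

```python
import numpy as np

def tail_probability(pi, x, eps, num_samples=10000, seed=0):
    """Monte-Carlo estimate of Pr_{x'~pi}(||x' - x||_0 >= eps) for a
    factorized categorical distribution pi (shape n x d, rows sum to 1)
    and a clean input x given as n category indices."""
    rng = np.random.default_rng(seed)
    n, d = pi.shape
    # Sample each feature independently from its categorical distribution.
    samples = np.stack(
        [rng.choice(d, size=num_samples, p=pi[i]) for i in range(n)], axis=1
    )
    l0 = (samples != np.asarray(x)).sum(axis=1)  # number of changed features
    return (l0 >= eps).mean()
```

For a π concentrated on the clean categories, this estimate is near zero, which is exactly the regime the constraint Pr(∥x′ − x∥_0 ≥ ϵ) ≤ δ enforces.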

3.3. AN EFFICIENT ALGORITHM OF PCAA

In this subsection, we provide a feasible algorithm to solve Eq. (2) in practice. First, to handle the probability constraint in Eq. (2), we replace the l0 distance between x′ and x with the sum of cross-entropy losses between π_i and x_i, L_CE(π_i, x_i), over all features i ∈ [n]. This is because L_CE(π_i, x_i) = −log π_{i,x_i}, which is small when a sample drawn from π_i is likely to keep the clean category x_i and grows as the i-th feature becomes likely to differ from x_i. Therefore, we use the sum of cross-entropies Σ_{i∈[n]} L_CE(π_i, x_i) to approximate the number of changed features, i.e., the l0 difference ∥x′ − x∥_0. In our algorithm, we penalize the searched π when the term Σ_{i∈[n]} L_CE(π_i, x_i) exceeds a positive value ζ (as in Eq. (3)). In this way, we effectively limit the probability that a generated sample x′ has more than ϵ perturbed features:

Pr_{x′∼π}(∥x′ − x∥_0 ≥ ϵ) → [Σ_{i∈[n]} L_CE(x_i, π_i) − ζ]_+  (3)

Moreover, since the cross-entropy loss is differentiable in π, we further transform the problem into its Lagrangian form:

max_π E_{x′∼π} L(f(x′), y) − λ [Σ_{i∈[n]} L_CE(x_i, π_i) − ζ]_+  (4)

where λ is the penalty coefficient and [·]_+ = max(·, 0). Next, we show how to solve the maximization problem above by applying gradient methods.
Back-propagation through Gumbel-Softmax. Note that the gradient of the expected loss with respect to π cannot be directly calculated in Eq. (4), so we apply the Gumbel-Softmax estimator (Jang et al., 2017). In practice, to avoid the projection onto the probability simplex, which hurts time efficiency (Gumbel-Softmax does not require a normalized probability distribution), we consider an unnormalized categorical distribution π_i ∈ (0, C]^d, where C > 0 is a large constant guaranteeing that the search space is sufficiently large.
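The hinge penalty of Eq. (3) can be written in a few lines. In this sketch we assume `pi` stores normalized per-feature probabilities (shape n × d) and `x` stores the clean category indices; both the encoding and the function name are our own illustration.

```python
import numpy as np

def l0_penalty(pi, x, zeta):
    """Hinge penalty from Eq. (3): sum of per-feature cross-entropies
    between pi_i and the clean category x_i, minus the slack zeta,
    clipped at zero. A small eps guards against log(0)."""
    eps = 1e-12
    # CE(pi_i, x_i) = -log pi_{i, x_i} for a one-hot target x_i.
    ce = -np.log(pi[np.arange(len(x)), x] + eps)
    return max(0.0, ce.sum() - zeta)
```

When π places all mass on the clean categories the penalty vanishes, and it grows as mass shifts away from x, mirroring the l0 distance it approximates.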
The distribution generates sample vectors x′_i as follows:

x′_{i,j} = exp((log π_{i,j} + g_j)/τ) / Σ_{k=1}^{d} exp((log π_{i,k} + g_k)/τ),  for j = 1, ..., d  (5)

where the g_j are i.i.d. samples from the Gumbel(0, 1) distribution and τ denotes the softmax temperature. This re-parameterization allows us to calculate the gradient of the expected loss with respect to π. Therefore, we can derive the following estimator of the gradient of the expected loss:

∂E_{x′∼π} L(f(x′), y)/∂π ≈ ∂E_g L(f(x′(π, g)), y)/∂π = E_g[(∂L/∂x′)(∂x′/∂π)] ≈ (1/n_g) Σ_{i=1}^{n_g} (∂L/∂x′)(∂x′(π, g_i)/∂π)  (6)

where n_g is the number of i.i.d. samples of g. In Eq. (6), the first approximation comes from the re-parameterization of a sample x′; the second step comes from exchanging the order of expectation and derivative; and in the third, we approximate the expectation of the gradient by the average of sampled gradients. Finally, we derive the practical solution to Eq. (4), as demonstrated in Figure 1, by leveraging a gradient ascent algorithm such as Madry et al. (2018). In Algorithm 1, we provide the details of our proposed attack method. Specifically, in each step, we first update the unnormalized distribution π by gradient ascent and then clip π back to its domain (0, C]^d.

Algorithm 1: Probabilistic Categorical Adversarial Attack (PCAA)
Input: data D, budget ϵ, number of samples n_g, penalty coefficient λ, maximum iteration I, learning rate γ
Output: adversarial distribution π
Initialize distribution π_0
for t = 0, ..., I − 1 do
    Estimate the expected gradient using Eq. (6): ∇_π E_π L ≈ (1/n_g) Σ_{i=1}^{n_g} (∂L/∂x′)(∂x′(π_t, g_i)/∂π)
    Gradient ascent: π_{t+1} = π_t + γ · ∇_π (E_π L − λ [L_CE(π_t, x) − ζ]_+)
    Clip π_{t+1} element-wise back to (0, C]^d
end for
Return distribution π_I
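The re-parameterized sampling step of Eq. (5) can be sketched as below. The function name and array layout are our own; a real implementation would compute this inside a differentiable framework so that the relaxed one-hot sample x′ can be fed into f and back-propagated through, rather than in plain NumPy.

```python
import numpy as np

def gumbel_softmax_sample(log_pi, tau, rng):
    """Draw one relaxed one-hot sample per feature via the Gumbel-Softmax
    trick of Eq. (5). `log_pi` has shape (n, d); each output row lies on
    the probability simplex and approaches a hard one-hot as tau -> 0."""
    g = rng.gumbel(size=log_pi.shape)        # i.i.d. Gumbel(0, 1) noise
    z = (log_pi + g) / tau
    z = z - z.max(axis=-1, keepdims=True)    # stabilize the softmax
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

Averaging the loss gradient over n_g such samples gives exactly the estimator in Eq. (6).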

3.4. TIME COMPLEXITY ANALYSIS

In this subsection, we compare the time complexity of our method with that of the search-based methods to illustrate its efficiency advantage. We assume that the dataset has N data points, each data point has n features, each feature has d categories, and the allowed perturbation budget is ϵ. We first introduce the details of four search-based methods.
• Greedy Search (GS). It consists of two stages. The first stage traverses all features. For the i-th feature, it replaces the original category with each of the other d − 1 categories and records the change in the model loss for each. The largest change is treated as the impact score of the i-th feature. It then selects the top-ϵ features with the highest impact scores to perturb. In the second stage, it finds the combination with the greatest loss among all possible category combinations for the selected features.
• Greedy Attack (GA) (Yang et al., 2020a). This method is a modified version of Greedy Search. The first stage is similar to that of GS, while the second stage searches for the best perturbation feature by feature. For the i-th selected feature, it replaces the original category with the one that results in the largest loss and then moves on to the next selected feature until all selected features are traversed. Therefore, it only needs to go over each feature once, without traversing all combinations in the second stage.
• Gradient-Guided GS (GGS) (Lei et al., 2019a). To determine which features should be perturbed, this method utilizes gradient information in the first stage. It computes the gradient of the loss function w.r.t. the original input and treats the gradient magnitude of each feature as its impact score. The ϵ features with the greatest impact scores are selected to be perturbed.
In the second stage, it follows the same strategy as the second stage of GS, pursuing the combination with the greatest loss among all possible combinations.
• Gradient-Guided Greedy Attack (GGA). Based on GGS, it keeps the same first stage and modifies the second stage by adopting the same strategy as the second stage of GA.

We now analyze the time complexity of PCAA and the aforementioned methods for comparison.
Probabilistic Categorical Adversarial Attack. In the practical implementation, we adopt the unconstrained optimization in Eq. (4), so the time complexity comes only from gradient descent. Assume that we draw n_s samples when estimating the expected gradient and that the maximum number of iterations is I. We need to compute the gradient n_s times per iteration, so the time complexity of PCAA is

N · n_s · O(nd) · I = C_1 · N · O(nd)  (7)

where C_1 is a constant related to n_s and I.
Greedy Search. As described above, Greedy Search consists of two stages. In the first stage, it traverses all n features and computes a loss when plugging in each category, requiring n · d calculations. In the second stage, all possible category combinations for the top-ϵ features must be considered, and the number of combinations is d^ϵ. Therefore, the time complexity of Greedy Search is

N · [Ω(stage 1) + Ω(stage 2)] = N · [O(nd) + O(d^ϵ)] = N · O(nd + d^ϵ)  (8)

Greedy Attack. In the first stage, it traverses all categories within each feature to find the best ϵ features to perturb, which needs n · d computations. In the second stage, it searches each of the ϵ selected features for the category (out of d) that leads to the greatest loss. Thus, the time complexity is

N · [Ω(stage 1) + Ω(stage 2)] = N · [O(nd) + O(ϵd)] = N · O(nd + ϵd)  (9)

Gradient-Guided Greedy Search.
In the first stage, it computes gradients w.r.t. the original input; in the second stage, it traverses all possible combinations for the selected features. Therefore, the time complexity is

N · [Ω(stage 1) + Ω(stage 2)] = N · [O(n) + O(d^ϵ)] = N · O(n + d^ϵ)  (10)

Gradient-Guided Greedy Attack. This method combines the first stage of GGS with the second stage of GA. Thus, its time complexity is

N · [Ω(stage 1) + Ω(stage 2)] = N · [O(n) + O(ϵd)] = N · O(n + ϵd)  (11)

From the above analysis, the two search-based methods (Eqs. 8 and 10) suffer from an exponential increase in time complexity as the number of feature categories d and the budget size ϵ grow. The two greedy-based methods (GA and GGA) accelerate the second stage and achieve better time efficiency, but sacrifice performance because they greatly narrow the search space. PCAA is free of these problems: it achieves time efficiency comparable to the greedy-based methods while remaining comparable in performance to the search-based methods. We provide further evidence in the experimental section.
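To make the asymptotic comparison concrete, the leading-order operation counts per data point can be tabulated directly from the formulas above. The values of n_s and I below are illustrative placeholders, not the hyper-parameters used in our experiments.

```python
def op_counts(n, d, eps, n_s=10, iters=50):
    """Leading-order per-data-point operation counts from the complexity
    analysis: n features, d categories per feature, budget eps."""
    return {
        "PCAA": n_s * iters * n * d,   # C1 * O(nd)
        "GS":   n * d + d ** eps,      # O(nd + d^eps)
        "GA":   n * d + eps * d,       # O(nd + eps*d)
        "GGS":  n + d ** eps,          # O(n + d^eps)
        "GGA":  n + eps * d,           # O(n + eps*d)
    }
```

With IPS-like values (n = 20, d = 1103, ϵ = 3), the d^ϵ term in GS and GGS is about 1.3 × 10^9, while PCAA's count is n_s · I · n · d ≈ 1.1 × 10^7, illustrating why GS and GGS become infeasible as d and ϵ grow.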

4. PROBABILISTIC ADVERSARIAL TRAINING (PADVT)

min_θ [ max_π E_{x′∼π} ( L(f(x′; θ), y) − λ [Σ_{i∈[n]} L_CE(x_i, π_i) − ζ]_+ ) ]  (12)

Since our objective involves a penalty coefficient, we adopt the strategy of Yurochkin et al. (2020) to update λ during training. We adaptively choose λ according to the value of L_CE(x, π) − ζ from the last iteration: when it is large, we increase λ to strengthen the constraint, and vice versa:

λ = max{0, λ − α(ζ − (1/m) Σ_{i∈[m]} Σ_{j∈[n]} L_CE(x_{i,j}, π_{i,j}))}

where x_{i,j} denotes the j-th feature of the i-th sample in a mini-batch of size m.
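The adaptive λ update above is a one-line rule; the sketch below states it as a function (the argument names are our own) so its behavior is easy to check: λ increases when the average cross-entropy penalty exceeds ζ, decreases otherwise, and is never allowed to go negative.

```python
def update_lambda(lam, ce_sum_avg, zeta, alpha):
    """Adaptive penalty update (after Yurochkin et al., 2020):
    lam        -- current penalty coefficient lambda
    ce_sum_avg -- mini-batch average of sum_j L_CE(x_j, pi_j)
    zeta       -- slack on the cross-entropy budget
    alpha      -- step size for the dual update."""
    return max(0.0, lam - alpha * (zeta - ce_sum_avg))
```

This is effectively a projected gradient ascent step on the dual variable of the hinge constraint in Eq. (12).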

5. EXPERIMENT

In this section, we conduct experiments to validate the effectiveness and efficiency of PCAA and PAdvT. In Section 5.1, we demonstrate that PCAA achieves a better balance between attack success rate and time efficiency. In Section 5.2, we show that PAdvT achieves competitive or stronger robustness against categorical attacks across different tasks.

5.1. CATEGORICAL ADVERSARIAL ATTACKS

Experimental Setup. In this evaluation, we focus on three datasets. (1) Intrusion Prevention System (IPS) (Wang et al., 2020). The IPS dataset has 242,467 instances; each input consists of 20 features, and each feature has 1,103 categorical values. The output space has three labels. A standard LSTM-based classifier (Bao et al., 2022) is trained on the IPS dataset. (2) AG's News corpus. This dataset consists of titles and description fields of news articles. The tokens of each sentence correspond to the categorical features, and the substitution set (of size 70) corresponds to the categorical values. A character-based CNN (Zhang et al., 2015) is trained on this dataset. (3) Splice-junction Gene Sequences (Splice) (Noordewier et al., 1990). The Splice dataset has 3,190 instances, each a gene sequence whose features take one of 5 categorical values.
Baseline Attacks. Our goal is to generate powerful attacks on categorical models directly in the input space, without relying on an embedding space; thus, search-based methods are the most suitable baselines in this setting. We therefore compare PCAA with the following search-based attacks: Greedy Search (GS), Greedy Attack (GA), Gradient-Guided GS (GGS), and Gradient-Guided Greedy Attack (GGA). The details of these baselines can be found in Section 3.4. For each dataset, we evaluate performance in terms of the attack success rate (SR.) and the average running time (T.) under budget sizes ϵ ranging from 1 to 5. Since PCAA learns the adversarial categorical distribution, the generation of adversarial examples is based on sampling. In the evaluation, we sample 100 examples from the adversarial distribution for each attack instance and discard those with more perturbed features than the budget size. We count an attack as successful if any one of the samples changes the prediction. The average running time for PCAA includes both the optimization of Eq. (4) and the sampling process. Performance Comparison.
The experimental results on the IPS, AG's News, and Splice datasets are shown in Table 1. (1) On the IPS dataset, where each feature has more than 1,000 categories, our method has a significant advantage. As the budget size increases, GS and GGS run for so long that they are no longer feasible. While GA and GGA remain efficient, our method outperforms them by significant margins, e.g., over 10% higher success rate than GGA and over 7 times faster than GA. (2) On AG's News, GS has the highest success rate with small budgets. However, PCAA has much better time efficiency than GS under all budgets while maintaining competitive performance and outperforming all the other attacks. (3) On Splice, there are only 5 categories for each feature, which corresponds to the low-dimensional case. The advantage of our method is not as great as in the other cases (i.e., IPS and AG) since the running time of the greedy methods is acceptable. However, PCAA still generates strong attacks, with a success rate only 6% below that of GS, and remains highly efficient as the budget size increases, so it still provides a practical and competitive attack.
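The sampling-based evaluation protocol described in the setup (draw 100 samples from the adversarial distribution, discard those exceeding the budget, count success if any remaining sample flips the prediction) can be sketched as follows. The `predict` interface and array encodings are illustrative assumptions, not our actual evaluation harness.

```python
import numpy as np

def sampled_attack_success(pi, x, y, predict, budget, num_samples=100, seed=0):
    """Evaluation protocol sketch for PCAA: sample from the adversarial
    distribution pi (shape n x d, rows sum to 1), discard over-budget
    samples, and declare success if any sample changes the prediction."""
    rng = np.random.default_rng(seed)
    n, d = pi.shape
    for _ in range(num_samples):
        x_adv = np.array([rng.choice(d, p=pi[i]) for i in range(n)])
        if (x_adv != np.asarray(x)).sum() > budget:
            continue  # dismiss samples with more perturbed features than budget
        if predict(x_adv) != y:
            return True
    return False
```

The reported success rate is then the fraction of attack instances for which this returns True.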

Time Efficiency Comparison.

It is evident from Table 1 that PCAA's average running time is unaffected by the budget size, whereas the greedy methods become clearly more time-consuming as budgets increase. Thus, the proposed PCAA method significantly improves time efficiency. As shown in Table 1, when the substitution set is large (e.g., IPS and AG), the running times of GS and GGS explode even under a moderate budget size such as ϵ = 3. Since these running times are too long for meaningful comparison, we do not record the performance of those baselines under such budget sizes. For small substitution sets, i.e., Splice, the time-efficiency gap between PCAA and the greedy methods is not large: due to the small substitution set, the greedy methods need far fewer queries than in the high-dimensional cases, but PCAA still takes much less time for budget sizes above 4.

5.2. CATEGORICAL ADVERSARIAL DEFENSES

Experimental Setup. For the defense evaluation, we focus on two datasets, AG's News corpus and IMDB, because IPS and Splice have too few samples (no more than 1,000 each). In particular: (1) AG's News corpus is the same dataset used in the attack evaluation, and the model is also a character-based CNN. Since character swapping does not require embeddings for each character, we can apply attack methods in the input space. Therefore, the robustness of the defense models is evaluated using six attacks, i.e., Hot-Flip (Ebrahimi et al., 2017a), GS, GA, GGS, GGA, and PCAA. (2) IMDB reviews dataset (Maas et al., 2011). On this dataset, we focus on a word-level classification task and study two model architectures, Bi-LSTM and CNN, trained for prediction. To evaluate robustness, four attacks are deployed: a genetic attack (Alzantot et al., 2018) (an attack method proposed to generate adversarial examples in the embedding space), as well as GS, GGS, and PCAA.
Baseline Defenses. We compare our PAdvT with the following baseline defenses:
• Standard training.
It minimizes the average cross-entropy loss on clean input.
• Hot-Flip (Ebrahimi et al., 2017a). It uses the gradient with respect to the one-hot input representation to find the single-feature perturbation with the highest estimated loss. It was initially proposed to model character flips in a Char-CNN model, and we also apply it to word-level substitution, as in (Dong et al., 2021a).
• Adv l2-ball (Miyato et al., 2017). It uses an l2 PGD adversarial attack inside the word embedding space for adversarial training.
• ASCC-Defense (Dong et al., 2021a). A state-of-the-art defense method in text classification. It uses the worst perturbation over a sparse convex hull in the word embedding space for adversarial training.
Performance Comparison. The experimental results on the AG's News dataset are shown in Fig. 2. The Y-axis represents the error rate, the X-axis represents the different defense methods, and the rows represent different attack methods. Our defense achieves leading robustness on Char-CNN over all attacks with significant margins, surpassing Hot-Flip defense by 10%. Fig. 3 illustrates the defense results on the IMDB dataset, where we make similar observations as in Fig. 2. Our PAdvT shows adversarial robustness competitive with ASCC-defense. Notably, ASCC conducts adversarial training in the word-embedding space and relies on the key assumption that similar words are close in that space. Our method does not rely on this assumption, which may explain why its performance is competitive with (slightly worse than) ASCC. It significantly outperforms all other defenses across different architectures.

5.3. ABLATION STUDY

Concentration of PCAA. To further understand the behavior of our attack algorithm, we ask: what is the variance of the optimized probability distribution π* (from solving Eq. (2))? Intuitively, we want the distribution π to have small variance, so that we do not need many samples to obtain the optimal adversarial examples. To confirm this point, we conduct an ablation study on the IPS dataset and visualize the distribution π optimized by PCAA under various budget sizes. In Fig. 4, we choose three budget sizes, ϵ = 1, 3, 5, and four features to present the adversarial categorical distributions. The left two columns of Fig. 4 are features that are not perturbed; the right two columns are features that will be perturbed by PCAA. From the figure, we can see that for each feature (perturbed or unperturbed), there exists one category with a much higher probability than the others. This indicates that during the sampling process of PCAA, the samples are highly likely to share the same category for a given feature. As a result, we confirm that our sampled adversarial examples are well concentrated.
Ablation analysis for PAdvT. In our training objective in Eq. (12), ζ controls the budget size used for adversarial training and can affect the robustness of the model. We conduct an ablation study on the IMDB dataset to understand the impact of ζ. The results are shown in Table 2. As ζ increases, the success rates of all attacks decrease, meaning that the robustness of the model is enhanced. However, a large ζ increases the clean error and sacrifices the clean accuracy of the model. Thus, ζ controls the balance between clean accuracy and adversarial robustness. When ζ = 0.4, our algorithm reaches a good balance between model accuracy and robustness.

6. CONCLUSION

In this paper, we propose a novel probabilistic optimization formulation for attacking models with categorical input. Our framework significantly improves the time efficiency compared with greedy methods and shows promising empirical performance across different datasets. Furthermore, we design an adversarial training method based on the proposed probabilistic attack that achieves better robustness.

A APPENDIX

A.1 ADDITIONAL EXPERIMENTAL RESULTS

In this subsection, we present the results of Figs. 2 and 3 in tabular form, where the values represent error rates. We run each experiment 5 times and report 95% confidence intervals. We also run PAdvT on a mixture of adversarial examples and clean samples on the IMDB dataset. Results are shown in Table 5, where the values represent error rates. Notably, mixing in clean samples slightly improves clean performance and leads to a small decrease in robustness.



Figure 1: The overall framework of Probabilistic Categorical Adversarial Attack. First, a distribution π is used to sample adversarial examples x ′ . Then, the gradient of loss function will be used to update the probability distribution π.

Figure 3: PAdvT and baseline defense performances under different attacks on IMDB dataset.

Adversarial training (Szegedy et al., 2014; Goodfellow et al., 2015; Madry et al., 2018) is one of the most effective methods for building robust models. It incorporates adversarial examples during training to improve robustness. However, greedy methods have limited applicability to adversarial training for categorical input models due to their slow speed, whereas the proposed PCAA significantly improves time efficiency. We therefore propose Probabilistic Adversarial Training (PAdvT) based on PCAA to train robust models on categorical data. Recalling the formulation of PCAA in Eq. (4) and denoting the parameters of the classifier f by θ, the training objective of PAdvT is formulated as

The implementation of PAdvT is illustrated in Algorithm 2. It first solves the inner maximization problem by applying Algorithm 1. Then, it samples n_adv examples {x′^i_1, ..., x′^i_{n_adv}} from each π^i using the Gumbel-Softmax trick. Finally, it applies Adam (Kingma & Ba, 2015) to solve the outer minimization problem, updating θ to minimize the average adversarial loss. The process continues until the maximal number of iterations is reached.

Attacking results on IPS, AG's news, and Splice datasets. SR. represents the attack success rate; T. represents the average running time in seconds; and "-" indicates the running time over 10 hours. Each experiment runs 5 times and 95% confidence intervals are shown.

Ablation study: impact of the budget regularization term ζ on PAdvT

PAdvT and baseline defense performances under different attacks on IMDB dataset

PAdvT and baseline defense performances under different attacks on AG's news dataset

Comparison of PAdvT on the IMDB dataset with/without a mixture of clean samples.

We run the PCAA attack on the IMDB dataset against two victim models, LSTM and word-CNN. The candidate sets are pre-specified synonym sets. Tables 6 and 7 show some successful adversarial examples, where the red words are adversarial words and the blue ones are original words. These replacements do not hurt the semantic meaning but can fool the classifiers.

The film expects(looks) great and the animation is sometimes jaw dropping. The film isn't too terribly original. It's basically a modern take on kurosawa's seven samurai only with bugs, I enjoyed the character interaction however, and the naughty boys(bad guys) in this film actually seemed bad. It seems that Disney usually makes their bad guys carbon copy cut outs, the grasshoppers are menacing and hopper the lead bad guy was a brilliant creation. Check this one out.

IMDB Adversarial Examples from PCAA on word-CNN. Each entry lists the original class, the perturbed class, and the perturbed text.

Positive → Negative: I am a college student studying A levels and need help and comments from anyone who has any views at all about the theme of mothers in film. In The Mother, whether you have gone through something similar or just want to comment and help me research more about this film, any comment would much greatly appreciated. The comments will be used alone(solely) for exam purposes and will be included in my written exam. So if you have any views at all I'm convinced(sure) I can put them to use and you could help me get an A. I am also studying about a boy and tadpole. So if you have seen these films as well, I would appreciate it if you could leave comments on here on that page. Thank you.

Negative → Positive: This movie is so horrendous(awful). It is hard to find the right words to describe it. At first the story is so ridiculous. A narrow minded human can write a better plot. The actors are boring and untalented. Perhaps they were compelled to play in this dorky(cheesy) film. The camera receptions of the national forest are the only good in this whole movie. I should feel ashame because I paid for this lousy picture. Hopefully nobody makes a sequel or make a similar film with such a worse storyline.

Positive → Negative: This movie is wonderful, the writing, directing, acting, all are marvelous(fantastic). Very witty and clever script quality performances by actors. Ally Sheedy is strong and dynamic and delightfully quirky really original and heart warmingly unpredicatable. The scenes are alive with fresh energy and really talented generating(production)

Positive → Negative: This may not be war peace but the two academy noms wouldn't have been forthcoming. If it weren't for the genius of James Wong Howe, this is one of the few films I've fallen in love with as a infant(child) and gone back to without dissatisfaction. Whether you have any interest in what it offers fictively or not, BBC is a visual feast. I'm not saying it's his best work. I'm no expert there for sure but the look of this movie is astounding(amazing). I love everything about it, Elsa Lanchester, the cat, the crazy hoodoo, the retro downtown Ness, but the way it was put on film is breathtaking. I even like the inconsistencies pointed out on this page aforementioned(above) and the special effects that seem backward. Now it all creates a really consistent world.

Positive → Negative: Bette Midler is again divine raunchily hilarious(humorous) in love with burlesque, capable of bringing you down to tears either with old jokes, with new dresses or merely with old songs, with more power punch than ever. All in all, sung(singing) new ballads power, singing the good old perennial ones such as the rose 'stay with me' and yes even 'wind beneath my wings'. The best way to appreciate the Divine Miss M has always been libe since this is the next best thing to it. I strongly recommended to all with a mixture of adult extensive(wide) eyed enchantment and appreciation and a child 's mischievous wish for pushing all boundaries.

