ON FAST ADVERSARIAL ROBUSTNESS ADAPTATION IN MODEL-AGNOSTIC META-LEARNING

Abstract

Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning. It learns a meta-initialization of model parameters (which we call the meta-model) that can rapidly adapt to new tasks using a small amount of labeled training data. Despite the generalization power of the meta-model, it remains elusive how adversarial robustness can be maintained by MAML in few-shot learning. In addition to generalization, robustness is also desired for a meta-model to defend against adversarial examples (attacks). Toward promoting adversarial robustness in MAML, we first study when a robustness-promoting regularization should be incorporated, given that MAML adopts a bi-level (fine-tuning vs. meta-update) learning procedure. We show that robustifying the meta-update stage is sufficient to make robustness transfer to the task-specific fine-tuning stage even if the latter uses a standard training protocol. We further justify the acquired robustness adaptation by examining the interpretability of neurons' activation maps. Furthermore, we investigate how robust regularization can efficiently be designed in MAML. We propose a general yet easily optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally light fine-tuning. In particular, we show for the first time that an auxiliary contrastive learning task can enhance the adversarial robustness of MAML. Finally, extensive experiments demonstrate the effectiveness of our proposed methods in robust few-shot learning. Code is available at https://github.com/wangren09/MetaAdv.

1. INTRODUCTION

Meta-learning, which can offer fast generalization adaptation to unseen tasks (Thrun & Pratt, 2012; Novak & Gowin, 1984), has widely been studied, from model- and metric-based methods (Santoro et al., 2016; Munkhdalai & Yu, 2017; Koch et al., 2015; Snell et al., 2017) to optimization-based methods (Ravi & Larochelle, 2016; Finn et al., 2017; Nichol et al., 2018). In particular, model-agnostic meta-learning (MAML) (Finn et al., 2017) is one of the most intriguing bi-level optimization-based meta-learning methods designed for fast-adapted few-shot learning. That is, the learnt meta-model can rapidly be generalized to unforeseen tasks with only a small amount of data. It has successfully been applied to use cases such as object detection (Wang et al., 2020), medical image analysis (Maicas et al., 2018), and language modeling (Huang et al., 2018). In addition to generalization ability, recent works (Yin et al., 2018; Goldblum et al., 2019; Xu et al., 2020) investigated MAML from another fundamental perspective, adversarial robustness, given by the capability of a model to defend against adversarially perturbed inputs (known as adversarial examples/attacks) (Goodfellow et al., 2014; Xu et al., 2019b). The lack of robustness of deep learning (DL) models has gained increasing interest and attention, and there exists an active arms race between adversarial attack and defense; see overviews in (Carlini et al., 2019; Hao-Chen et al., 2020). Many defensive methods exist in the context of standard model training, e.g., (Madry et al., 2017; Zhang et al., 2019b; Wong et al., 2020; Carmon et al., 2019; Stanforth et al., 2019; Xu et al., 2019a); however, to the best of our knowledge few works have studied robust MAML except (Yin et al., 2018; Goldblum et al., 2019).
Tackling such a problem is more challenging than robustifying standard model training, since MAML follows a bi-level learning procedure in which the meta-update step (outer loop) optimizes a task-agnostic initialization of model parameters while the fine-tuning step (inner loop) learns a task-specific model instantiation updated from this common initialization. Thus, it remains elusive when (namely, at which learning stage) and how robust regularization should be promoted to strike a graceful balance between generalization/robustness and computational efficiency. Note that neither standard MAML (Finn et al., 2017) nor standard robust training (Madry et al., 2017; Zhang et al., 2019b) is as easy as normal training. Besides the algorithmic design of robust MAML, it is also important to provide an in-depth explanation and analysis of why adversarial robustness can efficiently be gained in MAML. In this work, we revisit the problem of adversarial robustness in MAML (Yin et al., 2018; Goldblum et al., 2019) and give affirmative answers to the above questions of when, how, and why.

Contributions. Compared to the existing works (Yin et al., 2018; Goldblum et al., 2019), we make the following contributions:

• Given that MAML is formed as a bi-level learning procedure, we show and explain why regularizing adversarial robustness at the meta-update level alone is sufficient to offer fast and effective robustness adaptation on few-shot test tasks.

• Given that either MAML or robust training alone is computationally intensive, we propose a general yet efficient robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast (one-step) adversarial example generation during meta-updating, and partial model training during fine-tuning (fine-tuning only the classifier's head).
• We show for the first time that the use of unlabeled data augmentation, in particular introducing an auxiliary contrastive learning task, provides additional benefits to the adversarial robustness of MAML in the low-data regime: a 2% robust accuracy improvement and a 9% clean accuracy improvement over the state-of-the-art robust MAML method (named adversarial querying) of (Goldblum et al., 2019).

Related work

To train a standard model (instead of a meta-model), the most effective robust training methods include adversarial training (Madry et al., 2017), TRADES, which places a theoretically grounded trade-off between accuracy and robustness (Zhang et al., 2019b), and their many variants such as fast adversarial training methods (Shafahi et al., 2019; Zhang et al., 2019a; Wong et al., 2020; Andriushchenko & Flammarion, 2020), semi-supervised robust training (Carmon et al., 2019; Stanforth et al., 2019), adversarial transfer learning, and certifiably robust training (Wong & Kolter, 2017; Dvijotham et al., 2018). Moreover, recent works (Hendrycks et al., 2019; Chen et al., 2020a; Shafahi et al., 2020; Chan et al., 2020; Utrera et al., 2020; Salman et al., 2020) studied the transferability of robustness in the context of transfer learning and representation learning. However, the aforementioned standard robust training methods are not directly applicable to MAML in few-shot learning given MAML's bi-level optimization nature. A few recent works studied the problem of adversarial training in the context of MAML (Goldblum et al., 2019; Yin et al., 2018). Yin et al. (2018) considered robust training in both the fine-tuning and meta-update steps, which is unavoidably computationally expensive and difficult to optimize. The work most relevant to ours is (Goldblum et al., 2019), which proposed adversarial querying (AQ) by integrating adversarial training with MAML. Similar to ours, AQ attempted to robustify the meta-update only to gain sufficient robustness. However, it lacks an explanation of the rationale behind that. We will show that AQ can also be regarded as a special case of our proposed robustness-promoting MAML framework. Most importantly, we make a more in-depth study, with novelties summarized in Contributions.
Another line of research relevant to ours is efficient MAML, e.g., (Raghu et al., 2019; Song et al., 2019; Su et al., 2019), where the goal is to improve the computational efficiency and/or the generalization of MAML. In (Song et al., 2019), gradient-free optimization was leveraged to alleviate the need for second-order derivatives during meta-update. In (Raghu et al., 2019), MAML was simplified by removing the fine-tuning step over the representation block of a meta-model; it was shown that such a simplification is surprisingly effective without losing generalization ability. In (Su et al., 2019), a self-supervised representation learning task was added to the meta-update objective, yielding a meta-model with improved generalization. Although useful insights into MAML were gained in the aforementioned works, none of them took adversarial robustness into account.

2. PRELIMINARIES AND PROBLEM STATEMENT

In this section, we first review model-agnostic meta-learning (MAML) (Finn et al., 2017) and adversarial training (Madry et al., 2017). We then motivate the setup of robustness-promoting MAML and demonstrate its design challenges when integrating MAML with robust regularization.

MAML. MAML attempts to learn an initialization of model parameters (namely, a meta-model) so that a new few-shot task can quickly and easily be tackled by fine-tuning this meta-model over a small amount of labeled data. The characteristic signature of MAML is its bi-level learning procedure, where the fine-tuning stage forms a task-specific inner loop while the meta-model is updated at the outer loop by minimizing the validation error of fine-tuned models over cumulative tasks. Formally, consider N few-shot learning tasks {T_i}_{i=1}^N, each of which has a fine-tuning data set D_i and a validation set D'_i, where D_i is used in the fine-tuning stage and D'_i is used in the meta-update stage. Here the superscript (') is reserved to indicate operations/parameters at the meta-update stage. MAML is then formulated as the following bi-level optimization problem (Finn et al., 2017):

  minimize_w  (1/N) Σ_{i=1}^N ℓ'_i(w'_i; D'_i)
  subject to  w'_i = argmin_{w_i} ℓ_i(w_i; D_i, w),  ∀i ∈ [N],   (1)

where w denotes the meta-model to be designed, w'_i is the T_i-specific fine-tuned model, ℓ'_i(w'_i; D'_i) represents the validation error using the fine-tuned model, ℓ_i(w_i; D_i, w) denotes the training error when fine-tuning the task-specific model parameters w_i from the task-agnostic initialization w, and for ease of notation, [K] denotes the integer set {1, 2, . . . , K}. In (1), the objective function and the constraint correspond to the meta-update stage and the fine-tuning stage, respectively.
The bi-level optimization problem is challenging because each constraint calls an inner optimization oracle, which is typically instantiated as a K-step gradient descent (GD) solver:

  w_i^(k) = w_i^(k-1) − α ∇_{w_i} ℓ_i(w_i^(k-1); D_i, w),  k ∈ [K],  with w_i^(0) = w.

We note that even with this simplified fine-tuning step, updating the meta-model w still requires second-order derivatives of the objective function of (1) with respect to (w.r.t.) w.
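To make the inner loop concrete, here is a minimal numerical sketch of the K-step GD solver above on a toy quadratic task loss; the quadratic surrogate, task centers, and dimensions are illustrative choices of ours, not the paper's few-shot classifier.

```python
import numpy as np

# Toy sketch of MAML's inner loop: K steps of gradient descent on a
# task-specific quadratic loss l_i(w) = 0.5 * ||w - c_i||^2, starting
# from the shared meta-initialization w_meta.
def fine_tune(w_meta, c_i, alpha=0.1, K=5):
    w = w_meta.copy()
    for _ in range(K):
        grad = w - c_i            # gradient of 0.5 * ||w - c_i||^2
        w = w - alpha * grad      # w^(k) = w^(k-1) - alpha * grad
    return w

w_meta = np.zeros(3)              # meta-model (shared initialization)
w_task = fine_tune(w_meta, np.ones(3))
# Each step contracts the gap to the task optimum c_i by (1 - alpha),
# so after K steps w = c_i * (1 - (1 - alpha)^K). The meta-update then
# backpropagates the validation loss at w_task through these K steps,
# which is where the second-order derivatives arise.
```

On this quadratic surrogate the solver can be checked in closed form, which makes the role of α and K transparent.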

Adversarial training

The min-max optimization based adversarial training (AT) is known as one of the most powerful defense methods for obtaining a robust model against adversarial attacks (Madry et al., 2017). We summarize AT and its variants through the following robustness-regularized optimization problem:

  minimize_w  λ E_{(x,y)∈D}[ℓ(w; x, y)] + E_{(x,y)∈D}[ max_{‖δ‖_∞ ≤ ε} g(w; x + δ, y) ],   (2)

where ℓ(w; x, y) denotes the prediction loss evaluated at the point x with label y, λ ≥ 0 is a regularization parameter, δ denotes the input perturbation variable within the ℓ_∞-norm ball of radius ε, and g represents the robust loss evaluated at the model w on the perturbed example x + δ given the true label y; for ease of notation, we let R(w; D) denote the robust regularization function (the second term) for model w under the data set D. In the rest of the paper, we consider two specifications of R: (a) AT regularization (Madry et al., 2017), where we set g = ℓ and λ = 0; (b) TRADES regularization (Zhang et al., 2019b), where we define g as the cross-entropy between the distribution of prediction probabilities at the perturbed example (x + δ) and that at the original sample x.

Robustness-promoting MAML. Integrating MAML with AT is a natural solution to enhance the adversarial robustness of a meta-model in few-shot learning. However, this seemingly simple scheme is in fact far from trivial, and there exist three critical roadblocks as elaborated below. First, it remains elusive at which stage (fine-tuning or meta-update) robustness can most effectively be gained for MAML. Based on (1) and (2), we can cast this problem as a unified optimization problem that augments the MAML loss with the robust regularization under two degrees of freedom characterized by two hyper-parameters γ_out ≥ 0 and γ_in ≥ 0:

  minimize_w  (1/N) Σ_{i=1}^N [ ℓ'_i(w'_i; D'_i) + γ_out R'_i(w'_i; D'_i) ]
  subject to  w'_i = argmin_{w_i} [ ℓ_i(w_i; D_i, w) + γ_in R_i(w_i; D_i) ],  ∀i ∈ [N].   (3)
Here R_i denotes the task-specific robustness regularizer, and the choice of (γ_in, γ_out) determines the specific scenario of robustness-promoting MAML. Clearly, the most direct application is to set γ_in > 0 and γ_out > 0; that is, both the fine-tuning and meta-update steps would be carried out using robust training, which calls additional loops to generate adversarial examples and thus makes computation most intensive. Spurred by that, we ask: Is it possible to achieve a robust meta-model by incorporating robust regularization into only the meta-update or only the fine-tuning step (corresponding to γ_in = 0 or γ_out = 0)? Second, both MAML in (1) and AT in (2) are challenging bi-level optimization problems that must call inner optimization routines for fine-tuning and attack generation, respectively. Thus, we ask whether computationally light alternatives for the inner solvers, e.g., partial fine-tuning (Raghu et al., 2019) and fast attack generation (Wong et al., 2020), can still promise adversarial robustness in few-shot learning. Third, it has been shown that adversarial robustness can benefit from semi-supervised learning by leveraging (unlabeled) data augmentation (Carmon et al., 2019; Stanforth et al., 2019). Spurred by that, we further ask: Is it possible to generalize robustness-promoting MAML to the semi-supervised setting for an improved accuracy-robustness tradeoff?

3. WHEN TO INCORPORATE ROBUST REGULARIZATION IN MAML

In this section, we evaluate at which stage adversarial robustness can be gained during meta-training. We provide insights and step-by-step investigations to show when to incorporate robust training in MAML and why it works. Based on (3), we focus on two robustness-promoting meta-training protocols: (a) R-MAML_both, where robustness regularization is applied to both the fine-tuning and meta-update steps, i.e., γ_in, γ_out > 0; (b) R-MAML_out, where robust regularization is applied to the meta-update only, i.e., γ_in = 0 and γ_out > 0.
Compared to R-MAML_both, R-MAML_out is more user-friendly since it allows the use of standard fine-tuning over the learnt robust meta-model when tackling unseen few-shot test tasks (known as meta-testing). In what follows, we show that even though R-MAML_out does not use robust regularization in fine-tuning, it is sufficient to warrant the transferability of the meta-model's robustness to downstream fine-tuning tasks.

All you need is robust meta-update during meta-training. To study this claim, we solve problem (3) using R-MAML_both and R-MAML_out respectively in the 5-way 1-shot learning setup, where 1 data sample at each of 5 randomly selected MiniImagenet classes (Ravi & Larochelle, 2016) constitutes a learning task. Throughout this section, we specify R_i in (3) as the AT regularization, which calls a 10-step projected gradient descent (PGD) attack generation method with ε = 2/255 in its inner maximization subroutine given by (2). We refer readers to Section 6 for more implementation details. We find that the meta-model acquired by R-MAML_out yields nearly the same robust accuracy (RA) as R-MAML_both against various PGD attacks generated at the testing phase using different perturbation sizes ε ∈ {0, 2, . . . , 10}/255, as shown in Figure 1. Unless specified otherwise, we evaluate the performance of the meta-learning schemes over 2400 random unseen 5-way 1-shot test tasks. We also note that RA under ε = 0 reduces to the standard accuracy (SA) evaluated on benign (unperturbed) test examples. It is clear from Figure 1 that both R-MAML_out and R-MAML_both yield significantly better RA than MAML with only slightly worse SA. It is also expected that RA decreases as the attack power increases.
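For reference, the 10-step PGD attack generator used above can be sketched as follows; the toy gradient function in the check and the step-size rule α = 2.5ε/K (a common heuristic) are illustrative choices of ours.

```python
import numpy as np

def pgd_linf(x, grad_fn, eps=2/255, steps=10):
    """K-step PGD under an l_inf ball of radius eps: repeated signed
    gradient ascent on the robust loss g, with projection onto the ball.
    grad_fn(x_adv) should return the gradient of g w.r.t. the input."""
    alpha = 2.5 * eps / steps                      # common step-size heuristic
    delta = np.random.uniform(-eps, eps, x.shape)  # random start
    for _ in range(steps):
        delta = delta + alpha * np.sign(grad_fn(x + delta))
        delta = np.clip(delta, -eps, eps)          # project onto the eps-ball
    return x + delta

# Toy check: for a loss whose input gradient is everywhere positive,
# PGD pushes every coordinate to the +eps boundary.
x_adv = pgd_linf(np.full(4, 0.5), lambda z: np.ones_like(z))
```

In the paper's setting, `grad_fn` would come from backpropagating the AT or TRADES robust loss through the fine-tuned network.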

[Figure 2: Seed images and inverted attribution maps (IAMs) for MAML, R-MAML_both, and R-MAML_out.]

Spurred by the experimental results in Figure 1, we hypothesize that promoting robustness in the meta-update alone (i.e., R-MAML_out) is already sufficient to offer a robust representation, over which fine-tuned models preserve robustness on downstream tasks. In what follows, we justify this hypothesis from two perspectives: (i) explanation of learned neurons' representations, and (ii) resilience of the learnt robust meta-model to different fine-tuning schemes at the meta-testing phase.

(i) Learned signature of neurons' representations. It was recently shown in (Engstrom et al., 2019) that a robust model exhibits perceptually aligned neuron activation maps, which are not present if the model lacks adversarial robustness. To uncover such a signature of robustness, a feature inversion technique (Engstrom et al., 2019) is applied to find an inverted input attribution map (IAM) that maximizes a neuron's activation. Based on that, we examine whether R-MAML_both and R-MAML_out can similarly generate explainable inverted images from the learned neurons' representations. We refer readers to Appendix 2 for more details on feature inversion from neurons' activations. In our experiment, we indeed find that both R-MAML_both and R-MAML_out yield similar IAMs inverted from neurons' activations at different input examples, as plotted in Figure 2. More intriguingly, the learnt IAMs characterize the contours of objects present in the input images, accompanied by learnt high-level features, e.g., colors. In contrast, the IAMs of MAML lack such interpretability. These observations from the interpretability of neurons' representations justify why R-MAML_out is as effective as R-MAML_both and why MAML does not preserve robustness.

(ii) Robust meta-update provides robustness adaptation without additional adversarial fine-tuning at meta-testing. Meta-testing includes only the fine-tuning stage.
Therefore, we need to explore whether standard fine-tuning is enough to maintain robustness. Suppose that R-MAML_out is adopted as the meta-training method to solve problem (3); we then ask whether a robustness-regularized meta-testing strategy can improve the robustness of the fine-tuned model on downstream tasks. Surprisingly, we find that making the additional effort to adversarially fine-tune the meta-model (trained by R-MAML_out) during testing does not provide an obvious robustness improvement over the standard fine-tuning scheme (Table 1). This implies that robust meta-update (R-MAML_out) is sufficient to render intrinsic robustness in its learnt meta-model regardless of the fine-tuning strategy used at meta-testing. Figure S1 in Appendix 3 provides evidence that the visualized representations differ little before and after standard fine-tuning.

Adversarial querying (AQ) (Goldblum et al., 2019): a special case of R-MAML_out. The recent work (Goldblum et al., 2019) developed AQ to improve adversarial robustness in few-shot learning. AQ can be regarded as a special case of R-MAML_out with γ_in = 0 but γ_out = ∞ in (3); that is, the meta-update is overridden by the AT regularization. We find that AQ yields about 2% RA improvement over R-MAML_out, which uses γ_out = 0.2 in (3). However, AQ leads to an 11% degradation in SA, and thus strikes a much poorer robustness-accuracy tradeoff than our proposed R-MAML_out. We refer readers to Table 2 for a comparison of the proposed R-MAML_out with other training baselines. Most importantly, different from (Goldblum et al., 2019), we provide insights on why R-MAML_out is effective in promoting adversarial robustness from meta-update to fine-tuning.

4. COMPUTATIONALLY-EFFICIENT ROBUSTNESS-REGULARIZED MAML

In this section, we study whether the proposed R-MAML_out can further be improved for ease of optimization, given the two computational difficulties in (3): (a) bi-level meta-learning, and (b) the need for inner maximization to find the worst-case robust regularization. To tackle either problem alone, efficient solution methods have been proposed recently. In (Raghu et al., 2019), an almost-no-inner-loop (ANIL) fine-tuning strategy was proposed, where fine-tuning is only applied to the task-specific classification head following a frozen representation network inherited from the meta-model. Moreover, in (Wong et al., 2020), a fast gradient sign method (FGSM) based attack generator was leveraged to improve the efficiency of AT without losing its adversarial robustness. Motivated by (Raghu et al., 2019; Wong et al., 2020), we ask whether integrating R-MAML_out with ANIL and/or FGSM can improve training efficiency while preserving the robustness and generalization ability of a meta-model learnt by R-MAML_out.

R-MAML_out meets ANIL and FGSM. We decompose the meta-model w = [w_r, w_c] into two parts: a representation encoding network w_r and a classification head w_c. In R-MAML_out, namely (3) with γ_in = 0, ANIL suggests fine-tuning only w_c over a specific task T_i. This leads to

  w'_{c,i} = argmin_{w_{c,i}} ℓ_i(w_{c,i}, w_r; D_i, w),  with w'_{r,i} = w_r.   (ANIL)

In ANIL, the initialized representation network w_r remains intact during task-specific fine-tuning, which saves computation cost. Furthermore, if FGSM is used in R-MAML_out, then the robustness regularizer R defined in (2) reduces to

  R(w; D) = E_{(x,y)∈D}[ g(w; x + δ*(x), y) ],  δ*(x) = Π_{‖δ‖_∞ ≤ ε}( δ_0 + α sign(∇_x g(w; x + δ_0, y)) ),   (FGSM)

where δ_0 is an initial point randomly drawn from a uniform distribution over the interval [−ε, ε] and Π denotes projection onto the ε-ball. Note that in the original implementation of the robust regularization R, a multi-step projected gradient ascent (PGA) method is typically used to optimize the sample-wise adversarial perturbation δ*(x).
By contrast, FGSM uses only one-step PGA for attack generation and thus improves computational efficiency. In Table 2, we study two computationally light alternatives of R-MAML_out: R-MAML_out with ANIL (R-MAML_out-ANIL) and R-MAML_out with FGSM (R-MAML_out-FGSM). Compared to R-MAML_out, we find that although R-MAML_out-FGSM takes less computation time, it yields even better RA with only slightly worse SA. By contrast, R-MAML_out-ANIL incurs the least computation cost but the worst SA and RA. For comparison, we also present the performance of the adversarial meta-learning baseline AQ (Goldblum et al., 2019). As we can see, AQ promotes adversarial robustness at the cost of a significant SA drop, e.g., 7.56% worse than R-MAML_out-ANIL. Overall, applying FGSM to R-MAML_out provides the most graceful tradeoff between computation cost and the standard and robust accuracies. In the rest of the paper, unless specified otherwise, we use FGSM in R-MAML_out.
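A minimal sketch of the one-step FGSM generator with random initialization, in the spirit of (Wong et al., 2020); the step size α = 1.25ε follows that paper's recommendation, and the toy gradient function in the usage note is our own illustration.

```python
import numpy as np

def fgsm_rand(x, grad_fn, eps=2/255, alpha=None):
    """One-step attack generation: random start delta_0 ~ U(-eps, eps),
    a single signed gradient step on the robust loss g, then projection
    back onto the l_inf eps-ball."""
    if alpha is None:
        alpha = 1.25 * eps                          # step size from Wong et al. (2020)
    delta0 = np.random.uniform(-eps, eps, x.shape)
    delta = delta0 + alpha * np.sign(grad_fn(x + delta0))
    return x + np.clip(delta, -eps, eps)
```

Compared with the 10-step PGD generator, this costs a single gradient evaluation per example, e.g. `fgsm_rand(np.zeros(4), lambda z: np.ones_like(z))` lands every coordinate in (0, ε].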

5. SEMI-SUPERVISED ROBUSTNESS-PROMOTING MAML

Given our previous solutions to when (Sec. 3) and how (Sec. 4) robust regularization can effectively be promoted in few-shot learning, we next ask: Is it possible to further improve our proposal R-MAML_out by leveraging unlabeled data? This question is motivated from two aspects. First, unlabeled data augmentation can be a key lever to improve the robustness-accuracy tradeoff (Carmon et al., 2019; Stanforth et al., 2019). Second, the recent success of self-supervised contrastive representation learning (Chen et al., 2020b; He et al., 2020) demonstrates the power of multi-view (unlabeled) data augmentation for acquiring discriminative and generalizable visual representations, which can guide downstream supervised learning. In what follows, we propose an extension of R-MAML_out applicable to semi-supervised learning with unlabeled data augmentation.

R-MAML_out with TRADES regularization. We recall from (2) that the robust regularization R can also be specified by TRADES (Zhang et al., 2019b), which relies only on the prediction logits of benign and adversarial examples (rather than the training labels), and thus lends itself to the use of unlabeled data. Spurred by that, we propose R-MAML_out-TRADES, a variant of R-MAML_out using the unlabeled-data-augmented TRADES regularization. To perform data augmentation in experiments, we follow (Carmon et al., 2019) to mine additional (unlabeled) data, of the same amount as the MiniImagenet data, from the original ImageNet data set. For clarity, we call R-MAML_out using TRADES or AT regularization (but without unlabeled data augmentation) R-MAML_out(TRADES) or R-MAML_out(AT). We find that, with the help of unlabeled data, R-MAML_out-TRADES improves the accuracy-robustness tradeoff over its supervised counterparts using either AT or TRADES regularization (Figure 3). Compared to R-MAML_out, R-MAML_out-TRADES yields consistently better RA against different attack strengths ε ∈ {2, . . . , 10}/255 during testing. Interestingly, the improvement becomes more significant as ε increases. When ε = 0, RA is equivalent to SA, and we observe that the superior RA of R-MAML_out-TRADES comes with a slight degradation in SA compared to R-MAML_out(TRADES) and R-MAML_out(AT), reflecting the robustness-accuracy tradeoff. Figure S2 in Appendix 5 provides additional evidence that R-MAML_out-TRADES can defend against stronger attacks than R-MAML_out, and that proper unlabeled data augmentation can further improve the accuracy-robustness tradeoff in MAML.
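As a concrete illustration of why TRADES is label-free, its robust term can be computed from the clean and adversarial prediction logits alone, e.g. as the KL divergence between the two prediction distributions; the logit values in the test are placeholders of ours.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def trades_reg(logits_clean, logits_adv):
    """TRADES-style robust loss g: KL(p(x) || p(x + delta)) between the
    prediction distributions at the benign and perturbed inputs. No labels
    are needed, which is what enables unlabeled data augmentation."""
    p, q = softmax(logits_clean), softmax(logits_adv)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean())
```

The term vanishes when the perturbation leaves the prediction distribution unchanged and grows as the adversarial logits drift away from the clean ones.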

R-MAML out with contrastive learning (CL).

To improve adversarial robustness, many works, e.g., (Pang et al., 2019; Sankaranarayanan et al., 2017), suggest that it is important to encourage robust semantic features that cluster locally according to class, namely, ensuring that features of samples in the same class lie close to each other and away from those of different classes. This suggestion aligns with the goals of contrastive learning (CL) (Chen et al., 2020b; Wang & Isola, 2020), which promotes (a) alignment (closeness) of features from positive data pairs, and (b) uniformity of the feature distribution. Thus, we develop R-MAML_out-CL by integrating R-MAML_out with CL. Prior to defining R-MAML_out-CL, we first introduce CL and refer readers to (Chen et al., 2020b) for details. Given a data sample x, CL utilizes its positive counterpart x⁺ = t(x) given by a certain data transformation t, e.g., cropping and resizing, cut-out, or rotation. A data pair (x, t(x')) is then positive if x = x', and negative otherwise. The contrastive loss is defined by

  CL(w_r; p⁺) = E_{(x,x⁺)∼p⁺} [ −log ( exp(r(x; w_r)ᵀ r(x⁺; w_r)/τ) / ( exp(r(x; w_r)ᵀ r(x⁺; w_r)/τ) + Σ_{x⁻∼p, (x,x⁻)∉p⁺} exp(r(x; w_r)ᵀ r(x⁻; w_r)/τ) ) ) ],

where x ∼ p denotes the data distribution, p⁺(·,·) is the distribution of positive pairs, r(x; w_r) is the encoded representation of x extracted from the representation network w_r, and τ > 0 is a temperature parameter. The contrastive loss pulls a positive pair together against many negative pairs, namely, it learns network representations with instance-wise discriminative power. Accordingly, we augment the data used to train R-MAML_out with their transformed counterparts. In addition, the adversarial examples generated during robust regularization can also be used as additional views of the original data, which in turn advances CL.
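The contrastive loss above can be sketched for a single anchor as follows; the representation vectors and temperature are illustrative, and we omit the expectation over the positive-pair distribution p⁺.

```python
import numpy as np

def contrastive_loss(r_x, r_pos, r_negs, tau=0.5):
    """InfoNCE-style contrastive loss for one anchor x: the negative log
    of the softmax score of the positive pair (x, x+) against the
    negatives x-, with temperature tau."""
    pos = np.exp(r_x @ r_pos / tau)
    negs = np.exp(r_negs @ r_x / tau).sum()
    return float(-np.log(pos / (pos + negs)))

# The loss is smaller when the positive pair is aligned than when it is not:
aligned = contrastive_loss(np.array([1.0, 0.0]), np.array([1.0, 0.0]),
                           np.array([[0.0, 1.0]]))
misaligned = contrastive_loss(np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                              np.array([[0.0, 1.0]]))
```

In the semi-supervised extension, the adversarial view of each sample simply contributes extra positive pairs to this loss.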
Formally, we modify R-MAML_out, given by (3) with γ_in = 0, as

  minimize_w  (1/N) Σ_{i=1}^N [ ℓ'_i(w'_i; D'_i) + γ_out R'_i(w'_i; D'_i) + γ_CL CL(w'_{r,i}; p⁺_i ∪ p^adv_i) ]
  subject to  w'_i = argmin_{w_i} ℓ_i(w_i; D_i, w),  ∀i ∈ [N],   (4)

where γ_CL > 0 is a regularization parameter associated with the contrastive loss, p⁺_i ∪ p^adv_i represents the distribution of positive data pairs constructed from the standard and adversarial views of D'_i, and w'_{r,i} denotes the representation block of the model w'_i. In Table 3, we compare the SA/RA performance of R-MAML_out-CL with that of the three previously suggested variants of R-MAML_out, namely R-MAML_out(AT) and R-MAML_out(TRADES) without unlabeled data and R-MAML_out-TRADES with unlabeled data, as well as two baseline methods: standard MAML and adversarial querying (AQ) in few-shot learning (Goldblum et al., 2019). Note that we specify R_i in (4) as the TRADES regularization for R-MAML_out-CL. We find that R-MAML_out-CL yields the best RA among all meta-learning methods and improves SA over R-MAML_out-TRADES. In particular, the comparison with AQ shows that R-MAML_out-CL leads to a 9% improvement in SA and a 1.9% improvement in RA.
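Putting the pieces together, the per-task outer objective in (4) is a weighted sum of three terms; the component values below are placeholders of ours, while the weights γ_out = 5 (TRADES) and γ_CL = 0.1 are the defaults reported in Section 6.

```python
def meta_update_loss(ce_val, robust_reg, cl_loss, gamma_out=5.0, gamma_cl=0.1):
    """Per-task outer objective of Eq. (4): validation loss, plus
    gamma_out * robust regularizer (TRADES here), plus gamma_cl *
    contrastive loss over standard and adversarial views."""
    return ce_val + gamma_out * robust_reg + gamma_cl * cl_loss

# e.g., with placeholder component values:
total = meta_update_loss(ce_val=1.0, robust_reg=0.2, cl_loss=0.5)
```

The meta-update then averages this quantity over the N sampled tasks and differentiates through the (standard, γ_in = 0) inner fine-tuning.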

6. ADDITIONAL EXPERIMENTS

In the previous analyses, we consider 1-shot 5-way image classification tasks over MiniImageNet (Vinyals et al., 2016) and use a four-layer convolutional neural network for few-shot learning (FSL). By default, we set the training attack strength ε = 2/255 and γ_CL = 0.1, and set γ_out = 5 (TRADES) and γ_out = 0.2 (AT) via a grid search. During meta-testing, a 10-step PGD attack with attack strength ε = 2/255 is used to evaluate the RA of the learnt meta-model over 2400 few-shot test tasks. We provide experimental details in Appendix 4.

Summary of baselines. In addition to the MAML and AQ baselines, we also consider two further baselines: supervised standard training over the entire dataset (non-FSL setting) and supervised AT over the entire dataset (non-FSL setting); see a summary in Table 4. These additional baselines demonstrate that robust adaptation in FSL is non-trivial, as neither supervised full AT nor full standard training achieves satisfactory SA and RA.

Experiments on additional model architectures, datasets, and FSL setups. In Table S1 of Appendix 5, we provide additional experiments using ResNet18. In particular, R-MAML_out-CL leads to a 13.94% SA improvement and a 1.42% RA improvement over AQ. We also test our methods on CIFAR-FS (Bertinetto et al., 2018) and Omniglot (Lake et al., 2015), and provide the results in Table 5 and Figure S3, respectively (more details can be found in Appendix 6 and Appendix 7). The results show that our methods perform well on various datasets and outperform the baseline methods. On CIFAR-FS, we study the 1-shot 5-way and 5-shot 5-way settings. As shown in Table 5, the use of unlabeled data augmentation (R-MAML_out-CL) on CIFAR-FS provides a 10% (or 5.6%) SA improvement and a 3% (or 1.3%) RA improvement over AQ under the 1-shot 5-way (or 5-shot 5-way) setting. Furthermore, we conduct experiments in other FSL setups.
On Omniglot, we compare R-MAML_out(TRADES) to AQ (Goldblum et al., 2019) in the 1-shot (5, 10, 15, 20)-way settings. Figure S3 shows that R-MAML_out(TRADES) consistently outperforms AQ as the number of classes per task varies.

7. CONCLUSION

In this paper, we study the problem of adversarial robustness in MAML. Beyond directly integrating MAML with robust training, we show and explain when robust regularization should be promoted in MAML. We find that robustifying the meta-update stage via a fast attack generation method is sufficient to achieve fast robustness adaptation without losing generalization or computational efficiency. To further improve our proposal, we study for the first time how unlabeled data can help robust MAML. In particular, we propose using contrastive representation learning to acquire improved generalization and robustness simultaneously. Extensive experiments demonstrate the effectiveness of our approach and justify our insights on the adversarial robustness of MAML. In the future, we plan to establish a convergence-rate analysis of robustness-aware MAML by leveraging bi-level and min-max optimization theories.

5. ADDITIONAL COMPARISONS ON MINIIMAGENET

Figure S2 shows the robust accuracy (RA) performance of models trained using our methods. One can see that R-MAML out -TRADES is better able to defend against stronger attacks than R-MAML out. In Table S1, we compare the SA/RA performance of variants of R-MAML out, including R-MAML out (AT), the TRADES regularization with unlabeled data (R-MAML out -TRADES), and the version with contrastive learning (R-MAML out -CL). One can see that R-MAML out -CL yields the best SA and RA among all meta-learning methods.

6. EXPERIMENTS ON CIFAR-FS

We also test our proposed methods on CIFAR-FS (Bertinetto et al., 2018), an image classification dataset containing 64 classes of training data and 20 classes of evaluation data. The compared methods are the same as in Table 3. We keep the settings the same as in the test on MiniImageNet, except that we set ε = 8. To perform data augmentation in the experiments, we mine 500 additional unlabeled data points for each training class from the STL-10 dataset (Coates et al., 2011). Table S2 and Table S3 show the comparisons in the 1-shot 5-way and 5-shot 5-way learning scenarios, respectively. One can see that our methods outperform the baseline methods MAML and AQ (Goldblum et al., 2019). The results also indicate that semi-supervised learning (in terms of TRADES and contrastive learning) can further boost the performance. In particular, as shown in Table S2 and Table S3, R-MAML out -CL leads to a 10% SA improvement and a 3% RA improvement over AQ under the 1-shot 5-way setting, and a 5.6% SA improvement and a 1.3% RA improvement under the 5-shot 5-way setting.

Table S2: SA/RA performance of our proposed methods on CIFAR-FS (Bertinetto et al., 2018) (1-shot 5-way).

Method                               SA        RA
MAML                                 51.07%    0.235%
AQ (Goldblum et al., 2019)           31

Table S3: SA/RA performance of our proposed methods on CIFAR-FS (Bertinetto et al., 2018) (5-shot 5-way).

Method                               SA        RA
MAML                                 67.2%     0.225%
AQ (Goldblum et al., 2019)           52.32%    33.96%
R-MAML out (AT) (ours)               57.18%    32.62%
R-MAML out (TRADES) (ours)           57.46%    34.72%
R-MAML out -TRADES (ours)            57.62%    34.76%
R-MAML out -CL (ours)                57.95%    35.30%

7. EXPERIMENTS ON OMNIGLOT

Due to the hardness of finding unlabeled data with similar patterns, we only test our supervised learning methods on Omniglot. We compare R-MAML out (TRADES) to AQ (Goldblum et al., 2019) in the 1-shot (5, 10, 15, 20)-way settings.



Figure 1: RA of meta-models trained by standard MAML, R-MAML both, and R-MAMLout versus PGD attacks of different perturbation sizes during meta-testing. The results show that a robustness-regularized meta-update with standard fine-tuning (namely, R-MAMLout) is already effective in promoting robustness.

Figure 2: Visualization of a randomly selected neuron's inverted input attribution maps (IAMs) under different meta-models. The first row shows the seed images. The second to fourth rows show IAMs corresponding to models trained by MAML, R-MAML both, and R-MAMLout, respectively. Unlike MAML, both R-MAML both and R-MAMLout capture high-level features from the data.

Figure 3: RA versus (testing-phase) PGD attacks at different values of the perturbation strength ε. Here the robust models are trained by different variants of R-MAMLout, including R-MAMLout-TRADES (with unlabeled data augmentation), R-MAMLout using AT regularization but no data augmentation (R-MAMLout(AT)), and R-MAMLout using TRADES regularization but no data augmentation (R-MAMLout(TRADES)).

Figure S3: Performance of R-MAML out (TRADES) and AQ (Goldblum et al., 2019) on Omniglot versus number of classes in each task (from 5 to 20 ways): (a) RA. (b) SA.


SA/RA performance of R-MAMLout-CL versus other variants of proposed R-MAMLout and baselines.

Summary of baseline performance in SA and RA.

SA/RA performance of our proposed methods on CIFAR-FS(Bertinetto et al., 2018).

SA/RA performance of different variants of proposed R-MAML out under the 1-shot 5-way scenario on ResNet18.

ACKNOWLEDGEMENT

This work was supported by the Rensselaer-IBM AI Research Collaboration (http://airc.rpi.edu), part of the IBM AI Horizons Network (http://ibm.biz/AIHorizons).

SUPPLEMENTARY MATERIAL

1. FRAMEWORK OF R-MAML out

Algorithm S1 shows the framework of R-MAML out. The initial inputs include the model weights w, the distribution over training tasks p(T), and the step sizes α, β 1, and β 2, which correspond to fine-tuning, the clean meta-update, and the adversarial meta-update, respectively. Each batch contains multiple tasks sampled from p(T), and K is the number of gradient updates in fine-tuning. The adapted parameters w (K) i are used to generate adversarial validation data D' i from the clean validation data D i and to compute the adversarial loss value R i (w; D' i). The attack generator can be selected from Projected Gradient Descent (Madry et al., 2017), the Fast Gradient Sign Method (Goodfellow et al., 2014), etc. Here ε is used to control the attack strength during training.

Algorithm S1 R-MAML out
Input: initialization weights w; distribution over tasks p(T); step size parameters α, β 1, β 2.
1: while not done do
2:   Sample a batch of tasks T i ∼ p(T) and separate the data in each task into a fine-tuning set D tr i and a validation set D i
3:   For each task T i, compute the adapted parameters w (K) i via K standard fine-tuning steps on D tr i with step size α
4:   Use the attack generator to generate adversarial validation data D' i by maximizing the adversarial loss R i (w; D' i)
5:   Meta-update w using the clean validation loss over D i (step size β 1) and the adversarial validation loss over D' i (step size β 2)
6: end while
7: Return: w
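Under simplifying assumptions, the loop above can be sketched in code. The following is only an illustrative first-order sketch, not the released implementation: it uses a linear regression "model" with analytic gradients, a one-step FGSM-style attack generator in place of PGD, and a first-order approximation of the meta-gradient; every function and variable name here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def loss_and_grads(w, X, y):
    """Squared loss of a linear model; returns (loss, dL/dw, dL/dX)."""
    err = X @ w - y
    n = len(y)
    return 0.5 * np.mean(err ** 2), X.T @ err / n, np.outer(err, w) / n

def r_maml_out_step(w, tasks, alpha=0.01, beta1=0.001, beta2=0.001, K=5, eps=0.1):
    """One meta-iteration in the spirit of Algorithm S1: standard inner
    fine-tuning, then a meta-update on clean + adversarial validation data."""
    g_clean_sum = np.zeros_like(w)
    g_adv_sum = np.zeros_like(w)
    for X_tr, y_tr, X_val, y_val in tasks:
        # Fine-tuning: K *standard* gradient steps (no adversarial training here).
        w_i = w.copy()
        for _ in range(K):
            _, g, _ = loss_and_grads(w_i, X_tr, y_tr)
            w_i -= alpha * g
        # Attack generator: one FGSM-style ascent step on the validation
        # inputs, evaluated at the adapted parameters w_i.
        _, _, gX = loss_and_grads(w_i, X_val, y_val)
        X_adv = X_val + eps * np.sign(gX)
        # First-order meta-gradients (a first-order MAML approximation).
        _, g_clean, _ = loss_and_grads(w_i, X_val, y_val)
        _, g_adv, _ = loss_and_grads(w_i, X_adv, y_val)
        g_clean_sum += g_clean
        g_adv_sum += g_adv
    n = len(tasks)
    # Meta-update with separate step sizes for clean and adversarial losses.
    return w - beta1 * g_clean_sum / n - beta2 * g_adv_sum / n

# Synthetic few-shot-style tasks drawn from a shared linear ground truth.
w_true = rng.normal(size=5)
def make_task(n_tr=5, n_val=15):
    X = rng.normal(size=(n_tr + n_val, 5))
    y = X @ w_true + 0.01 * rng.normal(size=n_tr + n_val)
    return X[:n_tr], y[:n_tr], X[n_tr:], y[n_tr:]

w = np.zeros(5)
for _ in range(200):
    w = r_maml_out_step(w, [make_task() for _ in range(4)])
# The meta-model has moved toward the tasks' shared solution.
assert np.linalg.norm(w - w_true) < np.linalg.norm(w_true)
```

Replacing the FGSM-style step with a multi-step PGD generator, and the first-order meta-gradient with the exact one, recovers the heavier variants discussed in the main text.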

2. DETAILS OF LEARNED SIGNATURE OF NEURON'S ACTIVATION

By maximizing a single coordinate of the neuron activation vector r (the output before the fully-connected layer) with respect to a perturbation of the input, the perturbation exhibits different behaviors under a robust model versus a standard model (Engstrom et al., 2019). More specifically, the feature pattern is revealed in the input under a robust model, while a standard model does not exhibit such behavior. The optimization problem can be mathematically written in the following form

maximize_δ  r i (x + δ)    subject to  x j + δ j ∈ [0, 1] for all j,

where r i denotes the i-th coordinate of the neuron activation vector, δ is the perturbation of the input, and x j is the j-th pixel of the image vector x.
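A minimal sketch of this activation maximization, assuming a box constraint that keeps every pixel x j + δ j in [0, 1] and a toy single linear layer in place of a real network (both are our assumptions, not the paper's setup):

```python
import numpy as np

def maximize_activation(x, W, i, steps=100, lr=0.1):
    """Gradient ascent on r_i = (W x)_i over an input perturbation delta,
    clipping so that every pixel x_j + delta_j stays in [0, 1]."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        # For a linear layer r = W x, the gradient of r_i w.r.t. the
        # input is just the i-th row of W.
        delta += lr * W[i]
        delta = np.clip(x + delta, 0.0, 1.0) - x   # enforce the pixel range
    return delta

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))   # toy "activation layer" (4 neurons, 16 pixels)
x = rng.uniform(size=16)       # seed image, flattened
delta = maximize_activation(x, W, i=2)

assert np.all(0.0 <= x + delta) and np.all(x + delta <= 1.0)
assert (W @ (x + delta))[2] > (W @ x)[2]   # activation strictly increased
```

For a deep network the analytic gradient is simply replaced by backpropagation through the layers up to the chosen neuron.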

3. VISUALIZATION OF IAMS BEFORE AND AFTER FINE-TUNING IN META-TESTING

Once we obtain a model using R-MAML out, we can test the impact of standard fine-tuning on its robustness. Figure S1 shows a randomly selected neuron's inverted input attribution maps (IAMs) before and after standard fine-tuning in the meta-testing phase. The second row shows the IAMs of the model before fine-tuning, and the third row shows the IAMs after fine-tuning. One can see that the difference between the IAMs before and after fine-tuning is small, which suggests that the robust meta-update itself can provide robustness adaptation without additional adversarial training.

4. DETAILS OF EXPERIMENTS

To test the effectiveness of our methods, we employ the MiniImageNet dataset (Vinyals et al., 2016), a standard benchmark for few-shot learning. MiniImageNet contains 100 classes with 600 samples in each class. We use the training set with 64 classes and the test set with 20 classes. In our experiments, we downsize each image to 84 × 84 × 3. We consider the 1-shot 5-way image classification task, i.e., the inner gradient update (fine-tuning) is implemented using five classes and one fine-tuning image for each class in a single task. In meta-training, each batch contains four tasks, and we set the number of gradient update steps to K = 5. For the meta-update, we use 15 validation images for each class. We set the gradient step size in fine-tuning to α = 0.01, and the gradient step sizes in the meta-update to β 1 = 0.001 and β 2 = 0.001 for clean and adversarial validation data, respectively.
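For reference, the meta-training settings above can be collected in one place. The values are exactly those stated in this section; the configuration dictionary itself is merely an illustrative way to organize them.

```python
# Meta-training hyperparameters for 1-shot 5-way MiniImageNet, as stated above.
miniimagenet_config = {
    "image_size": (84, 84, 3),
    "n_way": 5,                  # classes per task
    "k_shot": 1,                 # fine-tuning images per class
    "tasks_per_batch": 4,
    "inner_steps_K": 5,          # gradient updates in fine-tuning
    "val_images_per_class": 15,
    "alpha": 0.01,               # fine-tuning step size
    "beta1": 0.001,              # meta-update step size (clean validation data)
    "beta2": 0.001,              # meta-update step size (adversarial validation data)
}
```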

