

Abstract

Meta-learning enables a model to learn from very limited data to undertake a new task. In this paper, we study general meta-learning in the presence of adversarial samples. We present a meta-learning algorithm, ADML (ADversarial Meta-Learner), which leverages both clean and adversarial samples to optimize the initialization of a learning model in an adversarial manner. ADML has the following desirable properties: 1) it is very effective even in cases with only clean samples; 2) it is robust to adversarial samples, i.e., unlike other meta-learning algorithms, it suffers only a minor performance degradation when adversarial samples are present; and 3) it sheds light on tackling cases with limited and even contaminated samples. Extensive experimental results show that, in terms of both accuracy and robustness, ADML outperforms several representative meta-learning algorithms in cases involving adversarial samples generated by different attack mechanisms, on two widely-used image datasets, MiniImageNet and CIFAR100.

1. INTRODUCTION

Deep learning has achieved tremendous success and emerged as a de facto approach in many application domains, such as computer vision and natural language processing; it depends heavily, however, on huge amounts of labeled training data. The goal of meta-learning is to enable a model (especially a Deep Neural Network (DNN)) to learn to undertake a new task from only a small number of data samples, which is critically important to machine intelligence but turns out to be very challenging. Currently, a common approach is to train a model to undertake a task from scratch without making use of any previous experience: the model is initialized randomly and then updated slowly using gradient descent with a large number of training samples. This time-consuming and data-hungry training process is quite different from the way a human learns quickly from only a few samples and obviously cannot meet the requirements of meta-learning. Several methods (Finn et al. (2017); Vinyals et al. (2016); Snell et al. (2017); Sung et al. (2018)) have been proposed to address this issue. For example, a well-known work (Finn et al. (2017)) presents a novel meta-learning algorithm called MAML (Model-Agnostic Meta-Learning), which carefully optimizes the initialization of model parameters such that the model achieves maximal performance on a new task after its parameters are updated through one or just a few gradient steps with a small amount of data. The method is claimed to be model-agnostic since it can be directly applied to any learning model trainable with gradient descent.

Robustness is another major concern for machine intelligence, especially in safety-critical applications such as facial recognition, algorithmic trading, and copyright control. It has been shown that learning models can be easily fooled by adversarial manipulation, causing serious security threats (Zhao et al. (2018); Goldblum et al. (2020); Saadatpanah et al. (2019)), which, however, can be properly and effectively handled by conventional adversarial training and pre-processing defenses (Madry et al. (2017); Zhang et al. (2019); Samangouei et al. (2018)). Nonetheless, if the data is limited (e.g., face recognition from only a few images), the aforementioned pipelines, which require a large amount of training data, suffer from serious performance degradation. Although meta-learning based approaches show great potential for dealing with few-shot tasks, we show via experiments that existing meta-learning algorithms (such as MAML (Finn et al. (2017)), Matching Networks (Vinyals et al. (2016)) and Relation Networks (Sung et al. (2018))) are also vulnerable to adversarial samples, i.e., adversarial samples can lead to a significant performance degradation

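To make the MAML idea concrete, the following is a minimal sketch (not the paper's ADML, and only a first-order approximation of MAML that ignores second-order terms in the meta-gradient) on hypothetical toy 1-D regression tasks, where each task asks a linear model f(x) = w*x to match a target slope. The meta-update optimizes the initialization w so that one inner gradient step adapts well to each task:

```python
import numpy as np

# First-order MAML-style sketch on toy 1-D regression tasks (illustrative
# setup, not from the paper). Each task t asks f(x) = w * x to fit a
# target slope s_t; we seek an initialization w that performs well on a
# task after ONE inner gradient step.

rng = np.random.default_rng(0)
alpha, beta = 0.1, 0.01          # inner / outer (meta) learning rates

def task_loss_grad(w, slope, xs):
    """Squared-error loss and its gradient for f(x) = w*x vs. slope*x."""
    loss = np.mean(((w - slope) * xs) ** 2)
    grad = np.mean(2.0 * (w - slope) * xs ** 2)
    return loss, grad

w = 0.0                          # meta-initialization being optimized
slopes = [-2.0, 1.0, 3.0]        # a small family of training tasks

for step in range(500):
    meta_grad = 0.0
    for s in slopes:
        xs = rng.normal(size=10)              # few-shot support set
        _, g = task_loss_grad(w, s, xs)
        w_adapted = w - alpha * g             # one inner gradient step
        xq = rng.normal(size=10)              # query set
        _, g_q = task_loss_grad(w_adapted, s, xq)
        meta_grad += g_q                      # first-order meta-gradient
    w -= beta * meta_grad / len(slopes)       # outer (meta) update
```

Exact MAML would differentiate through the inner update (requiring second derivatives); the first-order variant above simply evaluates the query-set gradient at the adapted parameters, which is cheaper and often works comparably in practice.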

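As a concrete illustration of how adversarial samples are typically generated, here is a sketch of the Fast Gradient Sign Method (FGSM) on a hypothetical toy logistic classifier (the weights and inputs are illustrative, not from the paper): the input is nudged by eps in the sign of the loss gradient with respect to the input, which is enough to flip the prediction of a correctly classified point:

```python
import numpy as np

# FGSM sketch: x_adv = x + eps * sign(dL/dx), where L is the
# cross-entropy loss of a toy logistic model p = sigmoid(w.x + b).

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One-step FGSM attack on the logistic model."""
    p = sigmoid(np.dot(w, x) + b)
    # Gradient of the cross-entropy loss w.r.t. the INPUT x is (p - y) * w.
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

w = np.array([2.0, -1.0]); b = 0.0       # illustrative fixed model
x = np.array([1.0, 0.5]); y = 1.0        # correctly classified clean point
x_adv = fgsm(x, y, w, b, eps=0.6)        # perturbed (adversarial) point
```

The perturbation budget eps bounds the L-infinity distance between the clean and adversarial samples, which is why such samples can look unchanged to a human while degrading model accuracy.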