FP AINET: FUSION PROTOTYPE WITH ADAPTIVE INDUCTION NETWORK FOR FEW-SHOT LEARNING
Anonymous authors
Paper under double-blind review

Abstract

A prototypical network treats all samples equally and does not account for noisy samples, which leads to a biased class representation. In this paper, we propose a novel fusion prototype with an adaptive induction network (FP AINet) for few-shot learning that can learn representative prototypes from a few support samples. Specifically, to address the problem of noisy samples, an adaptive induction network is developed that can learn different class representations for queries and assign adaptive scores to support samples according to their relative significance. Moreover, FP AINet can generate a more accurate prototype than comparison methods by considering the query-related samples. As the number of samples increases, the prototypical network becomes more expressive, whereas the adaptive induction network ignores relative local features. Therefore, a Gaussian fusion algorithm is designed to learn more representative prototypes. Extensive experiments are conducted on three datasets: miniImageNet, tieredImageNet, and CIFAR-FS. Experimental results compared with state-of-the-art few-shot learning methods demonstrate the superiority of FP AINet.

1. INTRODUCTION

Few-shot learning aims to learn classifiers for novel classes with limited data. The prototypical network (PN) (Snell et al. (2017)) averages the support features to form the prototype. While most previous research has achieved promising results, these methods generally assume that the samples used for training were carefully selected to represent their class. The expected prototype should have the smallest distance from all other samples in its class (Liu et al. (2020)), and each sample contributes significantly to the final performance when training from a few labeled samples. Unfortunately, existing datasets frequently contain mislabeled samples because of weakly supervised automated annotation, ambiguity, or human error (Liang et al. (2022)). In addition, since some images contain multiple objects and unrelated background information, accuracy can be affected by a single noisy example. As illustrated in Figure 1(a), the PN is easily affected by noisy samples.

Meta-learning approaches have become the dominant paradigm for few-shot learning (Chen et al. (2020); Tian et al. (2020); Yao et al. (2021)). They can be roughly divided into two categories: optimization-based methods (Antoniou et al. (2019); Kao et al. (2022)) and metric-based methods (Vinyals et al. (2016); Sung et al. (2018)). Optimization-based methods learn the model's parameters so that it can readily adapt to each task using gradient descent; however, these methods need to be fine-tuned for the target tasks. Metric-based methods are more efficient and applicable than optimization-based methods. They learn a good metric that calculates the similarity between the query and the support samples using a pre-defined distance function, such as cosine similarity (Vinyals et al. (2016)), Euclidean distance (Snell et al. (2017); Koch et al. (2015)), earth mover's distance (Zhang et al. (2020)), or a distance parameterized by a neural network (Sung et al. (2018); Zhang et al. (2018)), and have achieved remarkable success due to their fewer parameters. To obtain more representative prototypes, many methods correct the prototype by using similar samples (Yang et al. (2021); Liu et al. (2020)) or additional knowledge (Zhang et al. (2021)), but since these easily introduce sample noise or class differences, a novel method, fusion prototype with an adaptive induction network (FP AINet), is proposed to solve this issue. The induction network (Geng et al. (2019)) designs a non-linear mapping from the sample vector to the class vector to diminish prototype bias. However, since the model has not seen query samples before extracting support features, some inappropriate features may be extracted, resulting in a significant deviation in the prototype estimate.

An adaptive induction network (AINet) is proposed to extract more reliable prototypes for each class. The AINet does not take into account the local relative importance of different regions in a sample, while the prototype generated by the PN becomes more discriminative and expressive as the number of support samples increases, as shown in Figure 1(b). To address the problem that a single prototype estimate is not comprehensive, we assume the estimated prototype follows a multivariate Gaussian distribution (Zhang et al. (2021)). Specifically, the features in the target task are transformed using the Yeo-Johnson transformation, and then the two kinds of prototypes, generated by AINet and the PN respectively, are combined. Finally, the performance of FP AINet is evaluated on miniImageNet, tieredImageNet, and CIFAR-FS, and ablation experiments validate the effectiveness of FP AINet. Experimental results show that FP AINet generates more representative prototypes and improves the accuracy of few-shot learning. The main contributions are summarized as follows: (1) A novel method, AINet, is proposed to automatically assign scores to support samples based on their relevance. (2) A modified Gaussian-based fusion algorithm is employed to aggregate prototypes from the PN and AINet by exploring the unlabeled samples. (3) Extensive experiments on three datasets demonstrate the effectiveness of FP AINet.

Figure 1: Different prototype models. (a) Prototype with noisy samples: a sample (orange circle) is misclassified by the PN; different colors represent different classes. (b) Test accuracy of different prototypes on miniImageNet under the 5-way k-shot setting.

2. RELATED WORK

Unlike conventional machine learning, which relies on abundant training examples, few-shot learning requires a classifier that can quickly adapt to novel classes with limited examples. Many efforts have been made to address this issue of data efficiency. Metric-based methods. To boost the performance of the PN, the task-dependent adaptive metric (TADAM) (Oreshkin et al. (2018)) proposes metric scaling and task conditioning. It is difficult to represent the distribution of a class with limited samples, so many methods have been proposed to correct bias in prototype estimates (Hou & Sato (2021); Yang et al. (2021)). BD-CSPN (Liu et al. (2020)) modifies prototypes by diminishing intra-class and cross-class bias; pseudo-labels are used to reduce intra-class bias, but they easily introduce noise. Rather than relying on a pre-defined metric to calculate similarity (Vinyals et al. (2016)), the relation network (Sung et al. (2018)) and the deep comparison network (Zhang et al. (2018)) train deep neural networks to compare each query-support image pair. While previous methods adopted the first moment as the class representation (Snell et al. (2017)), CovaMNet (Li et al. (2019)) adopts the second moment for feature description. Unlike the above methods, multi-level metric learning (Chen et al. (2022)) measures the similarity at three different feature levels. According to the above analysis, most existing methods ignore noisy samples, resulting in biased class representations. To solve this issue, this paper proposes a more accurate prototype estimation method to improve few-shot image classification performance.
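The PN baseline discussed above averages support features per class and classifies a query by its nearest prototype under Euclidean distance. A minimal sketch of that mechanism follows; the toy 2-way 2-shot episode, the 2-D features, and the function names are illustrative assumptions, not the paper's actual embedding pipeline.

```python
import numpy as np

def pn_prototypes(support, labels, n_way):
    """Prototypical-network prototypes: the per-class mean of support features."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_way)])

def classify(query, prototypes):
    """Assign each query to the class whose prototype is nearest in Euclidean distance."""
    d = np.linalg.norm(query[:, None, :] - prototypes[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Toy 2-way 2-shot episode with 2-D features (illustrative only).
support = np.array([[0.0, 0.0], [0.2, 0.0],   # class 0
                    [1.0, 1.0], [1.2, 1.0]])  # class 1
labels = np.array([0, 0, 1, 1])
protos = pn_prototypes(support, labels, n_way=2)
query = np.array([[0.1, 0.1], [1.1, 0.9]])
print(classify(query, protos))  # -> [0 1]
```

Because the prototype is an unweighted mean, a single mislabeled or noisy support sample shifts it directly, which is exactly the failure mode Figure 1(a) illustrates.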

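AINet's architecture is defined later in the paper; the sketch below only illustrates the underlying idea of query-dependent adaptive scores, here realized as a softmax over query-support cosine similarities. The temperature `tau`, the choice of cosine similarity, and the toy features are assumptions for illustration.

```python
import numpy as np

def adaptive_prototype(support, query, tau=1.0):
    """Query-conditioned prototype for one class: support samples more similar
    to the query receive larger softmax weights, so a noisy support sample
    contributes less to the class representation than under plain averaging."""
    s = support / np.linalg.norm(support, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = s @ q / tau                  # cosine similarity to the query
    w = np.exp(scores - scores.max())    # numerically stable softmax
    w /= w.sum()                          # adaptive scores sum to 1
    return w @ support                    # weighted mean instead of plain mean

support = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])  # last row: a noisy sample
query = np.array([1.0, 0.0])
proto = adaptive_prototype(support, query, tau=0.1)
# The noisy third sample receives a near-zero weight, pulling the prototype
# toward the two clean, query-related samples.
```

In contrast, the unweighted PN mean of these three support rows would sit at roughly [0.63, 0.37], dragged toward the noisy sample.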

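The fusion step assumes the estimated prototype follows a multivariate Gaussian and applies the Yeo-Johnson transformation before combining the PN and AINet prototypes. As a rough illustration only (not the paper's exact algorithm), the sketch below uses a fixed transform parameter `lam` and a convex combination with an assumed mixing weight `alpha`; in practice the Yeo-Johnson parameter is typically fitted, e.g. by maximum likelihood.

```python
import numpy as np

def yeo_johnson(x, lam=0.5):
    """Element-wise Yeo-Johnson power transform with a fixed lambda (for
    brevity); it maps features toward Gaussianity and handles negatives."""
    x = np.asarray(x, dtype=float)
    pos, neg = x >= 0, x < 0
    out = np.empty_like(x)
    if lam != 0:
        out[pos] = ((x[pos] + 1.0) ** lam - 1.0) / lam
    else:
        out[pos] = np.log1p(x[pos])
    if lam != 2:
        out[neg] = -(((-x[neg]) + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam)
    else:
        out[neg] = -np.log1p(-x[neg])
    return out

def fuse_prototypes(p_pn, p_ainet, alpha=0.5):
    """Convex combination of the PN and AINet prototypes after transforming
    them toward Gaussianity (alpha is an assumed hyperparameter)."""
    return alpha * yeo_johnson(p_pn) + (1.0 - alpha) * yeo_johnson(p_ainet)

p_pn = np.array([0.8, 3.0, -0.5])      # illustrative prototype estimates
p_ainet = np.array([1.0, 2.0, -0.2])
fused = fuse_prototypes(p_pn, p_ainet, alpha=0.5)
```

The convex combination lets the fused prototype lean on the PN estimate when many support samples are available (where the PN is more expressive) and on AINet when noise dominates, matching the trade-off described in the introduction.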