IALE: IMITATING ACTIVE LEARNER ENSEMBLES

Abstract

Active learning (AL) prioritizes the labeling of the most informative data samples. However, the performance of AL heuristics depends on the structure of the underlying classifier model and the data. We propose an imitation learning scheme that imitates the selection of the best expert heuristic at each stage of the AL cycle in a batch-mode pool-based setting. We use DAGGER to train the policy on a dataset and later apply it to datasets from similar domains. With multiple AL heuristics as experts, the policy reflects the choices of the best AL heuristic given the current state of the AL process. Our experiments on well-known datasets show that we outperform both state-of-the-art imitation learners and heuristics.

1. INTRODUCTION

The high performance of deep learning on various tasks, from computer vision (Voulodimos et al., 2018) to natural language processing (NLP) (Barrault et al., 2019), comes with a major drawback: the large amount of labeled training data it requires. Obtaining such data is expensive and time-consuming and often requires domain expertise. Active Learning (AL) is an iterative process in which, during every iteration, an oracle (e.g., a human) is asked to label the most informative unlabeled data sample(s). In pool-based AL all data samples are available, though most of them are unlabeled. In batch-mode pool-based AL, we select unlabeled data samples from the pool in acquisition batches of size greater than one. Batch-mode AL decreases the number of AL iterations required and makes it easier for an oracle to label the data samples (Settles, 2009). As a selection criterion we typically quantify how informative a label for a particular sample would be. Well-known criteria include heuristics such as model uncertainty (Gal et al., 2017; Roth & Small, 2006; Wang & Shang, 2014; Ash et al., 2020), data diversity (Sener & Savarese, 2018), query-by-committee (Beluch et al., 2018), and expected model change (Settles et al., 2008). Since we ideally label the most informative data samples at each iteration, a machine learning model trained on a labeled subset selected by an AL strategy performs better than a model trained on a randomly sampled subset of the data. Besides the heuristics mentioned above, several other data-driven AL approaches have emerged recently. Some model the data distribution (Mahapatra et al., 2018; Sinha et al., 2019; Tonnaer, 2017; Hossain et al., 2018) as a pre-processing step, or similarly use metric-based meta-learning (Ravi & Larochelle, 2018; Contardo et al., 2017) as a clustering algorithm.
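To make the batch-mode pool-based setting concrete, the following is a minimal sketch of an AL loop that acquires batches via entropy-based uncertainty sampling. The `NearestCentroid` classifier and all names here are illustrative assumptions chosen to keep the sketch self-contained, not part of our method:

```python
import numpy as np

class NearestCentroid:
    """Tiny probabilistic classifier, used only to keep the sketch self-contained."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict_proba(self, X):
        # softmax over negative distances to each class centroid
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        e = np.exp(-(d - d.min(axis=1, keepdims=True)))
        return e / e.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[self.predict_proba(X).argmax(axis=1)]

def entropy(probs):
    # Shannon entropy per sample; higher means the model is less certain
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def batch_mode_al(X, y, acquisition_size=10, iterations=5, seed=0):
    """Pool-based AL: repeatedly label the `acquisition_size` most uncertain samples."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X), size=acquisition_size, replace=False))
    # guard against a degenerate single-class seed set
    for c in np.unique(y):
        if c not in y[labeled]:
            labeled.append(int(np.flatnonzero(y == c)[0]))
    pool = [i for i in range(len(X)) if i not in labeled]
    model = NearestCentroid()
    for _ in range(iterations):
        model.fit(X[labeled], y[labeled])
        probs = model.predict_proba(X[pool])
        # acquire the batch of most informative (highest-entropy) pool samples
        ranked = np.argsort(-entropy(probs))[:acquisition_size]
        chosen = [pool[i] for i in ranked]
        labeled += chosen                       # the oracle labels the batch
        pool = [i for i in pool if i not in chosen]
    model.fit(X[labeled], y[labeled])
    return model, labeled
```

Swapping the `entropy` scoring function for a diversity or query-by-committee score changes the heuristic without touching the acquisition loop, which is what makes an ensemble of expert heuristics straightforward to plug in.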
Others focus on the heuristics themselves and predict the most suitable one using a multi-armed bandits approach (Hsu & Lin, 2015). Recent approaches that use reinforcement learning (RL) learn strategies directly from data (Woodward & Finn, 2018; Bachman et al., 2017; Fang et al., 2017). Instead of pre-processing data or selecting a suitable heuristic, they aim to learn an optimal selection sequence for a given task. However, these pure RL approaches not only require a huge number of samples but also fail to exploit existing knowledge, such as potentially available AL heuristics. Moreover, training the RL agents is usually very time-intensive as they are trained from scratch. In contrast, imitation learning (IL) helps in settings where very little labeled training data but a potent algorithmic expert is available. IL aims to train, i.e., clone, a policy that transfers the expert's behavior to the related low-data problem. While IL mitigates some of the aforementioned issues of RL, current approaches are still limited with respect to their algorithmic expert and their acquisition size (including that of Liu et al. (2018)), i.e., some pick only one sample per iteration, and have so far only been evaluated on NLP tasks. We propose a batch-mode AL approach that enables larger acquisition sizes and that makes use of a more diverse set of experts from different heuristic families, i.e., uncertainty, diversity,

