BETTER OPTIMIZATION CAN REDUCE SAMPLE COMPLEXITY: ACTIVE SEMI-SUPERVISED LEARNING VIA CONVERGENCE RATE CONTROL

Abstract

Reducing the sample complexity of deep learning (DL) has remained one of the most important problems in both theory and practice since its advent. Semi-supervised learning (SSL) tackles this task by leveraging unlabeled instances, which are usually more accessible than their labeled counterparts. Active learning (AL) directly seeks to reduce the sample complexity by training a classification network and querying unlabeled instances to be annotated by a human-in-the-loop. Under relatively strict settings, it has been shown that both SSL and AL can theoretically achieve the same performance as fully-supervised learning (SL) using far fewer labeled samples. While empirical works have shown that SSL can attain this benefit in practice, DL-based AL algorithms have yet to demonstrate success to the extent achieved by SSL. Given the accessible pool of unlabeled instances in pool-based AL, we argue that the annotation efficiency of AL algorithms that seek diversity among labeled samples can be improved upon when SSL is used as the training scheme. Equipped with a few theoretical insights, we design an AL algorithm that instead focuses on controlling the convergence rate of a classification network by actively querying instances whose inclusion in the labeled set improves the rate of convergence. We name this AL scheme convergence rate control (CRC), and our experiments show that a deep neural network trained with a combination of CRC and a recently proposed SSL algorithm can quickly achieve high performance using far fewer labeled samples than SL. In contrast to the few works combining independently developed AL and SSL (ASSL) algorithms, our method is a natural fit for ASSL, and we hope our work can catalyze research combining AL and SSL as opposed to the exclusion of either.

1. INTRODUCTION

The data-hungry nature of supervised deep learning (DL) algorithms has spurred interest in active learning (AL), where a model can interact with a dedicated annotator and request unlabeled instances to be labeled. In the pool-based AL setting, a model initially has access to a set of unlabeled samples and can query instances that need to be labeled for training. Under certain conditions on the task, AL can provably achieve up to exponential improvement in sample complexity and thus has great potential for reducing the number of labeled instances required to achieve high accuracy. This is especially important when the annotation task is extremely costly, for example, in medical imaging, where only highly specialized experts can diagnose a subject's condition. Active learning algorithms have been extensively explored, with formulations including uncertainty-based sampling (Wang & Shang, 2014), aligning the labeled and unlabeled distributions (Gissin & Shalev-Shwartz, 2019) with connections to domain adaptation (Ben-David et al., 2010), and coresets (Sener & Savarese, 2018). Furthermore, there is no standard method for modeling a deep neural network's (DNN) uncertainty, and uncertainty-based AL has its own variants, ranging from utilizing Bayesian networks (Kirsch et al., 2019) to using a model's predictive confidence (Wang & Shang, 2014). This ambiguous characterization of how much information a sample's label carries has also motivated AL algorithms based on maximizing the expected change of a classification model (Huang et al., 2016; Ash et al., 2020).
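To make the pool-based setting concrete, one round of a least-confidence query in the spirit of Wang & Shang (2014) could be sketched as follows; this is a minimal illustration, not the method proposed in this paper, and the function name and toy probabilities are ours:

```python
import numpy as np

def least_confidence_query(probs, k):
    """Return indices of the k pool instances whose predicted top-class
    probability is lowest, i.e., where the model is least confident."""
    confidence = probs.max(axis=1)          # top-class probability per instance
    return np.argsort(confidence)[:k]       # k least-confident indices

# Toy pool of 5 unlabeled instances with 3-class predictive probabilities
# (illustrative numbers only).
probs = np.array([
    [0.98, 0.01, 0.01],   # highly confident
    [0.40, 0.35, 0.25],   # uncertain
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],   # most uncertain
    [0.90, 0.05, 0.05],
])
query = least_confidence_query(probs, k=2)  # indices sent to the annotator
```

Here the two near-uniform rows (indices 3 and 1) are selected for labeling; an AL loop would then retrain on the enlarged labeled set and repeat.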

