ONLINE CONTINUAL LEARNING FOR PROGRESSIVE DISTRIBUTION SHIFT (OCL-PDS): A PRACTITIONER'S PERSPECTIVE

Abstract

We introduce the novel OCL-PDS problem: Online Continual Learning for Progressive Distribution Shift. PDS refers to the subtle, gradual, and continuous distribution shift that widely exists in modern deep learning applications, and it is widely observed in industry that PDS can cause significant performance drops. While previous work in continual learning and domain adaptation addresses this problem to some extent, our investigations from the practitioner's perspective reveal flawed assumptions that limit their applicability to the daily challenges faced in real-world scenarios; this work aims to close the gap between academic research and industry. For this new problem, we build 4 new benchmarks from the Wilds dataset (Koh et al., 2021), and implement 12 algorithms and baselines, including both supervised and semi-supervised methods, which we test extensively on the new benchmarks. We hope that this work can provide practitioners with tools to better handle realistic PDS, and help scientists design better OCL algorithms.

1. INTRODUCTION

In most modern deep learning applications, the input data undergoes a continual distribution shift over time. For example, consider a satellite image classification task as illustrated in Figure 1a. In this task, the input data distribution changes with time due to changes in landscape, and camera updates that can lead to higher image resolutions and wider color bands. Similarly, in a toxic language detection task on social media, illustrated in Figure 1b, the distribution shift can be caused by a shift in trends and hot topics (many people post about hot topics like BLM (Wikipedia contributors, 2022a) and Roe v. Wade (Wikipedia contributors, 2022b) on social media), or a shift in language use (Röttger & Pierrehumbert, 2021; Luu et al., 2022). Such distribution shifts can cause significant performance drops in deep models, a widely observed phenomenon known as model drift.

A critical problem for practitioners, therefore, is how to deal with what we term progressive distribution shift (PDS), defined as the subtle, gradual, and continuous distribution shift that widely exists in modern deep learning applications. In this work, we explore handling PDS with online continual learning (OCL), where the learner collects, learns, and is evaluated on online samples from a continually changing data distribution. In Section 2, we formulate the OCL-PDS problem.

The OCL-PDS problem is closely related to two research areas, domain adaptation (DA) and continual learning (CL), in which there is a rich body of academic work. However, through a literature review and our conversations with practitioners, we find that there still remains a gap between the settings widely used in academic work and those in real industrial applications. To close this gap, we commit ourselves to thinking from a practitioner's perspective, which is the core spirit of this work. Our primary goal is to build tools for investigating the real issues practitioners are facing in their day-to-day work.
To achieve this goal, we challenge the prevailing assumptions in previous work, and propose three important modifications to the conventional DA and CL problem settings:

1. Task-free: One assumption conventional DA and CL settings have in common is the existence of clear boundaries between distinct domains (or tasks), but in industry practitioners rarely apply the same model to very different domains. In contrast, OCL studies the task-free CL setting (Aljundi et al., 2019b), where there is no clear boundary and the distribution shift is continuous. Moreover, in OCL both training and evaluation are online, unlike previous task-free CL settings with offline evaluation, which is less realistic in a "lifelong" setting.
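To make the online-training-and-evaluation protocol concrete, the following is a minimal sketch of an evaluate-then-adapt loop under a continuously drifting distribution. This is an illustrative assumption, not the paper's implementation: `SimpleModel` is a toy one-dimensional threshold classifier standing in for a deep model, and the decision boundary of the data stream drifts slowly to mimic PDS with no task boundaries.

```python
from typing import Iterable, Tuple, List

class SimpleModel:
    """Toy 1-D threshold classifier standing in for a deep model (illustrative)."""
    def __init__(self, threshold: float = 0.0, lr: float = 0.5):
        self.threshold = threshold
        self.lr = lr

    def predict(self, x: float) -> int:
        return 1 if x > self.threshold else 0

    def update(self, x: float, y: int) -> None:
        # On a mistake, nudge the threshold toward the misclassified point.
        if self.predict(x) != y:
            direction = -1.0 if y == 1 else 1.0
            self.threshold += self.lr * direction * abs(x - self.threshold)

def run_ocl(stream: Iterable[Tuple[float, int]], model: SimpleModel) -> List[int]:
    """Online continual learning loop: each sample is first used for
    evaluation (prequential/online accuracy), then for a model update.
    There are no task boundaries; the stream may drift continuously."""
    correct = []
    for x, y in stream:
        correct.append(int(model.predict(x) == y))  # evaluate on the online sample
        model.update(x, y)                          # then learn from it
    return correct
```

The key contrast with offline evaluation is that each sample contributes to the reported metric *before* the model sees its label, so the metric directly reflects how well the learner tracks the drifting distribution.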

