PROGRESSIVE VORONOI DIAGRAM SUBDIVISION ENABLES ACCURATE DATA-FREE CLASS-INCREMENTAL LEARNING

Abstract

Data-free Class-incremental Learning (CIL) is a challenging problem because rehearsing data from previous phases is strictly prohibited, causing catastrophic forgetting of Deep Neural Networks (DNNs). In this paper, we present iVoro, a novel framework derived from computational geometry. We found Voronoi Diagram (VD), a classical model for space subdivision, is especially powerful for solving the CIL problem, because VD itself can be constructed favorably in an incremental manner: the newly added sites (classes) will only affect the proximate classes, making the non-contiguous classes hardly forgettable. Furthermore, we bridge DNN and VD using Power Diagram Reduction, and show that the VD structure can be progressively refined along the phases using a divide-and-conquer algorithm. Moreover, our VD construction is not restricted to the deep feature space, but is also applicable to multiple intermediate feature spaces, promoting VD to be a multilayer VD that efficiently captures multi-grained features from the DNN. Importantly, iVoro is also capable of handling uncertainty-aware test-time Voronoi cell assignment and has exhibited high correlations between geometric uncertainty and predictive accuracy (up to ∼0.9). Putting everything together, iVoro achieves up to 25.26%, 37.09%, and 33.21% improvements on CIFAR-100, TinyImageNet, and ImageNet-Subset, respectively, compared to the state-of-the-art non-exemplar CIL approaches. In conclusion, iVoro enables highly accurate, privacy-preserving, and geometrically interpretable CIL that is particularly useful when cross-phase data sharing is forbidden, e.g., in medical applications.

1. INTRODUCTION

In many real-world applications such as medical imaging-based diagnosis, the learning system is usually required to be expandable to new classes, for example, from common to rare inherited retinal diseases (IRDs) (Miere et al., 2020), or from coarse to fine chest radiographic findings (Syeda-Mahmood et al., 2020), and importantly, without losing the knowledge already learned. This motivates the concept of incremental learning (IL) (Hou et al., 2019; Wu et al., 2019; Zhu et al., 2021; Liu et al., 2021b), also known as continual learning (Parisi et al., 2019; Delange et al., 2021; Chaudhry et al., 2019), which has drawn growing interest in recent years. Although Deep Neural Networks (DNNs) have become the de facto method of choice due to their extraordinary ability to learn from complex data, they still suffer from severe catastrophic forgetting when trained on new classes without access to old data. Data-free CIL is difficult in three respects. (I) Most existing methods employ a Knowledge Distillation (KD) loss (Li & Hoiem, 2017; Schwarz et al., 2018; Castro et al., 2018; Hou et al., 2019; Dhar et al., 2019; Douillard et al., 2020; Zhu et al., 2021) to partially maintain the spatial distribution of old classes. The KD loss, however, is typically applied to the whole network, and a strong KD loss may degrade the network's ability to adapt to novel classes. (II) Without full access to old data, the decision boundaries cannot be learned precisely, making it harder to discriminate between old and new classes. Taking inspiration from metric-based Few-shot Learning (FSL) (Snell et al., 2017), PASS (Zhu et al., 2021) memorizes a set of prototypes (feature centroids) and generates features augmented by Gaussian noise for joint training in new phases. However, a feature centroid may be suboptimal for representing the whole class, which is not necessarily normally distributed (Fig. 2(B)). (III) Since the old classes and the new classes are learned in a disjoint manner, their distributions are likely to overlap, and this becomes even more severe in our exemplar-free setting, where the old data is totally absent.
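The prototype-replay idea behind PASS, and its limitation noted above, can be illustrated with a minimal numerical sketch. The feature dimension, noise scale, and sample counts below are illustrative assumptions, not the settings used in PASS or in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical deep features for one old class (dimension d = 8).
d = 8
old_feats = rng.normal(loc=2.0, scale=0.5, size=(100, d))

# PASS-style prototype: the centroid of the class in feature space.
prototype = old_feats.mean(axis=0)

# In later phases the old data is gone; pseudo-features for the old class
# are generated by perturbing the stored prototype with Gaussian noise
# (the noise scale sigma is a hyperparameter, assumed here to be 0.5).
sigma = 0.5
pseudo_feats = prototype + sigma * rng.normal(size=(100, d))

# The pseudo-features are unimodal and centered on the prototype by
# construction, so a class whose true feature distribution is multimodal
# or skewed is represented only approximately.
print(pseudo_feats.shape)
```

A single centroid plus isotropic noise cannot recover a multimodal class distribution, which is exactly the limitation motivating the Voronoi-based view taken in this paper.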
To circumvent this issue, Task-incremental learning (TIL) (Shin et al., 2017; Kirkpatrick et al., 2017; Zenke et al., 2017; Wu et al., 2018; Lopez-Paz & Ranzato, 2017; Buzzega et al., 2020; Cha et al., 2021; Pham et al., 2021; Fernando et al., 2017) assumes that the phase within which a class was learned is known at test time, which is generally unrealistic in practice; CIL is not grounded on this assumption. In this paper, we tackle the CIL problem from a geometric point of view. The Voronoi Diagram (VD) is a classical model for space subdivision and is the underlying geometric structure of the 1-nearest-neighbor classifier (Lee, 1982). We find that VD bears a close analogy to incremental learning, because a VD itself can be constructed favorably in an incremental manner: the newly added sites (classes) will roughly change only the cells of the neighboring classes, making the non-contiguous classes hardly forgettable.



Figure 1: Schematic illustrations of Voronoi Diagram (VD) for base sites (A), and when a new site (B) or a clique of new sites (C) is added to the system.

Figure 2: Visualization of Voronoi Diagrams induced by (A) incremental fine-tuning, (B) PASS (Zhu et al., 2021), (C) iVoro, and (D) iVoro-AC on the MNIST dataset in R^2 (best viewed in color). The dataset was split into 4, 3, and 3 disjoint classes. (See Appendix B for details.)

