CLASS BALANCING GAN WITH A CLASSIFIER IN THE LOOP

Anonymous

Abstract

Generative Adversarial Networks (GANs) have swiftly evolved to imitate increasingly complex image distributions. However, the majority of developments focus on the performance of GANs on balanced datasets. We find that existing GANs and their training regimes, which work well on balanced datasets, fail to be effective in the case of imbalanced (i.e., long-tailed) datasets. In this work we introduce a novel and theoretically motivated Class Balancing regularizer for training GANs. Our regularizer makes use of the knowledge from a pre-trained classifier to ensure balanced learning of all the classes in the dataset. This is achieved by modelling the effective class frequency based on the exponential forgetting observed in neural networks, and by encouraging the GAN to focus on underrepresented classes. We demonstrate the utility of our contribution in two diverse scenarios: (i) learning representations for long-tailed distributions, where we achieve better performance than existing approaches, and (ii) generation of Universal Adversarial Perturbations (UAPs) in the data-free scenario for large-scale datasets, where we bridge the gap between data-driven and data-free approaches for crafting UAPs.

1. INTRODUCTION

Image generation has witnessed unprecedented success in recent years following the invention of Generative Adversarial Networks (GANs) by Goodfellow et al. (2014). GANs have improved significantly over time with the introduction of better architectures (Gulrajani et al., 2017; Radford et al., 2015), the formulation of superior objective functions (Jolicoeur-Martineau, 2018; Arjovsky et al., 2017), and regularization techniques (Miyato et al., 2018). An important breakthrough for GANs has been the ability to effectively use class-conditioning information for synthesizing images (Mirza & Osindero, 2014; Miyato & Koyama, 2018). Conditional GANs have been shown to scale to large datasets such as ImageNet (Deng et al., 2009) with 1000 classes (Miyato & Koyama, 2018). One of the major issues with unconditional GANs has been their inability to produce balanced distributions over all the classes present in the dataset. This is seen as the problem of missing modes in the generated distribution. A version of the missing modes problem, known as the 'covariate shift' problem, was studied by Santurkar et al. (2018). One possible reason may be the absence of knowledge about the class distribution P(Y|X)¹ of the generated samples during training. Conditional GANs, on the other hand, do not suffer from this issue since the class label Y is supplied to the GAN during training. However, it has recently been found by Ravuri & Vinyals (2019) that despite doing well on metrics such as Inception Score (IS) (Salimans et al., 2016) and Fréchet Inception Distance (FID) (Heusel et al., 2017), the samples generated by state-of-the-art conditional GANs lack diversity in comparison to the underlying training datasets. Further, we observe that although conditional GANs work well in the balanced case, they suffer performance degradation in the imbalanced case.
To address these shortcomings, we propose a method orthogonal to label conditioning that induces information about the class distribution P(Y|X) of generated samples into the GAN framework using a pre-trained classifier. Concretely, we track the class distribution of the samples produced by the GAN with the pre-trained classifier. The regularizer utilizes this class distribution to penalize excessive generation of samples from the majority classes, thus enforcing
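The tracking-and-penalizing idea above can be illustrated with a minimal sketch. This is not the paper's actual regularizer; it only assumes that (a) an "effective" class frequency is maintained with exponential forgetting of older batches, and (b) generated samples whose predicted classes are already over-represented incur a larger penalty. The function names and the decay parameter are hypothetical.

```python
import numpy as np

def update_effective_frequency(n_eff, pred_probs, decay=0.99):
    """Exponentially-forgetting estimate of per-class frequency.

    n_eff:      (C,) current effective class-frequency estimate
    pred_probs: (B, C) softmax outputs of a pre-trained classifier
                on a batch of generated images
    """
    batch_freq = pred_probs.mean(axis=0)  # avg classifier confidence per class
    return decay * n_eff + (1.0 - decay) * batch_freq

def class_balancing_penalty(n_eff, pred_probs, eps=1e-8):
    """Penalty that grows when generated samples fall in classes the
    tracker already considers over-represented."""
    weights = n_eff / (n_eff.sum() + eps)  # normalized effective frequency
    # each sample is weighted by the tracked frequency of its predicted class
    return float((pred_probs * weights).sum(axis=1).mean())
```

Adding such a penalty to the generator loss makes concentrating on majority classes costly, nudging the generator toward the under-represented ones.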



¹Here Y represents labels and X represents data.

