UNBIASED SUPERVISED CONTRASTIVE LEARNING

Abstract

Many datasets are biased, that is, they contain easy-to-learn features that are highly correlated with the target class only in the dataset but not in the true underlying distribution of the data. For this reason, learning unbiased models from biased data has become an important research topic in recent years. In this work, we tackle the problem of learning representations that are robust to biases. We first present a margin-based theoretical framework that allows us to clarify why recent contrastive losses (InfoNCE, SupCon, etc.) can fail when dealing with biased data. Based on that, we derive a novel formulation of the supervised contrastive loss (ϵ-SupInfoNCE), providing more accurate control of the minimal distance between positive and negative samples. Furthermore, thanks to our theoretical framework, we also propose FairKL, a new debiasing regularization loss that works well even with extremely biased data. We validate the proposed losses on standard vision datasets including CIFAR10, CIFAR100, and ImageNet, and we assess the debiasing capability of FairKL with ϵ-SupInfoNCE, reaching state-of-the-art performance on a number of biased datasets, including real instances of biases "in the wild".

1. INTRODUCTION

Deep learning models have become the predominant tool for learning representations suited for a variety of tasks. Arguably, the most common setup for training deep neural networks in supervised classification tasks consists of minimizing the cross-entropy loss. Cross-entropy drives the model towards learning the correct label distribution for a given sample. However, many works have shown that this loss can be affected by biases in the data (Alvi et al., 2018; Kim et al., 2019; Nam et al., 2020; Sagawa et al., 2019; Tartaglione et al., 2021; Torralba et al., 2011) or suffer from noise and corruption in the labels (Elsayed et al., 2018; Graf et al., 2021). In fact, in recent years, it has become increasingly evident that neural networks tend to rely on simple patterns in the data (Geirhos et al., 2019; Li et al., 2021). As deep neural networks grow in size and complexity, guaranteeing that they do not learn spurious elements in the training set is becoming a pressing issue. It is indeed known that most commonly used datasets are biased (Torralba et al., 2011) and that this affects the learned models (Tommasi et al., 2017). In particular, when the biases correlate very well with the target task, it is hard to obtain predictions that are independent of the biases. This can happen, e.g., in the presence of selection biases in the data. Furthermore, if the bias is easy to learn (e.g., a simple pattern or color), we will most likely obtain a biased model, whose predictions rely mainly on these spurious attributes and not on the true, generalizable, and discriminative features. Learning fair and robust representations of the underlying samples, especially when dealing with highly biased data, is the main objective of this work. Contrastive learning has recently gained attention for this purpose, showing superior robustness to cross-entropy (Graf et al., 2021).
For this reason, in this work, we adopt a metric learning approach for supervised representation learning. Based on that, we provide a unified framework to analyze and compare existing formulations of contrastive losses such as the InfoNCE loss (Chen et al., 2020; Oord et al., 

