RETHINKING THE EFFECT OF DATA AUGMENTATION IN ADVERSARIAL CONTRASTIVE LEARNING

Abstract

Recent works have shown that self-supervised learning can achieve remarkable robustness when integrated with adversarial training (AT). However, the robustness gap between supervised AT (sup-AT) and self-supervised AT (self-AT) remains significant. Motivated by this observation, we revisit existing self-AT methods and discover an inherent dilemma that affects self-AT robustness: either too strong or too weak data augmentation is harmful to self-AT, and a medium strength is insufficient to bridge the gap. To resolve this dilemma, we propose a simple remedy named DYNACL (Dynamic Adversarial Contrastive Learning). In particular, we propose an augmentation schedule that gradually anneals from a strong augmentation to a weak one, so as to benefit from both extreme cases. In addition, we adopt a fast post-processing stage to adapt the learned model to downstream tasks. Through extensive experiments, we show that DYNACL improves state-of-the-art self-AT robustness by 8.84% under Auto-Attack on the CIFAR-10 dataset, and can even outperform vanilla supervised adversarial training for the first time.
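The core idea of the annealing schedule described above can be sketched as follows. This is an illustrative example only, not the paper's exact schedule: the names `aug_strength` and `jitter_magnitude`, the linear decay form, and the base magnitude are all assumptions made for concreteness.

```python
# Illustrative sketch (assumed linear annealing, not the paper's exact
# schedule): augmentation strength decays from 1.0 (strong) to 0.0 (weak)
# over the course of training.
def aug_strength(epoch, total_epochs):
    """Return the augmentation strength in [0, 1] for a given epoch."""
    return max(0.0, 1.0 - epoch / total_epochs)

# The strength can then scale the magnitude of each stochastic transform,
# e.g. the color-jitter intensity used by a contrastive augmentation
# pipeline (base value 0.8 is a hypothetical default).
def jitter_magnitude(epoch, total_epochs, base=0.8):
    return base * aug_strength(epoch, total_epochs)
```

In practice the resulting magnitude would be plugged into the data-augmentation pipeline (e.g. the color-jitter and crop-scale parameters) at the start of each epoch, so early epochs see strong views and late epochs see nearly clean ones.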

1. INTRODUCTION

Learning low-dimensional representations of inputs without supervision is one of the ultimate goals of machine learning. As a promising approach, self-supervised learning is rapidly closing the performance gap with supervised learning (He et al., 2016; Chen et al., 2020b) on downstream tasks. However, for both supervised and self-supervised models, adversarial vulnerability remains a widely concerned security issue: natural inputs injected with small, human-imperceptible adversarial perturbations can fool deep neural networks (DNNs) into making wrong predictions (Goodfellow et al., 2014).

In supervised learning, the most effective approach to enhancing adversarial robustness is adversarial training (sup-AT), which learns DNNs with adversarial examples (Madry et al., 2017; Wang et al., 2019; Zhang et al., 2019; Wang et al., 2020; Wang & Wang, 2022). However, sup-AT requires ground-truth labels to craft adversarial examples. In self-supervised learning, recent works including RoCL (Kim et al., 2020), ACL (Jiang et al., 2020), and AdvCL (Fan et al., 2021) have explored adversarial training counterparts (self-AT). Despite obtaining a certain degree of robustness, these methods still leave a very large performance gap between sup-AT and self-AT. As shown in Figure 1(a), sup-AT obtains 46.2% robust accuracy on CIFAR-10 while the state-of-the-art self-AT method only gets 37.6%, a gap of more than 8%. As a reference, under standard training (ST) with clean examples, the gap in classification accuracy between sup-ST and self-ST is much smaller (below 1% on CIFAR-10; see da Costa et al. (2022)).

This phenomenon leads to the following question: what is the key factor that prevents self-AT from obtaining robustness comparable to sup-AT? To answer this question, we need to examine the real difference between sup-AT and self-AT. Since they share the same minimax training scheme, the difference mainly lies in the learning objective.
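To make the inner maximization of the minimax scheme concrete, the following is a minimal sketch of an L-infinity PGD attack (Madry et al., 2017) on a toy binary linear model. It is illustrative only: real adversarial training attacks a DNN under a cross-entropy loss, whereas here the model, the loss `-y * w.x`, and the function name `pgd_attack` are simplifying assumptions chosen so the gradient has a closed form.

```python
import numpy as np

# Minimal PGD sketch on a toy linear classifier (illustrative only; the
# model and loss are assumptions, not the method of any cited paper).
def pgd_attack(x, y, w, eps=0.03, alpha=0.01, steps=10):
    """L-inf PGD maximizing the loss -y * w.x of a binary linear model.

    x: clean input, y: label in {-1, +1}, w: model weights.
    """
    x_adv = x.copy()
    for _ in range(steps):
        # Gradient of the loss -y * w.x with respect to x is -y * w.
        grad = -y * w
        x_adv = x_adv + alpha * np.sign(grad)      # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project onto eps-ball
    return x_adv
```

The outer minimization of adversarial training would then update the model parameters on these perturbed inputs; sup-AT needs the label `y` to craft the perturbation, which is exactly what self-AT must do without.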


