AUTOJOIN: EFFICIENT ADVERSARIAL TRAINING FOR ROBUST MANEUVERING VIA DENOISING AUTOENCODER AND JOINT LEARNING

Abstract

As a result of increasingly adopted machine learning algorithms and ubiquitous sensors, many 'perception-to-control' systems are being developed and deployed. For these systems to be trustworthy, we need to improve their robustness, with adversarial training being one approach. We propose a gradient-free adversarial training technique, called AutoJoin, which is a simple yet effective and efficient approach to producing robust models for image-based maneuvering. Compared to other SOTA methods, with testing on over 5M perturbed and clean images, AutoJoin achieves significant performance increases of up to 40% under gradient-free perturbations while improving clean performance by up to 300%. Regarding efficiency, AutoJoin demonstrates strong advantages over other SOTA techniques, saving up to 83% time per training epoch and 90% training data. Although not its focus, AutoJoin even demonstrates a superb ability to defend against gradient-based attacks. The core idea of AutoJoin is to attach a decoder to the original regression model, creating a denoising autoencoder within the architecture. This architecture allows the tasks of 'maneuvering' and 'denoising sensor input' to be jointly learnt and to reinforce each other's performance.

1. INTRODUCTION

The wide adoption of machine learning algorithms and ubiquitous sensors has resulted in numerous tightly-coupled 'perception-to-control' systems being deployed in the wild. In order for these systems to be trustworthy, robustness is an integral characteristic to be considered in addition to their effectiveness. Adversarial training aims to increase the robustness of machine learning models by exposing them to perturbations that arise from artificial attacks (Goodfellow et al., 2014; Madry et al., 2017) or natural disturbances (Shen et al., 2021). In this work, we focus on the impact of these perturbations on image-based maneuvering and the design of efficient adversarial training for obtaining robust models. The test task is 'maneuvering through a front-facing camera,' which represents one of the hardest perception-to-control tasks since the input images are taken from partially observable, nondeterministic, dynamic, and continuous environments. Inspired by the finding that model robustness can be improved through learning with simulated perturbations (Bhagoji et al., 2018), effective techniques such as AugMix (Hendrycks et al., 2019b), AugMax (Wang et al., 2021), MaxUp (Gong et al., 2021), and AdvBN (Shu et al., 2020) have been introduced for language modeling, and for image-based classification and segmentation. The focus of these studies, however, is not efficient adversarial training for robust maneuvering. AugMix is less effective against gradient-based adversarial attacks due to the lack of sufficiently intense augmentations; AugMax, which builds on AugMix, is less efficient because it uses a gradient-based adversarial training procedure, which is also a limitation of AdvBN. MaxUp requires multiple forward passes of a single data point to determine the most harmful perturbation, which increases computational cost and time proportionally to the number of extra passes. Shen et al.
(2021) represents the SOTA gradient-free adversarial training method for achieving robust maneuvering against image perturbations. Their technique adopts the Fréchet Inception Distance (FID) (Heusel et al., 2017) to first determine distinct intensity levels of the perturbations that minimize model performance. Afterwards, datasets of single perturbations are generated. Before each round of training, the dataset that most reduces model performance is selected and incorporated with the clean dataset for training. A fine-tuning step is also introduced to boost model performance on clean images. While effective, examining the perturbation parameter space via FID adds complexity to the approach, and using distinct intensity levels limits the model's generalizability and hence its robustness. The approach also requires the generation of many datasets (in total 2.1M images) prior to training, burdening computation and storage. Additional inefficiency and algorithmic complexity occur during training, as the pre-round selection of datasets requires testing against perturbed datasets, resulting in a large amount of data passing through the model.


We aim to develop a gradient-free, efficient adversarial training technique for robust maneuvering. Fig. 1 illustrates our effective and algorithmically simple approach, AutoJoin, in which we divide a steering angle prediction model into an encoder and a regression head. A decoder is attached to the encoder to form a denoising autoencoder (DAE). The motivation for using the DAE alongside the prediction model is the assumption that prediction on clean data should be easier than on perturbed data. The DAE and the prediction model are jointly learnt: when perturbed images are forward passed, the reconstruction loss is added to the regression loss, enabling the encoder to simultaneously improve on 'maneuvering' and 'denoising sensor input.' AutoJoin enjoys efficiency because the additional computational cost stems only from passing the intermediate features through the decoder. Algorithmic complexity is kept low, as perturbations are randomly sampled within a moving range that is determined by linear curriculum learning (Bengio et al., 2009). FID is used only minimally, to determine the maximum intensity value of each perturbation. Model generalizability and robustness are improved because more of the perturbation parameter space is explored and because 'denoising sensor input' provides denoised training data for 'maneuvering.' We have tested AutoJoin on four real-world driving datasets, namely Honda (Ramanishka et al., 2018), Waymo (Sun et al., 2020), Audi (Geyer et al., 2020), and SullyChen (Chen, 2017), totaling over 5M clean and perturbed images. The results show that AutoJoin achieves the best performance on the maneuvering task while being the most efficient. For example, AutoJoin achieves 3x the improvement on clean data over Shen et al. (2021).
AutoJoin also outperforms them by up to 20% in accuracy and 43% in error reduction when using the Nvidia (Bojarski et al., 2016) backbone, and achieves up to 44% error reduction compared to other adversarial training techniques when using the ResNet-50 (He et al., 2016) backbone. AutoJoin is also very efficient: it saves 21% per-epoch time compared to the next fastest technique, AugMix (Hendrycks et al., 2019b), and saves 83% per-epoch time and 90% training data compared to the method by Shen et al. (2021). Although not its focus, AutoJoin also demonstrates a superb ability to defend against gradient-based attacks, outperforming every other approach tested. We hope the results and design of AutoJoin will assist the robustness development of other perception-to-control applications, especially considering that similar supervised learning tasks are likely to be ubiquitous in the autonomous driving industry and that machine learning models are vulnerable to various perturbations on sensor input.
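The joint objective described above can be sketched as follows. This is a minimal illustration, not the paper's exact configuration: the layer shapes, the reconstruction-loss weight `recon_weight`, and the linear-curriculum schedule in `sample_intensity` are all illustrative assumptions.

```python
import random

import torch
import torch.nn as nn
import torch.nn.functional as F

class AutoJoinNet(nn.Module):
    """Sketch of the AutoJoin architecture: a steering-angle regressor
    whose encoder is shared with a decoder, forming a denoising
    autoencoder (DAE). Layer sizes here are illustrative."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(  # reconstructs the clean image
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )
        self.head = nn.Sequential(  # regression head: steering angle
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.head(z), self.decoder(z)

def sample_intensity(epoch, total_epochs, max_intensity):
    """Linear curriculum: the upper bound of the sampling range grows
    linearly with training progress; `max_intensity` is the per-perturbation
    cap (determined via FID in the paper)."""
    bound = max_intensity * min(1.0, (epoch + 1) / total_epochs)
    return random.uniform(0.0, bound)

def joint_loss(model, perturbed, clean, angle, recon_weight=1.0):
    """Regression loss on the steering angle plus reconstruction loss
    against the clean image; both gradients flow through the shared
    encoder, so the two tasks reinforce each other."""
    pred, recon = model(perturbed)
    return F.mse_loss(pred, angle) + recon_weight * F.mse_loss(recon, clean)
```

Because only the intermediate features pass through the decoder, the extra cost over the plain regression model is a single decoder forward pass per batch, which is where the claimed efficiency comes from.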

2. RELATED WORK

Next, we introduce techniques for improving model robustness against simulated image perturbations, as well as studies that use a denoising autoencoder (DAE) to improve model robustness on the driving task. So far, most adversarial training techniques against image perturbations have focused on image classification. To list some examples: AugMix (Hendrycks et al., 2019b) enhances model robustness and generalizability by layering randomly sampled augmentations together. AugMax (Wang et al., 2021), a derivative of AugMix, trains on AugMix-generated images and their gradient-based adversarial variants. MaxUp (Gong et al., 2021) stochastically generates multiple augmented images of a single image and trains the model on the perturbed image that minimizes the model's performance; as a result, MaxUp requires multiple passes of a data point through the model to determine the most harmful perturbation. AdvBN (Shu et al., 2020) is a gradient-based adversarial training technique that switches between batch normalization layers depending on whether the training data is clean or perturbed; it achieves SOTA performance when combined with techniques such as AugMix on ImageNet-C (Hendrycks & Dietterich, 2019).
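To make the layered-augmentation idea behind AugMix concrete, here is a minimal sketch of its mixing recipe. The general structure (Dirichlet-weighted augmentation chains blended with the original via a Beta-distributed weight) follows Hendrycks et al. (2019b), but the default `width`, `depth`, and `alpha` values are illustrative, and plain floats stand in for images so the sketch stays dependency-free.

```python
import random

def _dirichlet(k, alpha):
    # Dirichlet sample via normalized Gamma draws.
    xs = [random.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(xs)
    return [x / total for x in xs]

def augmix(image, ops, width=3, depth=2, alpha=1.0):
    """AugMix-style mixing: `width` augmentation chains (each a random
    composition of up to `depth` ops) are combined with Dirichlet
    weights, then blended with the original image using a
    Beta-distributed weight. `ops` is a list of functions image -> image."""
    weights = _dirichlet(width, alpha)
    mixed = 0.0
    for w in weights:
        chain = image
        for _ in range(random.randint(1, depth)):
            chain = random.choice(ops)(chain)  # compose random augmentations
        mixed += w * chain
    m = random.betavariate(alpha, alpha)
    return m * image + (1.0 - m) * mixed
```

In real use, `image` would be a pixel array and `ops` would be augmentations such as rotation, posterize, or shear; the key property is that the mixed result stays close to the data manifold while still being diverse.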

