DYNAMIC BATCH NORM STATISTICS UPDATE FOR NATURAL ROBUSTNESS

Abstract

DNNs trained on clean natural samples have been shown to perform poorly on corrupted samples, such as noisy or blurry images. Various data augmentation methods have recently been proposed to improve DNNs' robustness against common corruptions. Despite their success, they require computationally expensive training and cannot be applied to off-the-shelf trained models. Recently, updating only the BatchNorm (BN) statistics of a model on a single corruption has been shown to significantly improve its accuracy on that corruption. However, this method loses its effectiveness at inference time when the type of corruption changes. In this paper, we harness the Fourier domain to detect the corruption type, a task that is challenging in the image domain. We propose a unified framework, consisting of a corruption-detection model and a BN statistics update, that can improve the corruption accuracy of any off-the-shelf trained model. We benchmark our framework on different models and datasets. Our results demonstrate about 8% and 4% accuracy improvement on CIFAR10-C and ImageNet-C, respectively. Furthermore, our framework can further improve the accuracy of state-of-the-art robust models, such as AugMix and DeepAug.

1. INTRODUCTION

Deep neural networks (DNNs) have been successfully applied to various vision tasks in recent years. At inference time, DNNs generally perform well on data points sampled from the same distribution as the training data. However, they often perform poorly on data points from a different distribution, including corrupted data, such as noisy or blurred images. These corruptions often appear naturally at inference time in many real-world applications, such as cameras in autonomous cars, x-ray images, etc. Not only does DNNs' accuracy drop under shifts in the data distribution, but the well-known overconfidence problem of DNNs also impedes the detection of domain shift.

One straightforward approach to improving robustness against various corruptions is to augment the training data to cover them. Recently, many more advanced data augmentation schemes have been proposed and shown to improve model robustness on corrupted data, such as SIN Geirhos et al. Two recent works (Benz et al., 2021; Schneider et al., 2020) proposed a simple batch normalization (BN) statistics update to improve the robustness of a pre-trained model against various corruptions with minimal computational overhead. The idea is to update only the BN statistics of a pre-trained model on a target corruption. If the corruption type is unknown beforehand, the model can keep updating its BNs at inference time to adapt to the ongoing corruption. Despite its effectiveness, this approach is only suitable when a constant flow of inputs with the same type of corruption is fed to the model, so that it can adjust the BN statistics accordingly.

In this work, we first investigate how complex the corruption type detection task itself would be. Although corruption type detection is challenging in the image domain, employing the Fourier domain makes it much more manageable, because each corruption has a relatively unique frequency profile.
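To illustrate why the frequency domain separates corruptions more easily than the pixel domain, the following NumPy sketch contrasts the high-frequency energy of a smooth image with that of its Gaussian-noise-corrupted version. The function names, the log-amplitude normalization, and the disc radius are illustrative assumptions, not the exact recipe used in the paper.

```python
import numpy as np

def fourier_feature(img):
    """Centered log-amplitude spectrum, standardized per image (sketch)."""
    spec = np.fft.fftshift(np.fft.fft2(img))        # move DC to the center
    amp = np.log1p(np.abs(spec))                    # compress dynamic range
    return (amp - amp.mean()) / (amp.std() + 1e-8)  # per-image standardization

def high_freq_energy(img, radius=8):
    """Mean log-amplitude outside a low-frequency disc of the given radius."""
    amp = np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(img))))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 > radius ** 2
    return amp[mask].mean()

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 32)
clean = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x))  # smooth image
noisy = clean + 0.5 * rng.standard_normal(clean.shape)          # Gaussian noise

# Noise injects broadband energy, so its high-frequency profile stands out.
assert high_freq_energy(noisy) > high_freq_energy(clean)
```

A blur corruption would instead suppress the high-frequency band, giving the opposite signature; such per-corruption profiles are what a simple classifier can learn from the normalized spectrum.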
We show that a very simple DNN can detect corruption types reasonably well when fed with a specifically normalized frequency spectrum. Given the ability to detect corruption types in the Fourier domain, we adapt the BN statistics update method so that it changes the BN values dynamically based on the detected corruption type. The overall architecture of our approach is depicted in Figure 1. First, we calculate the Fourier transform of the input image and, after applying a specifically designed normalization, feed it to the corruption type detection DNN. Based on the detected corruption, we fetch the corresponding BN statistics from the BN statistics lookup table, and the BNs of the pre-trained network are updated accordingly. Finally, the dynamically updated pre-trained network processes the original input image. In summary, our contributions are as follows:

• We harness the frequency spectrum of an image to identify the corruption type. On ImageNet-C, a shallow 3-layer fully connected neural network can identify 16 different corruption types with 65.88% accuracy. The majority of the misclassifications occur between similar corruptions, such as different types of noise, for which the BN statistics updates are similar nevertheless.

• Our framework can be used on any off-the-shelf pre-trained model, even robustly trained models, such as AugMix Hendrycks et al. (2019) and DeepAug Hendrycks et al. (2021), and further improves their robustness.

• We demonstrate that updating BN statistics at inference time, as suggested in (Benz et al., 2021; Schneider et al., 2020), does not achieve good performance when the corruption type does not stay the same for a long time. In contrast, our framework is insensitive to the rate of corruption changes and outperforms these methods when dealing with dynamic corruption changes.

In (Benz et al., 2021; Schneider et al., 2020), a simple BN statistics update significantly improved the natural robustness of trained DNNs.
Figure 2 shows the effectiveness of their approach on various corruption types. The drawback of their approach is that the BN statistics obtained for one type of corruption often significantly degrade the accuracy on other types of corruption, except for similar corruptions, such as different types of noise. The authors claim that in many applications, such as autonomous vehicles, the corruption type will remain the same for a considerable amount of time.
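The BN statistics update described above can be sketched in a few lines: only the per-channel running mean and variance are re-estimated from corrupted batches, exactly as a BN layer would do in training mode, while the model weights stay untouched. This is a minimal NumPy illustration of the idea; the momentum value and the (N, C) activation shape are simplifying assumptions.

```python
import numpy as np

def update_bn_stats(batches, momentum=0.1):
    """Re-estimate BN running mean/var from corrupted batches of shape (N, C).

    Sketch of the BN-statistics update idea: the network weights are frozen,
    and only the running statistics are adapted via an exponential moving
    average, as a BN layer does in training mode.
    """
    mean, var = None, None
    for x in batches:
        b_mean, b_var = x.mean(axis=0), x.var(axis=0)
        if mean is None:
            mean, var = b_mean, b_var
        else:
            mean = (1 - momentum) * mean + momentum * b_mean
            var = (1 - momentum) * var + momentum * b_var
    return mean, var

rng = np.random.default_rng(1)
# A corruption shifts the activation distribution (mean +2, std x3) per channel.
corrupted = [2.0 + 3.0 * rng.standard_normal((64, 8)) for _ in range(50)]
mean, var = update_bn_stats(corrupted)

normalized = (corrupted[0] - mean) / np.sqrt(var + 1e-5)
# With the updated stats, the corrupted activations are roughly re-centered.
assert abs(float(normalized.mean())) < 0.2
assert abs(float(normalized.std()) - 1.0) < 0.2
```

Statistics estimated this way for one corruption would mis-normalize activations drawn from a different corruption, which is the drawback discussed above.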

Consequently, the BN statistics can be updated at inference time. However, neither of those papers considers the setting in which the corruption type changes dynamically at inference time.

Various advanced data augmentation schemes have been shown to improve corruption robustness, such as SIN Geirhos et al. (2018a), ANT Rusak et al. (2020a), AugMix Hendrycks et al. (2019), and DeepAug Hendrycks et al. (2021). Despite their effectiveness, these approaches require a computationally expensive training or re-training process.

2. METHOD

OVERALL FRAMEWORK

The overview of our framework is depicted in Figure 1. It consists of three main modules: A) a pre-trained model on the original task, such as object detection; B) a DNN trained to detect the corruption type; and C) a lookup table storing the BN statistics corresponding to each type of corruption. This paper mainly focuses on improving the natural robustness of trained DNNs. However, the framework can easily be extended to domain generalization and to circumstances where the lookup table updates the entire model weights or even the model architecture itself.
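The lookup-table module (C) can be sketched as a mapping from detected corruption type to stored BN statistics, which are then used to normalize activations at inference time. The class and method names below are illustrative; a real implementation would hold the running mean/var of every BN layer in the network rather than a single pair.

```python
import numpy as np

class BNLookup:
    """Corruption type -> BN statistics lookup (sketch of module C)."""

    def __init__(self):
        self.table = {}

    def store(self, corruption, mean, var):
        """Register per-channel statistics estimated for one corruption type."""
        self.table[corruption] = (mean, var)

    def apply(self, corruption, x, eps=1e-5):
        """Normalize activations x with the stats stored for `corruption`."""
        mean, var = self.table[corruption]
        return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(2)
lookup = BNLookup()
# Hypothetical pre-computed stats for two entries of the table.
lookup.store("gaussian_noise", mean=np.full(4, 2.0), var=np.full(4, 9.0))
lookup.store("clean", mean=np.zeros(4), var=np.ones(4))

x = 2.0 + 3.0 * rng.standard_normal((128, 4))  # "gaussian_noise" activations
out = lookup.apply("gaussian_noise", x)        # stats chosen by the detector
assert abs(float(out.mean())) < 0.3            # roughly re-centered
```

At inference time, the corruption-detection DNN (module B) selects the key, so the pre-trained model (module A) always runs with statistics matched to the current corruption.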

Figure 1: Overall Framework

