SUPPRESSING OUTLIER RECONSTRUCTION IN AUTOENCODERS FOR OUT-OF-DISTRIBUTION DETECTION

Anonymous

Abstract

While only trained to reconstruct training data, autoencoders may produce high-quality reconstructions of inputs that are well outside the training data distribution. This phenomenon, which we refer to as outlier reconstruction, has a detrimental effect on the use of autoencoders for outlier detection, as an autoencoder will misclassify a clear outlier as being in-distribution. In this paper, we introduce the Energy-Based Autoencoder (EBAE), an autoencoder that is considerably less susceptible to outlier reconstruction. The core idea of EBAE is to treat the reconstruction error as an energy function of a normalized density and to strictly enforce the normalization constraint. We show that the reconstruction of non-training inputs can be suppressed, and the reconstruction error made highly discriminative to outliers, by enforcing this constraint. We empirically show that EBAE significantly outperforms both existing autoencoders and other generative models on several out-of-distribution detection tasks.

1. INTRODUCTION

An autoencoder (Rumelhart et al., 1986) is a neural network trained to reconstruct samples from a training data distribution. As the quality of reconstruction is expected to degrade for inputs that are significantly different from training data, autoencoders are widely used in outlier detection (Japkowicz et al., 1995), where an input with a large reconstruction error is classified as out-of-distribution (OOD). Such autoencoders for outlier detection have been applied in domains ranging from video surveillance (Zhao et al., 2017) to medical diagnosis (Lu & Xu, 2018).

Contrary to widely held belief, autoencoders are in fact capable of accurately reconstructing outliers, casting doubt on their reliability as outlier detectors. Lyudchik (2016) showed that an autoencoder trained on MNIST with the digit seven excluded can reconstruct an image of the excluded digit, and Tong et al. (2019) reported that an autoencoder trained on MNIST can reconstruct an image with all zero pixels. The reconstruction of outliers is also observed for non-image data (Zong et al., 2018). In this paper, we investigate this unexpected behavior of autoencoders, which we refer to as outlier reconstruction, more deeply. In the course of our investigation, we reproduce the findings of Lyudchik (2016) and Tong et al. (2019), and additionally discover other interesting cases (Figure 1). Our experiments suggest that outlier reconstruction is not a fortuitous artifact of stochastic training but is, in fact, a consequence of inductive biases inherent in an autoencoder. Outlier reconstruction should be suppressed in an autoencoder-based outlier detector, since a reconstructed outlier undermines the detector's performance by being mistaken for an inlier.
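The reconstruction-error criterion described above can be sketched with a toy example. The snippet below is an illustrative sketch only, not the architecture used in this paper: a linear autoencoder with a one-dimensional bottleneck (equivalent to PCA) is fit on in-distribution points lying near a line, and inputs whose reconstruction error exceeds a threshold `tau` (a hypothetical choice: the 99th percentile of training errors) are flagged as OOD.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "training distribution": points near a 1-D line embedded in 2-D.
X_train = rng.normal(size=(500, 1)) @ np.array([[2.0, 1.0]]) + 0.05 * rng.normal(size=(500, 2))

# A linear autoencoder with a 1-D bottleneck is equivalent to PCA:
# encoding projects onto the top principal direction, decoding maps back.
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
W = Vt[:1]  # shared (1, 2) encoder/decoder weights

def recon_error(x):
    z = (x - mean) @ W.T   # encode
    x_hat = z @ W + mean   # decode
    return np.sum((x - x_hat) ** 2, axis=-1)

# Decision threshold from training errors (illustrative: 99th percentile).
tau = np.quantile(recon_error(X_train), 0.99)

inlier = np.array([[2.0, 1.0]])    # on the training manifold: small error
outlier = np.array([[-1.0, 2.0]])  # off the manifold: large error
```

A nonlinear autoencoder follows the same recipe, with learned encoder and decoder networks in place of the projection; the failure mode this paper studies is precisely that such networks may assign small `recon_error` to points like `outlier`.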
Despite the long history of autoencoder research (Rumelhart et al., 1986; Bank et al., 2020), the outlier reconstruction phenomenon has only recently begun to receive attention (Lyudchik, 2016; Tong et al., 2019; Zong et al., 2018), with few works explicitly proposing solutions to the problem (Gong et al., 2019). Previous works focused on regularization techniques that prevent an autoencoder from becoming an identity mapping (and thus reconstructing all inputs). However, outlier reconstruction still occurs in popular regularized autoencoders, including denoising autoencoders (DAE; Vincent et al., 2008), variational autoencoders (VAE; Kingma & Welling, 2014), and Wasserstein autoencoders (WAE; Tolstikhin et al., 2017), as we shall show in our experiments (Table 1).


In this paper, we propose the Energy-Based Autoencoder (EBAE), an autoencoder in which the reconstruction of outliers is explicitly suppressed during training. In each training step of EBAE, "fake" samples with small reconstruction error are generated; these well-reconstructed fake samples serve as probes for potential reconstructed outliers. EBAE then maximizes the reconstruction errors of the generated samples while minimizing the reconstruction errors of "real" training samples. When the generated samples become indistinguishable from training data, the gradients from the fake and real samples balance, and training converges.

This training scheme arises naturally from defining a probability density for EBAE through its reconstruction error. The density of EBAE is given as p_θ(x) = exp(-E(x))/Ω, where E(x) is the reconstruction error of x and Ω is a normalization constant. Defining a density through a scalar function in this way is known as an energy-based model in the literature (Mnih & Hinton, 2005; Hinton et al., 2006), and E(x) is called the energy of the density. Maximizing likelihood under this formulation results in contrastive divergence learning (Hinton, 2002), which minimizes the energy of the training data while maximizing the energy of samples drawn from the model.

To generate samples with small reconstruction error during training, we use a novel sampling scheme specifically designed for EBAE. Our scheme is based on Langevin Monte Carlo but leverages the latent space of the autoencoder to generate diverse samples, which facilitates the training of EBAE. By setting the reconstruction error as the energy, EBAE unifies two major outlier detection criteria, large reconstruction error (Japkowicz et al., 1995) and low likelihood (Bishop, 1994), since the two are equivalent in EBAE. In general, the two criteria do not necessarily coincide in other methods, e.g., VAE or energy-based models (Zhai et al., 2016).
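The contrastive divergence learning rule invoked above can be illustrated on a toy energy-based model. The sketch below is not EBAE itself: a quadratic energy E(x) = (x - μ)²/2 over 1-D data stands in for the reconstruction error, and plain Langevin Monte Carlo (without the latent-space scheme proposed here) draws the "fake" samples. The parameter update lowers the energy of data and raises the energy of model samples, and converges when the two gradient terms balance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy energy-based model: E(x) = (x - mu)^2 / 2, so p(x) ∝ exp(-E(x)) is N(mu, 1).
# The scalar parameter mu plays the role of the autoencoder weights in EBAE.
mu = -3.0                                # deliberately poor initialization
data = rng.normal(2.0, 1.0, size=256)    # "real" training samples, true mean 2.0

def grad_E_wrt_mu(x, mu):
    return -(x - mu)  # dE/dmu for the quadratic energy

def langevin_sample(mu, n=256, steps=50, step_size=0.1):
    # Langevin Monte Carlo: x <- x - step * dE/dx + Gaussian noise.
    x = rng.normal(size=n)
    for _ in range(steps):
        x = x - step_size * (x - mu) + np.sqrt(2 * step_size) * rng.normal(size=n)
    return x

lr = 0.5
for _ in range(100):
    fakes = langevin_sample(mu)
    # Contrastive divergence: lower the energy of data, raise it on model samples.
    g = grad_E_wrt_mu(data, mu).mean() - grad_E_wrt_mu(fakes, mu).mean()
    mu -= lr * g
```

After training, mu approaches the data mean of 2.0: once the Langevin samples are distributed like the data, the two gradient terms cancel and the update vanishes, mirroring the convergence argument for EBAE.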
Recent studies show that likelihood-based outlier detectors built on deep generative models, such as auto-regressive or flow-based models, fail to correctly classify certain obvious outliers (Nalisnick et al., 2019; Hendrycks et al., 2019). EBAE, however, detects such outliers successfully while still using likelihood as its decision criterion.

The contributions of our paper can be summarized as follows:

• We report and investigate various cases of outlier reconstruction in autoencoders;
• We propose EBAE, an autoencoder significantly less prone to outlier reconstruction;
• We present a sampling method tailored for EBAE which efficiently generates diverse samples;
• We empirically show that EBAE is highly effective for outlier detection.

Section 2 provides a brief introduction to autoencoder-based outlier detection. In Section 3, we investigate outlier reconstruction in depth with illustrative examples. Section 4 describes EBAE. Related work is reviewed in Section 5. Section 6 presents experimental results. Section 7 concludes the paper.

2.1. PROBLEM SETTING

In this paper, we consider the outlier detection problem, which is also referred to as novelty detection, open set recognition, or OOD detection in the literature. The goal is to classify outliers from

