EMPIRICAL STUDIES ON THE CONVERGENCE OF FEATURE SPACES IN DEEP LEARNING

Anonymous

Abstract

While deep learning is effective at learning features/representations from data, the distributions of samples in the feature spaces learned by various architectures for different training tasks, e.g., the latent layers of Autoencoders (AEs) and the feature vectors of Convolutional Neural Network (CNN) classifiers, have not been well studied or compared. We hypothesize that the feature spaces of networks trained with various architectures (AEs or CNNs) and tasks (supervised, unsupervised, or self-supervised learning) share some common subspaces, regardless of the type of architecture or whether labels have been used in feature learning. To test this hypothesis, through Singular Value Decomposition (SVD) of feature vectors, we demonstrate that one can linearly project the feature vectors of the same group of samples to a similar distribution, where the distribution is represented by the top left singular vector (i.e., the principal subspace of the feature vectors), namely the P-vector. We further assess the convergence of feature space learning using the angles between the P-vector of the well-trained model and those of its per-epoch checkpoints during training, where a quasi-monotonic converging trend from nearly orthogonal to small angles (e.g., 10°) has been observed. Finally, we carry out case studies connecting P-vectors to the data distribution and to generalization performance. Extensive experiments with practically used Multi-Layer Perceptron (MLP), AE, and CNN architectures for classification, image reconstruction, and self-supervised learning tasks on MNIST, CIFAR-10, and CIFAR-100 support our claims with solid evidence.
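The P-vector construction and the angle-based convergence measure described above can be sketched in a few lines of numpy. This is a minimal illustration under our own assumptions: feature vectors are stacked row-wise into an (n_samples × d) matrix, and the function names (`p_vector`, `p_angle_degrees`) are ours, not from the paper; singular vectors are only defined up to sign, so the angle uses the absolute cosine.

```python
import numpy as np

def p_vector(features):
    # features: (n_samples, d) matrix, one feature vector per row,
    # for a fixed group of samples. The P-vector is the top left
    # singular vector, i.e. the leading direction of the principal
    # subspace over samples (length n_samples, unit norm).
    U, _, _ = np.linalg.svd(features, full_matrices=False)
    return U[:, 0]

def p_angle_degrees(p1, p2):
    # Angle between two P-vectors computed from the SAME sample group
    # (e.g., two models, or a checkpoint vs. the well-trained model).
    # abs() removes the sign ambiguity of singular vectors.
    c = abs(np.dot(p1, p2)) / (np.linalg.norm(p1) * np.linalg.norm(p2))
    return np.degrees(np.arccos(np.clip(c, 0.0, 1.0)))

# Toy demonstration with synthetic "features" for the same samples
# seen through two hypothetical models (a random linear map stands in
# for a second architecture).
rng = np.random.default_rng(0)
n, d = 512, 64
feats_a = rng.standard_normal((n, d))
feats_b = feats_a @ rng.standard_normal((d, d))

angle_self = p_angle_degrees(p_vector(feats_a), p_vector(feats_a))
angle_cross = p_angle_degrees(p_vector(feats_a), p_vector(feats_b))
```

By this construction, the angle between a P-vector and itself is 0°, and any pair of P-vectors over the same samples yields an angle in [0°, 90°]; tracking this angle between per-epoch checkpoints and the final model gives the convergence curves discussed in the abstract.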

1. INTRODUCTION

Blessed with the capacity for feature learning, deep neural networks (DNNs) (LeCun et al., 2015) have been widely used to perform learning tasks ranging from classification to generation (Goodfellow et al., 2014; Radford et al., 2015), in various settings (e.g., supervised, unsupervised, and self-supervised learning). To better analyze the features learned by deep models, numerous works have studied the interpretation of the feature spaces of well-trained models (Simonyan et al., 2013; White, 2016; Zhu et al., 2016; Bau et al., 2017; 2019; Jahanian et al., 2020; Zhang & Wu, 2020).

Invariance beyond the use of architectures and labels. While existing studies primarily focus on interpreting a given model by discovering mappings from its feature space to its outputs (e.g., classification (Bau et al., 2017) and generation (Jahanian et al., 2020)), few works compare the feature spaces learned by deep models of varying architectures (e.g., MLP/CNN classifiers versus Autoencoders) under different learning paradigms (Chen et al., 2020; Khosla et al., 2020; Spinner et al., 2018). More specifically, we are particularly interested in whether there exists certain "statistical invariance" in the feature space, regardless of the type of architecture or whether label information (e.g., supervised vs. unsupervised vs. self-supervised (Chen et al., 2020) learning) is used in feature learning on the same training dataset.

Hypotheses. It is not difficult to imagine that the feature spaces of well-trained DNN classifiers in the supervised learning setting might share some linear subspace (Vaswani et al., 2018). When models are well fitted to the same training set, the feature vectors of training samples should be projected onto the ground-truth labels after a Fully-Connected Layer (i.e., a linear transform), while such a linear subspace is expected to distribute samples in a discriminative manner. We doubt that such

