PARAMETRIC UMAP: LEARNING EMBEDDINGS WITH DEEP NEURAL NETWORKS FOR REPRESENTATION AND SEMI-SUPERVISED LEARNING

Abstract

We propose Parametric UMAP, a parametric variation of the UMAP (Uniform Manifold Approximation and Projection) algorithm. UMAP is a non-parametric graph-based dimensionality reduction algorithm that uses applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) compute a graphical representation of a dataset (a fuzzy simplicial complex), and (2) through stochastic gradient descent, optimize a low-dimensional embedding of the graph. Here, we replace the second step of UMAP with a deep neural network that learns a parametric relationship between data and embedding. We demonstrate that our method performs similarly to its non-parametric counterpart while conferring the benefit of a learned parametric mapping (e.g. fast online embeddings for new data). We then show that the UMAP loss can be extended to arbitrary deep learning applications, for example constraining the latent distribution of autoencoders, and improving classifier accuracy for semi-supervised learning by capturing structure in unlabeled data.



Current non-linear dimensionality reduction algorithms can be divided broadly into non-parametric algorithms, which rely on the efficient computation of probabilistic relationships from neighborhood graphs to extract structure in large datasets (e.g. UMAP (McInnes et al., 2018), t-SNE (van der Maaten & Hinton, 2008), LargeVis (Tang et al., 2016)), and parametric algorithms, which, driven by advances in deep learning, optimize an objective function related to capturing structure in a dataset over neural network weights (e.g. Hinton & Salakhutdinov 2006; Ding et al. 2018; Ding & Regev 2019; Szubert et al. 2019; Kingma & Welling 2013). The goal of this paper is to wed those two classes of methods: learning a structured graphical representation of the data and using a deep neural network to embed that graph. Over the past decade, several variants of the t-SNE algorithm have proposed parameterized forms of t-SNE (Van Der Maaten, 2009; Gisbrecht et al., 2015; Bunte et al., 2012; Gisbrecht et al., 2012). In particular, Parametric t-SNE (Van Der Maaten, 2009) performs exactly that wedding, training a deep neural network to minimize loss over a t-SNE graph. However, the t-SNE loss function itself is not well suited to optimization over deep neural networks using contemporary training schemes. In particular, t-SNE's optimization requires normalization over the entire dataset at each step, making batch-based optimization and online learning on large datasets difficult. In contrast, UMAP is optimized using negative sampling (Mikolov et al., 2013; Tang et al., 2016) and requires no normalization step, making it better suited to deep learning applications. Our proposed method, Parametric UMAP, brings the non-parametric graph-based dimensionality reduction algorithm UMAP into an emerging class of parametric, topologically-inspired embedding algorithms (reviewed in A.5).
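To make the batching argument concrete, the following is a minimal sketch of a per-batch UMAP-style cross-entropy loss with negative sampling, written in plain numpy. The function name and the specific values of the curve parameters a and b are illustrative assumptions (in the UMAP library, a and b are fit from the min_dist hyperparameter); the point is that every term depends only on the sampled batch, with no dataset-wide normalization as in t-SNE.

```python
import numpy as np


def umap_batch_loss(z_i, z_j, z_neg, a=1.577, b=0.895, eps=1e-4):
    """Illustrative per-batch UMAP cross-entropy loss with negative sampling.

    z_i, z_j : (batch, dim) embeddings of the two ends of sampled positive edges
    z_neg    : (batch, n_neg, dim) embeddings of negative samples for each edge
    a, b     : similarity-curve parameters (illustrative defaults, not the
               library's fitted values)
    """
    def q(d2):
        # Low-dimensional similarity: q = 1 / (1 + a * d^(2b))
        return 1.0 / (1.0 + a * d2 ** b)

    d2_pos = np.sum((z_i - z_j) ** 2, axis=-1)               # (batch,)
    d2_neg = np.sum((z_i[:, None, :] - z_neg) ** 2, axis=-1)  # (batch, n_neg)

    attraction = -np.log(q(d2_pos) + eps)                 # pull edges together
    repulsion = -np.log(1.0 - q(d2_neg) + eps).sum(-1)    # push negatives apart
    return np.mean(attraction + repulsion)
```

Because the loss decomposes over sampled edges, it can be minimized over the weights of an encoder network with ordinary minibatch stochastic gradient descent, which is what makes the parametric extension straightforward.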
In the following section we broadly outline the algorithm underlying UMAP to explain why our proposed algorithm, Parametric UMAP, is particularly well suited to deep learning applications. We contextualize our discussion of UMAP with t-SNE, to outline the advantages that UMAP confers over t-SNE in the domain of parametric neural-network-based embedding. We then perform experiments

1 Google Colab walkthrough

