DeepPipe: DEEP, MODULAR AND EXTENDABLE REPRESENTATIONS OF MACHINE LEARNING PIPELINES

Abstract

Finding accurate Machine Learning pipelines is essential in achieving state-of-the-art AI predictive performance. Unfortunately, most existing Pipeline Optimization techniques rely on flavors of Bayesian Optimization that do not explore the deep interaction between pipeline stages/components (e.g. between the hyperparameters of the deployed preprocessing algorithm and the hyperparameters of a classifier). In this paper, we are the first to capture the deep interaction between components of a Machine Learning pipeline. We propose embedding pipelines in a deep latent representation through a novel per-component encoder mechanism. Such pipeline embeddings are used with deep kernel Gaussian Process surrogates inside a Bayesian Optimization setup. Through extensive experiments on three large-scale meta-datasets, including Deep Learning pipelines for computer vision, we demonstrate that learning pipeline embeddings achieves state-of-the-art results in Pipeline Optimization.

1. INTRODUCTION

Machine Learning (ML) has proven successful in a wide range of tasks such as image classification, natural language processing, and time series forecasting. In a supervised learning setup, practitioners need to define a sequence of stages comprising algorithms that transform the data (e.g. imputation, scaling) and produce an estimation (e.g. through a classifier or regressor). The selection of the algorithms and their hyperparameters, known as Pipeline Optimization (Olson & Moore, 2016) or pipeline synthesis (Liu et al., 2020; Drori et al., 2021), is challenging. Firstly, the search space contains conditional hyperparameters, as only some of them are active depending on the selected algorithms. Secondly, this space is considerably larger than that of any single algorithm. Consequently, prior work has demonstrated that this pipeline search can be automated while achieving competitive results (Feurer et al., 2015; Olson & Moore, 2016). These approaches include Evolutionary Algorithms (Olson & Moore, 2016), Reinforcement Learning (Rakotoarison et al., 2019; Drori et al., 2021), and Bayesian Optimization (Feurer et al., 2015; Thornton et al., 2012; Alaa & van der Schaar, 2018).

Pipeline Optimization (PO) techniques need to capture the complex interaction between the algorithms of a Machine Learning pipeline and their hyperparameter configurations. Unfortunately, no prior method uses Deep Learning to encapsulate the interaction between pipeline components. Prior work trains performance predictors (a.k.a. surrogates) on the concatenated hyperparameter space of all algorithms (the raw search space), for instance, using random forests (Feurer et al., 2015) or by finding groups of hyperparameters to use in kernels with additive structure (Alaa & van der Schaar, 2018). On the other hand, transfer learning has been shown to decisively improve PO by transferring efficient pipelines evaluated on other datasets (Fusi et al., 2018; Yang et al., 2019; 2020).
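To make the conditionality of the search space concrete, the following minimal sketch (all stage, algorithm, and hyperparameter names are hypothetical, not taken from any cited system) encodes a pipeline configuration into the concatenated "raw" space that prior surrogates operate on: each stage contributes a one-hot algorithm choice plus that algorithm's hyperparameters, and hyperparameters of unselected algorithms are inactive and zero-filled.

```python
# Hypothetical two-stage search space: a hyperparameter is only active
# when its algorithm is the one selected for that stage.
SEARCH_SPACE = {
    "scaler": {"standard": [], "minmax": []},
    "classifier": {
        "svm": ["C", "gamma"],
        "random_forest": ["n_estimators", "max_depth"],
    },
}

def flatten(config):
    """Encode a pipeline config as one fixed-length vector over the
    concatenated (raw) space; inactive hyperparameters are zero-filled."""
    vector = []
    for stage, algorithms in SEARCH_SPACE.items():
        chosen = config[stage]["algorithm"]
        for algo, hps in algorithms.items():
            vector.append(1.0 if algo == chosen else 0.0)  # one-hot choice
            for hp in hps:
                active = algo == chosen
                vector.append(config[stage]["hps"].get(hp, 0.0) if active else 0.0)
    return vector

config = {
    "scaler": {"algorithm": "standard", "hps": {}},
    "classifier": {"algorithm": "svm", "hps": {"C": 1.0, "gamma": 0.1}},
}
vec = flatten(config)
```

Surrogates trained directly on such vectors must learn, from data alone, that the zero-filled entries carry no signal; this is one motivation for representations that respect the pipeline structure instead.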
Our method is the first to introduce a deep pipeline representation that is meta-learned, achieving state-of-the-art results in terms of the quality of the discovered pipelines. We introduce DeepPipe, a neural network architecture that embeds pipeline configurations into a latent space. These deep representations are combined with Gaussian Processes (GP) for tuning pipelines with Bayesian Optimization (BO). We exploit the knowledge of the hierarchical search space of pipelines by mapping the hyperparameters of every algorithm through per-algorithm encoders to a hidden representation, followed by a Fully Connected Network that receives the concatenated representations as input. Additionally, we show that meta-learning this network through pipeline evaluations on other datasets further improves the optimization.
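The encoder idea described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the paper's implementation: all dimensions, stage names, and the single-linear-layer encoders are assumptions made for brevity. Each stage routes its hyperparameters through the encoder of the selected algorithm; the hidden vectors are concatenated and passed through a fully connected layer to produce the pipeline embedding.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 4  # hypothetical shared hidden size per stage

# One linear encoder per (stage, algorithm) pair; input dim matches the
# number of that algorithm's hyperparameters (weights only, for brevity).
encoders = {
    ("scaler", "standard"): rng.standard_normal((1, HIDDEN)),
    ("scaler", "minmax"): rng.standard_normal((1, HIDDEN)),
    ("classifier", "svm"): rng.standard_normal((2, HIDDEN)),        # C, gamma
    ("classifier", "random_forest"): rng.standard_normal((2, HIDDEN)),
}

# Aggregator: fully connected layer over the concatenated stage encodings.
W_out = rng.standard_normal((2 * HIDDEN, 8))

def embed_pipeline(selection):
    """selection: {stage: (algorithm, hyperparameter vector)} ->
    pipeline embedding usable as input to a (deep kernel) GP surrogate."""
    hidden = []
    for stage in ("scaler", "classifier"):
        algo, hps = selection[stage]
        h = np.maximum(hps @ encoders[(stage, algo)], 0.0)  # ReLU encoder
        hidden.append(h)
    return np.concatenate(hidden) @ W_out  # fully connected aggregation

z = embed_pipeline({
    "scaler": ("standard", np.ones(1)),
    "classifier": ("svm", np.array([1.0, 0.1])),
})
```

In the BO setup the paper describes, a GP kernel would be evaluated on such embeddings rather than on the raw concatenated hyperparameter vector, so the surrogate's similarity measure reflects the learned interaction between components.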

