SEARCHING FOR CONVOLUTIONS AND A MORE AMBITIOUS NAS

Abstract

An important goal of neural architecture search (NAS) is to automate away the design of neural networks on new tasks in under-explored domains, thus helping to democratize machine learning. However, current NAS research largely focuses on search spaces consisting of existing operations, such as different types of convolution, that are already known to work well on well-studied problems, often in computer vision. Our work is motivated by the following question: can we enable users to build their own search spaces and discover the right neural operations given data from their specific domain? We make progress towards this broader vision for NAS by introducing a space of operations generalizing the convolution that enables search over a large family of parameterizable linear-time matrix-vector functions. Our flexible construction allows users to design their own search spaces adapted to the nature and shape of their data, to warm-start search methods using convolutions when they are known to perform well, or to discover new operations from scratch when they are not. We evaluate our approach on several novel search spaces over vision and text data, on all of which simple NAS search algorithms can find operations that perform better than baseline layers.

1. INTRODUCTION

Neural architecture search is often motivated by the AutoML vision of democratizing ML by reducing the need for expert deep net design, both on existing problems and in new domains. However, while NAS research has seen rapid growth with developments such as weight-sharing (Pham et al., 2018) and "NAS-benches" (Ying et al., 2019; Zela et al., 2020), most efforts focus on search spaces that glue together established primitives for well-studied tasks like vision and text (Liu et al., 2019; Li & Talwalkar, 2019; Xu et al., 2020; Li et al., 2020) or on deployment-time issues such as latency (Cai et al., 2020). Application studies have followed suit (Nekrasov et al., 2019; Wang et al., 2020). In this work, we revisit a broader vision for NAS, proposing to move towards much more general search spaces while still exploiting successful components of leading network topologies and efficient NAS methods. We introduce search spaces built using the Chrysalis,¹ a rich family of parameterizable operations that we develop using a characterization of efficient matrix transforms by Dao et al. (2020) and which contains convolutions and many other simple linear operations. When combined with a backbone architecture, the Chrysalis induces general NAS search spaces for discovering the right operation for a given type of data. For example, when inducing a novel search space from the LeNet architecture (LeCun et al., 1999), we show that randomly initialized gradient-based NAS methods applied to CIFAR-10 discover operations in the Chrysalis that outperform convolutions, the "right" operation for vision, by 1% on both CIFAR-10 and CIFAR-100.
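To make concrete the claim that convolutions sit inside a family of linear matrix-vector functions, the following minimal NumPy sketch expresses circular 1D convolution as multiplication by a structured (circulant) matrix. This is purely illustrative and not the paper's construction; the helper `circulant` and all names here are our own, and the Chrysalis itself is built from more general efficient factorizations in the style of Dao et al. (2020).

```python
import numpy as np

def circulant(kernel, n):
    """Build the n x n circulant matrix whose matrix-vector product
    implements circular 1D convolution with `kernel` (zero-padded).
    Illustrative helper, not part of the paper's code."""
    col = np.zeros(n)
    col[:len(kernel)] = kernel
    # Column i of a circulant matrix is the first column rolled down by i.
    return np.stack([np.roll(col, i) for i in range(n)], axis=1)

n = 8
x = np.arange(n, dtype=float)
k = np.array([1.0, -1.0, 0.5])

C = circulant(k, n)
# Direct circular convolution: y[i] = sum_j k[j] * x[(i - j) mod n]
direct = np.array([sum(k[j] * x[(i - j) % n] for j in range(len(k)))
                   for i in range(n)])
assert np.allclose(C @ x, direct)
```

Viewed this way, "searching for an operation" amounts to searching over structured parameterizations of such matrices that, unlike a dense matrix, admit linear-time (or near-linear-time) multiplication.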
Our contributions, summarized below, take critical steps towards a broader NAS that enables the discovery of good design patterns with limited human specification from data in under-explored domains:
• We define the broad NAS problem and discuss how it interacts with modern techniques such as continuous relaxation, weight-sharing, and bilevel optimization. This discussion sets up our new approach for search space design and our associated evaluations of whether leading NAS methods, applied to our proposed search spaces, can find good parameterizable operations.
• We introduce Kaleidoscope-operations (K-operations), the parameterizable operations comprising the Chrysalis, which generalize the convolution while preserving key desirable properties: short



¹ Following Dao et al. (2020), butterfly-based naming will be used throughout.

