PERFORMANCE PREDICTION VIA UNSUPERVISED DOMAIN ADAPTATION FOR ARCHITECTURE SEARCH

Abstract

Performance predictors can directly predict the performance of a given neural architecture without training it, and are therefore widely studied to alleviate the prohibitive cost of Neural Architecture Search (NAS). However, existing performance predictors still require training a large number of architectures from scratch to obtain their performance labels as the training dataset, which remains computationally expensive. To address this issue, we develop a performance predictor, called USPP, based on unsupervised domain adaptation, which avoids costly dataset construction by reusing existing fully-trained architectures. Specifically, because rich domain-specific features pose a great challenge to transferability, a progressive domain-invariant feature extraction method is proposed to assist in extracting domain-invariant features. Furthermore, a learnable representation (denoted as operation embedding) is designed to replace the fixed encoding of operations, so that more knowledge about operations can be transferred to the target search space. In experiments, we train the predictor on the labeled architectures in NAS-Bench-101 and predict the performance of architectures in the DARTS search space. Compared with state-of-the-art NAS methods, the proposed USPP costs only 0.02 GPU days yet finds architectures with 97.86% accuracy on CIFAR-10 and 76.50% top-1 accuracy on ImageNet.

1. INTRODUCTION

Neural Architecture Search (NAS) (Elsken et al., 2019) aims to automatically design high-performance neural architectures and has become a popular research field in machine learning. In recent years, architectures searched by NAS have outperformed manually designed architectures in many fields (Howard et al., 2019; Real et al., 2019). However, NAS generally requires massive computational resources to estimate the performance of the architectures obtained during the search process (Real et al., 2019; Zoph et al., 2018), which is unaffordable for most interested researchers. As a result, how to speed up the performance estimation of neural architectures has become a hot topic in the NAS community.

The performance predictor (Wen et al., 2020) is a popular acceleration method for NAS. It can directly predict the performance of neural architectures without training them, thus greatly accelerating the NAS process. A large number of related works have been carried out because of its superiority in reducing the cost of NAS. For example, E2EPP (Sun et al., 2019) adopted a random forest (Breiman, 2001) as the regression model to effectively find promising architectures. ReNAS (Xu et al., 2021) used a simple LeNet-5 network (LeCun et al., 1998) as the regression model and creatively employed a ranking-based loss function to train the predictor, thus improving its prediction ability.

Although existing performance predictors have achieved great success in improving the efficiency of NAS, sufficient architectures still need to be sampled from the target search space and fully trained to obtain their performance values as labels (Wen et al., 2020). The performance predictor is trained on these labeled architectures and then used to predict the performance of unseen architectures. To ensure its prediction accuracy, it is usually necessary to train at least hundreds of architectures for the dataset, which incurs a huge cost.
In recent years, many benchmark datasets, such as NAS-Bench-101 (Ying et al., 2019), NAS-Bench-201 (Dong & Yang, 2020), and NAS-Bench-NLP (Klyuchnikov et al., 2020), have been released to promote research on NAS. They contain a large number of architecture pairs (i.e., the architecture and its

