EA-HAS-BENCH: ENERGY-AWARE HYPERPARAMETER AND ARCHITECTURE SEARCH BENCHMARK

Abstract

The energy consumed by training deep learning models is increasing at an alarming rate due to the growth of training data and model scale, negatively impacting carbon neutrality. Energy consumption is an especially pressing issue for AutoML algorithms, because they usually require repeatedly training large numbers of computationally intensive deep models to search for optimal configurations. This paper takes one of the most essential steps in developing energy-aware (EA) NAS methods by providing a benchmark that makes EA-NAS research more reproducible and accessible. Specifically, we present EA-HAS-Bench, the first large-scale energy-aware benchmark that enables the study of AutoML methods aiming at better trade-offs between performance and search energy consumption. EA-HAS-Bench provides a large-scale architecture/hyperparameter joint search space, covering diverse configurations related to energy consumption. Furthermore, we propose a novel surrogate model specially designed for this large joint search space, which uses a Bézier curve-based formulation to predict learning curves of unlimited shape and length. Based on the proposed dataset, we modify existing AutoML algorithms to take the search energy consumption into account, and our experiments show that the modified energy-aware AutoML methods achieve a better trade-off between energy consumption and model performance.

1. INTRODUCTION

As deep learning technology progresses rapidly, its alarmingly increasing energy consumption causes growing concerns (Schwartz et al., 2020; Li et al., 2021a; Strubell et al., 2019). Neural architecture search (NAS) (Elsken et al., 2019) and hyperparameter optimization (HPO) (Feurer & Hutter, 2019) lift the manual effort of architecture design and hyperparameter tuning, but they require repeatedly training large numbers of computationally intensive deep models, leading to significant energy consumption and carbon emissions. For instance, training 10K models on CIFAR-10 (Krizhevsky et al., 2009) for 100 epochs consumes about 500,000 kWh of energy, which is equivalent to the annual electricity consumption of about 600 households in China. As a result, it is essential to develop search-energy-cost-aware (EA) AutoML methods that find models with good performance while minimizing the overall energy consumption of the search process. However, existing NAS studies mainly focus on the resource cost of the searched model, such as parameter size, the number of floating-point operations (FLOPs), or latency (Tan et al., 2019; Wu et al., 2019; He et al., 2021); the trade-off between model performance and the energy cost of the search process itself has rarely been studied (Elsken et al., 2019). In this paper, we take one of the most essential steps in developing energy-aware (EA) NAS methods by making EA-NAS research more reproducible and accessible. Specifically, we provide a benchmark for EA-NAS called the Energy-Aware Hyperparameter and Architecture Search Benchmark (EA-HAS-Bench), from which researchers can obtain the training energy cost and model performance of a given architecture and hyperparameter configuration without actually training the model. To support the development of energy-aware HPO and NAS methods, EA-HAS-Bench should satisfy the following requirements.

Search Energy Cost. Our dataset needs to provide the total search energy cost of running a specific AutoML method. This can be obtained by measuring the energy cost of each configuration the method traverses and summing the costs up. As shown in Table 1, most existing benchmarks (Ying et al., 2019; Dong & Yang, 2020; Siems et al., 2020), such as NAS-Bench-101, do not directly provide training energy cost but instead use model training time as the training resource budget, which, as our experiments verify, is an inaccurate estimate of energy cost. HW-NAS-Bench (Li et al., 2021b) provides the inference latency and inference energy consumption of different model architectures but likewise does not provide the search energy cost. Other benchmarks (Hirose et al., 2021) use small toy search spaces that do not cover enough of the critical factors affecting search energy cost. As a result, EA-HAS-Bench provides the first dataset with a ten-billion-level architecture/hyperparameter joint space, covering diverse types of configurations related to search energy cost.

Surrogate Model for Joint Search Space. Due to the enormous size of the proposed joint search space, a surrogate model is needed to fill out the entire search space from a subset of sampled configurations. Existing surrogate methods (Zela et al., 2022; Yan et al., 2021) are not applicable to our large-scale joint space; for instance, they do not account for variation in the maximum number of training epochs and predict only a fixed-length learning curve or the final performance. Thus, we propose the Bézier Curve-based Surrogate (BCS) model, which fits accuracy learning curves of unlimited shape and length.

We summarize the contributions of this paper as follows:

• EA-HAS-Bench is the first HAS dataset that accounts for the overall search energy cost.* Based on this dataset, we further propose an energy-aware AutoML method with search-energy-cost-related penalties, showing that energy-aware AutoML achieves a better trade-off between model performance and search energy cost.

• We provide the first large-scale benchmark containing an architecture/hyperparameter joint search space with over 10 billion configurations, covering various configurations related to search energy cost.

• We develop a novel surrogate model suitable for a more general and complex joint HAS search space, which outperforms NAS-Bench-X11 and other parametric learning-curve-based methods.
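To make the Bézier-curve surrogate idea above concrete, the sketch below shows how a fixed, small set of control points can generate learning curves of arbitrary length: the curve is simply evaluated at as many parameter values as there are training epochs. This is a minimal illustration of the general technique under assumed conventions (a (progress, accuracy) parameterization, uniform sampling in t, and made-up control-point values); it is not the actual BCS model described in the paper.

```python
import numpy as np
from math import comb

def bezier_learning_curve(control_points, num_epochs):
    """Evaluate a Bezier curve defined by (progress, accuracy) control points
    at `num_epochs` evenly spaced parameter values t in [0, 1]. A fixed set of
    control points can therefore yield a learning curve of any length."""
    pts = np.asarray(control_points, dtype=float)    # shape: (n + 1, 2)
    n = len(pts) - 1                                 # curve degree
    t = np.linspace(0.0, 1.0, num_epochs)            # one parameter value per epoch
    # Bernstein basis functions: C(n, i) * t^i * (1 - t)^(n - i)
    basis = np.stack(
        [comb(n, i) * t**i * (1 - t)**(n - i) for i in range(n + 1)], axis=1
    )                                                # shape: (num_epochs, n + 1)
    return basis @ pts                               # sampled (progress, accuracy) pairs

# Hypothetical control points that a surrogate might predict for one configuration.
ctrl = [(0.0, 0.10), (0.15, 0.55), (0.60, 0.80), (1.00, 0.92)]
curve_50 = bezier_learning_curve(ctrl, num_epochs=50)    # 50-epoch training budget
curve_200 = bezier_learning_curve(ctrl, num_epochs=200)  # same shape, 200-epoch budget
```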



* The dataset and codebase of EA-HAS-Bench are available at https://github.com/microsoft/EA-HAS-Bench.



Table 1: Overview of NAS benchmarks related to EA-HAS-Bench.
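To illustrate the search-energy accounting that distinguishes EA-HAS-Bench from the benchmarks compared in Table 1, the toy sketch below sums the per-configuration training energy returned by a benchmark lookup over a random-search trajectory and stops once an energy budget is exhausted. The function names, return values, and numbers are illustrative assumptions only, not the released EA-HAS-Bench API.

```python
import random

def query_benchmark(config):
    """Stand-in for a benchmark lookup (illustrative, not the released API):
    return the final validation accuracy and the training energy cost (kWh)
    of `config` without actually training it."""
    rng = random.Random(config)                  # deterministic fake lookup
    return rng.uniform(0.70, 0.95), rng.uniform(10.0, 80.0)

def energy_budgeted_random_search(configs, energy_budget_kwh):
    """Toy random search whose budget is the total search energy: the training
    energy of every evaluated configuration is summed, and the search stops
    once the budget is exhausted."""
    best_acc, spent_kwh = 0.0, 0.0
    for config in random.sample(configs, len(configs)):
        acc, train_energy_kwh = query_benchmark(config)
        spent_kwh += train_energy_kwh            # accumulate the search energy cost
        best_acc = max(best_acc, acc)
        if spent_kwh >= energy_budget_kwh:       # energy budget exhausted
            break
    return best_acc, spent_kwh

configs = [f"config_{i}" for i in range(1000)]
print(energy_budgeted_random_search(configs, energy_budget_kwh=500.0))
```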

