ENERGY CONSUMPTION-AWARE TABULAR BENCHMARKS FOR NEURAL ARCHITECTURE SEARCH

Anonymous authors
Paper under double-blind review

Abstract

Tabular benchmarks for Neural Architecture Search (NAS) have lessened the demand for large-scale computational resources, making it possible to evaluate NAS strategies over extensive search spaces at moderate computational cost. So far, however, NAS has mainly focused on maximising performance on some hold-out validation/test set, while energy consumption is a partially conflicting objective that should not be neglected. We hypothesise that constraining NAS to include the energy consumption of training the models could reveal a subspace of undiscovered architectures that are more computationally efficient and have a smaller carbon footprint. To test this hypothesis, we augment an existing tabular benchmark for NAS with the energy consumption of each architecture and then perform multi-objective optimisation with energy consumption as an additional objective. We demonstrate the usefulness of multi-objective NAS both for uncovering the trade-off between performance and energy consumption and for finding more energy-efficient architectures. The updated tabular benchmark, EC-NAS-Bench, is open-sourced to encourage the further exploration of energy consumption-aware NAS.

1. INTRODUCTION

The design of neural architectures is a complex task. While general guidelines for producing suitable neural architectures have been proposed, neural architecture design still requires expert domain knowledge, experience, and not least substantial effort (Philipp, 2021; Zoph & Le, 2016; Ren et al., 2020). This has led to an upsurge in research on the automated exploration and design of neural architectures cast as an optimisation problem: neural architecture search (NAS) (Baker et al., 2016; Zoph & Le, 2016; Real et al., 2017). NAS strategies explore neural architectures in a predefined search space, relying on model training and evaluation to determine a model's fitness (i.e., validation/test set score) and adjust the search strategy to extract the best-performing architecture (Ren et al., 2020). NAS strategies have shown great promise in discovering novel architecture designs yielding state-of-the-art model performance (Liu et al., 2017; 2018; Lin et al., 2021; Baker et al., 2017). However, performing NAS can be prohibitively expensive (Tan & Le, 2019b) due to the demand for large-scale computational resources and the associated carbon footprint (Schwartz et al., 2019; Anthony et al., 2020).

The introduction of tabular benchmarks for NAS significantly lessened the computational challenges mentioned above by facilitating the evaluation of NAS strategies on a limited search space of architectures (Klein & Hutter, 2019; Dong & Yang, 2020). Predictive models and zero- and one-shot models (Wen et al., 2019; Lin et al., 2021; Zela et al., 2020) have reduced time-consuming model training and thereby increased the efficiency of NAS strategies. Most recently, surrogate NAS benchmarks (Zela et al., 2022) have been proposed for arbitrary expansion of architecture search spaces for NAS.
Notwithstanding the aforementioned major contributions to the advancement of NAS research, the prime objective of NAS has been maximising a performance objective on some hold-out validation/test set. NAS strategies can be evaluated efficiently, yet the search strategies do not intentionally aim to find computationally efficient architectures. That is, NAS may determine model performance at a moderate computational cost, but energy efficiency is generally not an objective of the search itself. We hypothesise that adding the energy consumption of training models as a NAS objective could reveal a sub-space of computationally efficient models that also have a smaller carbon footprint. To find efficient architectures without sacrificing cardinal performance requirements, we propose the use of NAS strategies that optimise for multiple objectives.

Our main contributions:
1. We provide an energy consumption-aware tabular benchmark for NAS based on NAS-Bench-101 (Ying et al., 2019). For each architecture, we add its training energy consumption, power consumption and carbon footprint. We hope that the new data set will foster the development of environmentally friendly deep learning systems.
2. We introduce a surrogate energy model to predict the training energy cost of a given architecture in a large search space (about 423k architectures).
3. To exemplify the use of the new benchmark, we devise a simple multi-objective optimisation algorithm for NAS and apply it to optimise generalisation accuracy as well as energy consumption.
4. We demonstrate the usefulness of multi-objective architecture exploration for revealing the trade-off between performance and energy efficiency and for finding efficient architectures obeying accuracy constraints. This is also demonstrated with other baseline multi-objective methods.
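To illustrate how energy consumption can enter the search as a second objective, the following minimal sketch (not the paper's algorithm; all names and numbers are illustrative) filters a set of architecture records down to their Pareto front when validation accuracy is maximised and training energy is minimised:

```python
# Hypothetical sketch: Pareto filtering over (accuracy, energy) records.
# Each record is a (name, accuracy, energy_kwh) tuple; the values are
# made up for illustration and are not drawn from EC-NAS-Bench.

def dominates(a, b):
    """True if architecture a is at least as good as b on both objectives
    (higher accuracy, lower energy) and strictly better on at least one."""
    acc_a, e_a = a[1], a[2]
    acc_b, e_b = b[1], b[2]
    return (acc_a >= acc_b and e_a <= e_b) and (acc_a > acc_b or e_a < e_b)

def pareto_front(archs):
    """Return the non-dominated subset of a list of architecture records."""
    return [a for a in archs
            if not any(dominates(b, a) for b in archs if b is not a)]

archs = [
    ("net-A", 0.94, 1.8),  # most accurate, but energy-hungry
    ("net-B", 0.92, 0.6),  # good trade-off
    ("net-C", 0.90, 0.7),  # dominated by net-B
    ("net-D", 0.85, 0.3),  # cheapest, but least accurate
]
front = pareto_front(archs)
```

The front retains the accurate-but-costly and cheap-but-weaker extremes alongside the trade-off point, which is exactly the kind of accuracy/energy frontier the multi-objective search is meant to expose.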

2. ENERGY CONSUMPTION-AWARE BENCHMARKS - EC-NAS-Bench

Our energy consumption-aware tabular benchmark EC-NAS-Bench is based on NAS-Bench-101 (Ying et al., 2019). We closely follow their specification of architectures; however, the search space of architectures considered, the evaluation approach and the metrics provided for each architecture are different. This section briefly presents EC-NAS-Bench and its differences to NAS-Bench-101.

Network Topology. All architectures considered are convolutional neural networks (CNNs) designed for the task of image classification on CIFAR-10 (Krizhevsky, 2009). Each neural network comprises a convolutional stem layer followed by three repeats of three stacked cells and a downsampling layer. Finally, a global pooling layer and a dense softmax layer are used. The space of architectures, X, is limited to the topological space of cells, where each cell is a configurable feedforward network.
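Read literally, the macro skeleton above can be sketched as follows. The layer names are placeholders, and whether a downsampling layer also follows the final stack may differ in the actual implementation; this sketch simply follows the text:

```python
# Illustrative sketch of the fixed macro skeleton: a convolutional stem,
# three repeats of (three stacked cells + a downsampling layer), then
# global pooling and a dense softmax head. Only the cell interiors vary
# across the search space; this outer structure is shared.

def macro_skeleton(num_repeats=3, cells_per_stack=3):
    layers = ["conv-stem"]
    for _ in range(num_repeats):
        layers += ["cell"] * cells_per_stack  # searched cell, repeated
        layers.append("downsample")           # fixed resolution reduction
    layers += ["global-pool", "dense-softmax"]
    return layers

skeleton = macro_skeleton()
```

Because only the cells are searched, the benchmark can tabulate one training result per cell topology while the surrounding skeleton stays constant.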

2.1. ARCHITECTURE DESIGN

Cell Encoding. The individual cells are represented as directed acyclic graphs (DAGs). Each DAG, G(V, M), has N = |V| vertices (or nodes), with edges described by the binary adjacency matrix M ∈ {0, 1}^{N×N}. The set of operations (labels) that each node can realise is given by L′ = {input, output} ∪ L, where L = {3x3conv, 1x1conv, 3x3maxpool}. Two of the N nodes are always fixed as the input and output of the network; the remaining N − 2 nodes can take up one of the labels in L. The connections between nodes of the DAG are encoded in the upper-triangular adjacency matrix with no self-connections (zero main diagonal entries). For a given architecture A, every entry α_{i,j} ∈ M_A denotes an edge from node i to node j with operations ℓ_i, ℓ_j ∈ L′, and its labelled adjacency matrix is L_A ∈ M_A × L′.

Search space. The number of DAGs grows exponentially with N and L (Ying et al., 2019). We restrict the search space in EC-NAS-Bench by imposing |V| ≤ 5 and at most nine edges per cell, referred to as the 5V space. The search space with |V| ≤ 4, called the 4V space, is also considered. In contrast, NAS-Bench-101 considers the search space for |V| ≤ 7. With these imposed restrictions on the
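A minimal sketch of a validity check for a 5V-space cell under the constraints above (the function and its inputs are hypothetical helpers for illustration, not part of EC-NAS-Bench):

```python
# Hypothetical validity check for a cell in the 5V space: at most 5 nodes,
# at most 9 edges, an upper-triangular adjacency matrix with a zero main
# diagonal (feed-forward, no self-connections), fixed input/output nodes,
# and the remaining nodes labelled with operations from L.

OPS = {"3x3conv", "1x1conv", "3x3maxpool"}

def is_valid_cell(adj, labels, max_nodes=5, max_edges=9):
    n = len(adj)
    if n > max_nodes or any(len(row) != n for row in adj):
        return False
    # Edges may only run from earlier to later nodes: everything on or
    # below the main diagonal must be zero.
    for i in range(n):
        for j in range(i + 1):
            if adj[i][j]:
                return False
    if sum(sum(row) for row in adj) > max_edges:
        return False
    # First node is the input, last is the output, the rest are operations.
    if labels[0] != "input" or labels[-1] != "output":
        return False
    return all(op in OPS for op in labels[1:-1])

adj = [[0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0]]
labels = ["input", "3x3conv", "1x1conv", "3x3maxpool", "output"]
valid = is_valid_cell(adj, labels)
```

The upper-triangular constraint is what makes every adjacency matrix a DAG by construction, so no explicit cycle check is needed.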



Figure 1: Visualisation of the 5V space of architectures with the validation performance P_v and the corresponding training energy cost E for four training epoch budgets (4, 12, 36, 108).

