REG-NAS: GRAPH NEURAL NETWORK ARCHITECTURE SEARCH USING REGRESSION PROXY TASK

Abstract

Neural Architecture Search (NAS) has been extensively researched in recent years, yielding innovative results for architectures such as convolutional neural networks (CNN) and recurrent neural networks (RNN). However, research on NAS for graph neural networks (GNN) is still at a preliminary stage. Because of the special structure of graph data, conclusions drawn for CNN cannot always be directly applied to GNN. At the same time, ranking stability is of great importance for NAS, since it reflects the reliability of NAS performance. Unfortunately, little research attention has been paid to it, making it a pitfall in the development of NAS research. In this paper, we propose a novel NAS pipeline, ReG-NAS, which balances stability, reliability, and time cost to search for the best GNN architecture. In addition, for the first time, we systematically analyze the factors that affect models' ranking stability in a given search space, which can serve as a guideline for subsequent studies. Our code is available at https://anonymous.4open.science/r/ReG-NAS-4D21

1. INTRODUCTION

Graph neural networks (GNN) have received a lot of attention for their broad applications in social networks (Guo & Wang, 2020; Gao et al., 2021; Zhong et al., 2020), molecular property prediction (Shui & Karypis, 2020; Ma et al., 2020; Yang et al., 2021), traffic prediction (Diehl et al., 2019; Bui et al., 2021; Zhang et al., 2021), and so on. With the goal of being "faster and more accurate", there is a constant pursuit of better GNN structures. However, as with neural networks such as CNN and RNN, manually searching for an ideal GNN architecture is challenging. Neural architecture search (NAS) for GNN is therefore key to the future development of GNN. To design a NAS framework, an intuitive yet primitive idea is to enumerate all models in a given search space and evaluate each model's performance according to the metric specified by the downstream task (Ying et al., 2019; Dong & Yang, 2019; You et al., 2020). However, this is extremely time-consuming and requires a huge amount of computational resources. To make NAS more efficient, several search methods have been proposed. Most GNN NAS frameworks can be divided into five classes. (1) Reinforcement-learning-based methods (Zhou et al., 2019; Gao et al., 2020; Zhao et al., 2020a), which use a controller, defined as a neural network, that dynamically changes its parameters according to the evaluated performance of the generated model; (2) Bayesian-optimization-based methods (Yoon et al., 2020; Tu et al., 2019), which build a probability distribution over sampled candidates and use a surrogate function for evaluation; (3) Evolution-learning-based methods (Shi et al., 2022; Li & King, 2020), among which the genetic algorithm is the most commonly used for GNN NAS frameworks (Oloulade et al., 2021).
(4) Differentiable-search-based methods (Zhao et al., 2020b; Huan et al., 2021; Ding et al., 2021; Li et al., 2021b; Cai et al., 2021), which learn one or two blocks that are repeated throughout the network; for GNN, a block is generally represented as a directed acyclic graph consisting of an ordered sequence of nodes. (5) Random-search-based methods (Gao et al., 2020; Zhao et al., 2020a; Tu et al., 2019), which generate random submodels from the search space. However, these methods are still time-consuming and can take hours to days. To reduce the search time, a popular approach in NAS is to use a proxy task, usually much smaller than the groundtruth task (e.g., CIFAR-10 as a proxy for ImageNet). The representativeness of the proxy task is crucial, i.e., how similar the results obtained from the proxy task are to those from the groundtruth task. One prevailing method to quantify this similarity is "ranking correlation", which measures the similarity between two rankings of all the networks' performance in the search space, usually using Spearman's ρ and Kendall's τ as indicators (Abdelfattah et al., 2021; Liu et al., 2020; Zela et al., 2019). The larger ρ and τ are, the more similar the two rankings are, and thus the more representative the proxy task is. By achieving large ρ and τ between the groundtruth ranking and the predicted ranking, one can significantly improve NAS quality (Zhou et al., 2020; Chu et al., 2021; Li & Talwalkar, 2020). Many zero-cost or few-shot NAS works have been proposed to design a good proxy, so that the ranking correlation between proxy and groundtruth is high within only a few training steps (Mellor et al., 2021; Dey et al., 2021; Li et al., 2021a). In addition, ranking correlation can be used to quantify the ranking stability of the networks by computing ρ and τ between different repetitions of the same training pipeline.
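The ranking-correlation computation described above can be sketched in a few lines. The architecture scores below are made-up illustrative numbers, and we assume SciPy's `spearmanr` and `kendalltau` are available:

```python
# Quantifying ranking correlation between a proxy task and the groundtruth
# task with Spearman's rho and Kendall's tau. The scores are hypothetical.
from scipy.stats import spearmanr, kendalltau

# Hypothetical performance of five candidate architectures on each task
groundtruth_scores = [0.91, 0.88, 0.85, 0.79, 0.94]  # e.g., test accuracy
proxy_scores = [0.62, 0.51, 0.58, 0.49, 0.66]        # e.g., proxy-task score

rho, _ = spearmanr(groundtruth_scores, proxy_scores)
tau, _ = kendalltau(groundtruth_scores, proxy_scores)
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
# → Spearman rho = 0.90, Kendall tau = 0.80
```

Both coefficients compare only the relative ordering of architectures, so the proxy scores need not be on the same scale as the groundtruth metric.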
Large ρ and τ imply that the variation of a model's relative ranking is small, i.e., the ranking is stable. We clarify two metrics that will be used hereafter:
• Ranking Correlation: the correlation of two network rankings on two different tasks, e.g., the groundtruth task and a proxy task, or two different proxy tasks; quantified by ρ and τ.
• Ranking Stability: the correlation of two rankings on the same task across two repetitions, with either the same or different initialization and training hyperparameters; also quantified by ρ and τ.
Observing the two metrics, we found an interesting phenomenon: in a GNN search space, the ranking stability for classification tasks can be much lower than for regression tasks. For classification groundtruth tasks, the ranking correlation between two repetitions over all GNN architectures in the same search space can be as low as 0.57, while for regression tasks it can be up to 0.99. Inspired by this observation, together with a recent regression-based proxy task for CNN, GenNAS (Li et al., 2021c), we propose a self-supervised regression-based proxy task for GNN NAS. We observe that with our proposed regression-based proxy task, both ranking stability and ranking correlation are higher. In addition, the regression-based proxy task converges faster than the classification groundtruth task, thus reducing the search time. However, generating a representative proxy task is non-trivial. There is a rich body of work on proxy task generation and selection for DNN NAS, but no related research on GNN NAS. The only regression-based DNN NAS work, GenNAS, can automatically search for a good proxy task, but requires 20 architectures with known groundtruth ranking (Li et al., 2021c). This is a strong premise and can still be time-consuming when targeting a new search space or dataset, i.e., at least 20 architectures must be well trained and then ranked.
To address the above challenges, we propose a novel NAS method, ReG-NAS, which uses a Regression-based proxy task for Graph Neural Architecture Search. We summarize our contributions as follows:
• ReG-NAS is the first GNN NAS method to use a regression-based proxy task. We propose a GNN NAS pipeline that transforms a groundtruth classification task into a regression proxy task, which leads to much higher ranking stability and faster convergence.
• We systematically study ranking stability and ranking correlation under various training environments, and uncover the fact that directly searching on the classification groundtruth task is unreliable because of its low ranking stability. This observation challenges the common practice in NAS of regarding a proxy as effective whenever its ranking correlation with the groundtruth task is high.
• We propose a simple yet effective proxy task to guide GNN NAS, which requires no groundtruth labels but only one well-trained GNN model as the proxy task generator. The generator is not necessarily the best GNN; it can be any GNN within the search space. Using the proposed proxy task, we turn the groundtruth classification problem into regression, leading to much higher ranking stability and faster search.
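The last contribution can be illustrated with a hedged sketch (not the paper's exact pipeline): one fixed "generator" network produces a self-supervised regression target, and a candidate is scored by how well it regresses toward that target under an MSE loss. The `TinyGCNLayer`, graph sizes, and hyperparameters below are all illustrative assumptions, with a plain-PyTorch graph convolution standing in for a real search-space GNN:

```python
# Sketch of a regression-based proxy task: regress a candidate GNN toward
# the outputs of one fixed generator GNN (no groundtruth labels needed).
import torch
import torch.nn as nn

class TinyGCNLayer(nn.Module):
    """Simplified graph convolution: relu(A @ X @ W); no normalization."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        return torch.relu(self.lin(adj @ x))

def proxy_score(candidate, generator, x, adj, steps=50, lr=1e-2):
    """Train the candidate to regress the generator's fixed node embeddings;
    a lower final MSE yields a higher (less negative) proxy score."""
    with torch.no_grad():
        target = generator(x, adj)  # self-supervised regression target
    opt = torch.optim.Adam(candidate.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(candidate(x, adj), target)
        loss.backward()
        opt.step()
    return -loss.item()

# Toy graph: 8 nodes, random features, symmetric adjacency with self-loops
torch.manual_seed(0)
x = torch.randn(8, 16)
adj = (torch.rand(8, 8) > 0.5).float()
adj = ((adj + adj.t() + torch.eye(8)) > 0).float()

generator = TinyGCNLayer(16, 4)  # stands in for one well-trained GNN
candidate = TinyGCNLayer(16, 4)  # one architecture sampled from the space
print("proxy score:", proxy_score(candidate, generator, x, adj))
```

Candidates would then be ranked by this score, replacing a noisy classification metric with a regression objective; in the paper's actual pipeline the target comes from a well-trained GNN rather than a random one.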



In our work, we propose a proxy-task-based GNN architecture search, aiming to reduce the GNN architecture search time. Similar to previous works in DNN architecture search (Zhou et al., 2020; Chu et al., 2021; Li & Talwalkar, 2020; Mellor et al., 2021; Dey et al., 2021; Li et al., 2021a), we also use ranking correlation to evaluate the performance of our proposed proxy-based NAS.

