REG-NAS: GRAPH NEURAL NETWORK ARCHITECTURE SEARCH USING REGRESSION PROXY TASK

Abstract

Neural Architecture Search (NAS) has been extensively researched in recent years and has yielded notable results for architectures such as convolutional neural networks (CNN) and recurrent neural networks (RNN). However, research on NAS for graph neural networks (GNN) is still at a preliminary stage. Because of the special structure of graph data, some conclusions drawn from CNN cannot be directly applied to GNN. At the same time, the ranking stability of models is of great importance for NAS, since it reflects the reliability of the NAS result. Unfortunately, little research attention has been paid to it, making it a pitfall in the development of NAS research. In this paper, we propose a novel NAS pipeline, ReG-NAS, which balances stability, reliability and time cost when searching for the best GNN architecture. In addition, for the first time, we systematically analyze the factors that affect models' ranking stability in a given search space, which can serve as a guideline for subsequent studies. Our code is available at https://anonymous.4open.science/r/ReG-NAS-4D21

1. INTRODUCTION

Graph neural networks (GNN) have received a lot of attention for their broad applications in social networks (Guo & Wang, 2020; Gao et al., 2021; Zhong et al., 2020), molecular property prediction (Shui & Karypis, 2020; Ma et al., 2020; Yang et al., 2021), traffic prediction (Diehl et al., 2019; Bui et al., 2021; Zhang et al., 2021) and so on. In pursuit of faster and more accurate models, researchers continually look for better GNN structures. However, as with CNN and RNN, manually searching for an ideal GNN architecture is challenging, which makes neural architecture search (NAS) for GNN key to the future development of GNN.

To design a NAS method, an intuitive yet primitive idea is to enumerate all models in a given search space and evaluate each model's performance according to the metric specified by the downstream task (Ying et al., 2019; Dong & Yang, 2019; You et al., 2020). However, this is extremely time-consuming and needs a huge amount of computational resources. To make NAS more efficient, several search methods have been proposed. Most GNN NAS methods can be divided into five classes. (1) Reinforcement-learning-based methods (Zhou et al., 2019; Gao et al., 2020; Zhao et al., 2020a), in which a controller, defined as a neural network, dynamically changes its parameters according to the evaluated performance of the generated model; (2) Bayesian-optimization-based methods (Yoon et al., 2020; Tu et al., 2019), which build a probability distribution over sampled candidates and use a surrogate function to test them; (3) Evolution-learning-based methods (Shi et al., 2022; Li & King, 2020), among which the genetic algorithm is the most commonly used for GNN NAS frameworks (Oloulade et al., 2021);
(4) Differentiable-search-based methods (Zhao et al., 2020b; Huan et al., 2021; Ding et al., 2021; Li et al., 2021b; Cai et al., 2021), which learn one or two blocks that are repeated throughout the whole neural network, where for GNN a block is generally represented as a directed acyclic graph consisting of an ordered sequence of nodes; (5) Random-search-based methods (Gao et al., 2020; Zhao et al., 2020a; Tu et al., 2019), which generate random submodels from the search space. However, these methods are still time-consuming and can take hours to days.

To reduce the search time, a popular approach in NAS is to use a proxy task, usually much smaller than the groundtruth task (e.g., CIFAR-10 is a proxy for ImageNet). The representativeness of the proxy task is crucial, i.e., how similar the results obtained from the proxy task are to those obtained from the groundtruth task. One prevailing way to quantify this similarity is "ranking correlation", which refers to the similarity between two rankings of all the networks' performance in the search space and usually uses Spearman's ρ and Kendall's τ as indicators (Abdelfattah et al., 2021; Liu et al., 2020; Zela et al., 2019). The larger ρ and τ are, the more similar the two rankings are, and the more representative the proxy task is. By achieving large ρ and τ between the groundtruth ranking and the predicted ranking, one can significantly improve NAS quality (Zhou et al., 2020; Chu et al., 2021; Li & Talwalkar, 2020). Many zero-cost or few-shot NAS works have been proposed to design a good proxy, so that the ranking correlation between proxy and groundtruth can be high within only a few training steps (Mellor et al., 2021; Dey et al., 2021; Li et al., 2021a).

In our work, we propose a proxy-task-based GNN architecture search, aiming to reduce the GNN architecture search time. Similar to previous works in DNN architecture search (Zhou et al., 2020; Chu et al., 2021; Li & Talwalkar, 2020; Mellor et al., 2021; Dey et al., 2021; Li et al., 2021a), we also use ranking correlation to evaluate the performance of our proposed proxy-based NAS. In addition, ranking correlation can be used to quantify the ranking stability of the networks by computing ρ and τ between different repetitions of the same training pipeline. Large ρ and τ imply that the variation of the models' relative ranking is small, i.e., the ranking is stable. We clarify two metrics that will be used hereafter:

• Ranking Correlation: Correlation of two network rankings on two different tasks, e.g., groundtruth task and proxy task, or proxy task one and proxy task two; quantified by ρ and τ.
• Ranking Stability: Correlation of two rankings on the same task obtained from two repetitions, with either the same or different initialization and training hyperparameters; also quantified by ρ and τ.

Observing the two metrics, we found an interesting phenomenon: in a GNN search space, the ranking stability for classification tasks can be much lower than for regression tasks. For classification groundtruth tasks, the ranking stability between two repetitions of all GNN architectures in the same search space can be as low as 0.57, while for regression tasks it can be up to 0.99. Inspired by this observation, together with a recent regression-based proxy task for CNN, GenNAS (Li et al., 2021c), we propose a self-supervised regression-based proxy task for GNN NAS. We observe that with our proposed regression-based proxy task, both the ranking stability and the ranking correlation are higher. In addition, the regression-based proxy task converges faster than the classification groundtruth task, thus reducing the search time.

However, generating a representative proxy task is non-trivial. Proxy task generation and selection have been studied extensively for DNN NAS, but there is no related research for GNN NAS.
The only regression-based DNN NAS work, GenNAS, can automatically search for a good proxy task, but requires knowing 20 architectures with groundtruth ranking (Li et al., 2021c). This is a strong premise and can still be time-consuming when targeting a new search space or dataset, i.e., at least 20 architectures must be well-trained and then ranked.

To address the above challenges, we propose a novel NAS method using a Regression-based proxy task for Graph Neural Architecture Search, ReG-NAS. We summarize our contributions as follows:

• ReG-NAS is the first GNN NAS using a regression-based proxy task. We propose a GNN NAS pipeline that can transform a groundtruth classification task into a regression proxy task, which leads to much higher ranking stability and faster convergence.
• We systematically study ranking stability and ranking correlation under various training environments, and uncover the fact that directly searching on a classification groundtruth task is unreliable because of its low ranking stability. This observation challenges a common practice in NAS, namely that a proxy is regarded as effective as long as its ranking correlation with the groundtruth task is high.
• We propose a simple yet effective proxy task to guide GNN NAS, which does not require groundtruth labels but only one well-trained GNN model as proxy task generator. The generator is not necessarily the best GNN but can be any GNN within the search space. Using the proposed proxy task, we turn the groundtruth classification problem into regression, leading to much higher ranking stability and faster search.

2.1. GRAPH NEURAL NETWORKS

A graph is a data structure that defines a set of nodes and their relationships (Waikhom & Patgiri, 2021). It consists of a set of nodes V and a set of edges E, with optional labels y or features x attached to nodes, links, or the whole graph, and can therefore be represented as G = (V, E). Graph Neural Networks (GNN) are a type of Deep Neural Network (DNN) suited to analyzing graph-structured data. If we use $x_v$ and $x_{uv}$ to denote the feature vectors of node v and link (u, v), and $h_v$ and $h_{uv}$ to denote their hidden representations in the GNN, the GNN's message passing and update process can be described as:

$$h'_v = f_{node}\Big(h_v, \sum_{u \in N(v)} h_{uv}, x_v\Big) \tag{1}$$

$$h'_{uv} = f_{edge}(h_u, h_v, x_{uv})$$

where N(v) denotes the set of in-neighbor nodes of node v, and $f_{node}$ and $f_{edge}$ are message passing functions that gather information from a node's neighborhood and previous layers.

GNN can be classified according to variants of graphs, downstream tasks, learning methods and so on. The main types of graphs include undirected/directed graphs, heterogeneous graphs, dynamic graphs, attributed graphs and so on. Downstream tasks usually include classification and regression tasks, each of which can be subdivided into graph-level, link-level and node-level problems. In our work, we mainly focus on undirected graphs with graph-level tasks.
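To make the update rule concrete, the following is a minimal pure-Python sketch of one message-passing step. It assumes sum aggregation over incoming edges, and the functions `f_node` and `f_edge` are hypothetical stand-ins that simply add their inputs; real GNN layers would use learned transformations instead.

```python
# Toy sketch of one message-passing step in the spirit of Eq. (1).
# Assumptions (not from the paper): sum aggregation over in-neighbors,
# and hand-picked f_node / f_edge that simply add their inputs.

def vec_add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]

def f_edge(h_u, h_v, x_uv):
    # Hypothetical edge update: combine endpoint states and the edge feature.
    return vec_add(h_u, h_v, x_uv)

def f_node(h_v, agg_h_uv, x_v):
    # Hypothetical node update: own state + aggregated edge states + feature.
    return vec_add(h_v, agg_h_uv, x_v)

def message_passing_step(nodes, edges, h_node, h_edge, x_node, x_edge):
    """Return updated (h_node, h_edge) after one synchronous step."""
    dim = len(next(iter(h_node.values())))
    new_h_node = {}
    for v in nodes:
        # Sum the hidden states of all edges (u, v) pointing into v.
        incoming = [h_edge[(u, w)] for (u, w) in edges if w == v]
        agg = vec_add(*incoming) if incoming else [0.0] * dim
        new_h_node[v] = f_node(h_node[v], agg, x_node[v])
    new_h_edge = {
        (u, v): f_edge(h_node[u], h_node[v], x_edge[(u, v)]) for (u, v) in edges
    }
    return new_h_node, new_h_edge

# Tiny directed example: edges 0 -> 1 and 2 -> 1.
nodes = [0, 1, 2]
edges = [(0, 1), (2, 1)]
h_node = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
h_edge = {(0, 1): [0.5, 0.5], (2, 1): [1.0, 0.0]}
x_node = {v: [0.0, 0.0] for v in nodes}
x_edge = {e: [0.0, 0.0] for e in edges}
new_h_node, new_h_edge = message_passing_step(nodes, edges, h_node, h_edge, x_node, x_edge)
```

Only node 1 has in-neighbors here, so only its representation changes; the edge states are updated from the pre-update node states.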

2.2. GRAPH NEURAL ARCHITECTURE SEARCH AND GRAPHGYM

Graph Neural Architecture Search (GNN NAS) means automatically finding the best GNN models for targeted tasks (Oloulade et al., 2021). Like NAS for other neural networks, GNN NAS samples an architecture from a predefined search space, evaluates the performance of the sampled architecture, and returns this evaluation as feedback to the search algorithm. GNN NAS can be categorized according to search space, search algorithm (see Section 1) and performance evaluation (Oloulade et al., 2021). Recent studies on GNN NAS (Zhou et al., 2019; Gao et al., 2020; Tu et al., 2019; Shi et al., 2022; Huan et al., 2021; Li et al., 2021b; You et al., 2020) have not only achieved promising performance for many applications of GNNs but also shown potential as a general approach to constructing GNN models.

Among all GNN NAS frameworks, we introduce GraphGym (You et al., 2020) in detail. In GraphGym, a GNN design space consists of 12 design dimensions covering intra-layer design, inter-layer design and learning configuration. A single GNN layer has a sequence of modules: (1) linear layer $W^{(k)} h_u^{(k)} + b^{(k)}$; (2) batch normalization BN(·) (Ioffe & Szegedy, 2015); (3) dropout operation DROPOUT(·) (Srivastava et al., 2014); (4) nonlinear activation function ACT(·); (5) aggregation function AGG(·). Therefore, the k-th GNN layer can be defined as:

$$h_v^{(k+1)} = \mathrm{AGG}\Big(\Big\{\mathrm{ACT}\big(\mathrm{DROPOUT}\big(\mathrm{BN}(W^{(k)} h_u^{(k)} + b^{(k)})\big)\big),\ u \in N(v)\Big\}\Big)$$

where $h_v^{(k)}$ is the k-th layer embedding of node v, and $W^{(k)}$, $b^{(k)}$ are trainable weights.

For inter-layer design, GraphGym gives three ways to connect GNN layers: STACK (directly stack multiple GNN layers) (Welling & Kipf, 2016; Velickovic et al., 2017); SKIP-SUM (residual connections) (He et al., 2016) and SKIP-CAT (concatenate embeddings in all previous layers) (Huang et al., 2017). GraphGym also adds Multilayer Perceptron (MLP) layers before/after GNN message passing. Training configurations include batch size, learning rate, optimizer type and training epochs. An overview of GraphGym's design space and structure is shown in Fig 1. In our work, we use GraphGym as the basic framework and build our new NAS pipeline by modifying it.
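The three inter-layer connectivity options can be illustrated with a small sketch. This is not GraphGym's implementation: `toy_layer` is a hypothetical stand-in for a GNN layer that maps an input vector of any size to two outputs, chosen only so that the three wiring patterns can be compared on the same input.

```python
# Sketch of the three layer-connectivity patterns (STACK, SKIP-SUM, SKIP-CAT).
# toy_layer is a hypothetical placeholder, not an actual GNN layer.

def toy_layer(h):
    """Map a vector of any length to 2 outputs (ReLU of scaled sums)."""
    s = sum(h)
    return [max(0.0, s * w) for w in (0.5, -0.25)]

def stack(layers, h):
    """STACK: feed each layer's output directly into the next layer."""
    for layer in layers:
        h = layer(h)
    return h

def skip_sum(layers, h):
    """SKIP-SUM: add each layer's input to its output (residual connection)."""
    for layer in layers:
        out = layer(h)
        h = [a + b for a, b in zip(out, h)]
    return h

def skip_cat(layers, h):
    """SKIP-CAT: each layer sees the concatenation of all previous embeddings."""
    feats = [h]
    for layer in layers:
        concatenated = [x for f in feats for x in f]
        feats.append(layer(concatenated))
    return [x for f in feats for x in f]

layers = [toy_layer, toy_layer]
h0 = [1.0, 2.0]
out_stack = stack(layers, h0)        # length 2
out_skip_sum = skip_sum(layers, h0)  # length 2, input preserved via residuals
out_skip_cat = skip_cat(layers, h0)  # length 6: input + both layer outputs
```

Note that SKIP-SUM requires matching input and output widths, while SKIP-CAT grows the embedding width with depth.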

2.3. GENERIC NEURAL ARCHITECTURE SEARCH VIA REGRESSION (GENNAS)

Generic Neural Architecture Search (GenNAS) is a CNN- and RNN-based NAS framework that uses a self-supervised regression proxy task instead of classification for NAS (Li et al., 2021c). Compared to other NAS frameworks, it has several advantages: (1) by using regression as the self-supervised proxy task, it is agnostic to the specific downstream task; (2) it has near-zero training cost, which makes it highly efficient for neural architecture search. Here we mainly focus on the CNN regression architecture of GenNAS, as shown in Fig 2(a). GenNAS constructs a Fully Convolutional Network (FCN) (Long et al., 2015) by removing the final classifier of a CNN, and then extracts the FCN's intermediate feature maps from multiple stages. The number of stages is denoted as N. If the input tensor is I, each stage's feature map tensor is $F_i$, and the synthetic signal is $F_i^*$, the regression pipeline reshapes $F_i$ into $\hat{F}_i = M_i(F_i)$ and computes the MSE loss $L = \sum_{i=1}^{N} E[(F_i^* - \hat{F}_i)^2]$ during training. GenNAS ranks models according to their final MSE values and selects the model with the lowest MSE as the best. GenNAS uses Ranking Correlation (see Section 1) for NAS evaluation.
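As a rough illustration of the multi-stage regression loss, the sketch below sums a per-stage mean squared error over N stages. It treats each matched feature map and synthetic signal as a flat list of numbers, which is a simplification; the function names are hypothetical and not taken from the GenNAS code.

```python
# Sketch of L = sum_i E[(F*_i - F_hat_i)^2] on flattened features.
# Function names and the flat-list representation are illustrative assumptions.

def mse(pred, target):
    """Mean squared error between two equally sized flat feature lists."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def multi_stage_regression_loss(matched_features, synthetic_signals):
    """Sum the per-stage MSE between matched features and synthetic signals."""
    return sum(
        mse(f_hat, f_star)
        for f_hat, f_star in zip(matched_features, synthetic_signals)
    )

# Two stages with two values each.
matched = [[1.0, 2.0], [0.0, 0.0]]
signals = [[1.0, 0.0], [1.0, 1.0]]
loss = multi_stage_regression_loss(matched, signals)  # 2.0 + 1.0
```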

3.1. THE BARRIERS TO GENERATE PROXY TASK

As mentioned in Section 1, ReG-NAS uses a regression-based proxy task to search for GNN structures. However, unlike grid data such as images, for which a combination of signals with different frequencies but the same shape (i.e., data dimension) can serve as the proxy task, generating a proxy task for GNN is much harder. First, even within the same dataset, different graphs usually have different topological structures. Therefore, for graph datasets, we should generate a proxy task for each graph individually, and there is no so-called "global" signal (Li et al., 2021c) in the process of generating the proxy task. Second, although according to spectral graph theory (Shuman et al., 2013) any graph signal can be projected onto the eigenvectors of the Laplacian Matrix L, with the "frequency" of each eigenvector being the corresponding eigenvalue, we cannot simply use these vectors as our proxy task. The Laplacian Matrix only contains a graph's structural information, while many graphs also carry non-structural information such as node features and edge features. A direct linear combination of the Laplacian Matrix's eigenvectors would lose the graph's original information and weaken NAS performance. Therefore, our generating process should aggregate both structural and non-structural information into the proxy task.
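The point that the Laplacian carries only structural information can be seen in a small sketch. It builds the combinatorial Laplacian L = D - A of a 3-node path graph (an example chosen here, not taken from the paper) and checks three eigenvectors by direct multiplication. Node and edge features never enter the construction, so two graphs with the same topology but different features share the same eigenvectors.

```python
# Sketch: the combinatorial Laplacian L = D - A depends only on the edge list.
# The 3-node path graph is an illustrative example, not from the paper.

def laplacian(num_nodes, edges):
    """Combinatorial Laplacian of an undirected, unweighted graph."""
    lap = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        lap[u][u] += 1   # degree contribution
        lap[v][v] += 1
        lap[u][v] -= 1   # adjacency contribution
        lap[v][u] -= 1
    return lap

def matvec(matrix, x):
    """Matrix-vector product for nested lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in matrix]

path_edges = [(0, 1), (1, 2)]
L = laplacian(3, path_edges)

# Eigenvectors verified by multiplication: L v = lambda v.
low = matvec(L, [1, 1, 1])    # eigenvalue 0: constant ("zero-frequency") signal
mid = matvec(L, [1, 0, -1])   # eigenvalue 1
high = matvec(L, [1, -2, 1])  # eigenvalue 3: fastest-varying signal
```

Larger eigenvalues correspond to signals that change more sharply across neighboring nodes, which is the sense of "frequency" used above.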

3.2. PROPOSED REG-NAS PIPELINE

In ReG-NAS, we propose a simple yet effective proxy task generator for graph datasets, as shown in Fig 2(b). In a given search space where the number of post-process layers equals 2, we first randomly select a GNN architecture, then train this architecture on the dataset for k epochs. Finally, we use this well-trained model for inference and extract the model's hidden node feature from the first post-process layer as our proxy task $F_p$. We set the number of post-process layers to 2 to make sure that the proxy tasks of all graphs have the same shape. For example, if the original node feature of the input graph G is $I \in \mathbb{R}^{n \times d_0}$ (n is the number of nodes), and the node feature in message passing layer i is $F_i \in \mathbb{R}^{n \times d_1}$, the first post-process layer uniformly reshapes $F_i$ into $F_p \in \mathbb{R}^{1 \times d_p}$, and $F_p$ is the proxy task for the graph.

Our way of generating the proxy task does not require knowing the performance ranking of the models, not even for a subset of the search space. As the final goal of NAS is to rank the relative performance of GNNs, the generator's own performance does not affect the final result as long as the generated proxy task is informative (see Section 4.1.2). In fact, the experiments discussed later show that the choice of generator model has little effect on the final results. This differs from GenNAS (Li et al., 2021c): in GenNAS, before generating the proxy task, we need to know the relative performance ranking of a subset (e.g., 20) of the neural architectures in the NAS search space, which is a strong assumption and thus impractical when applied to new datasets or tasks.
After the proxy task is generated, we attach it to the graph dataset. In the regression training process, ReG-NAS extracts the node feature $F_i$ from each Message Passing layer and reshapes it into $\hat{F}_i \in \mathbb{R}^{1 \times d_p}$ to match the shape of $F_p$. We use a pooling method (SUM, MEAN, MAX) to reshape $F_i$. The evaluation metric is the final MSE loss between $\hat{F}_i$ and $F_p$. In the end, we compute the Ranking Correlation (Spearman's ρ and Kendall's τ) between the proxy (regression-based) ranking and the groundtruth (classification-based or regression-based) ranking. In summary, the whole ReG-NAS process consists of the following steps:

1. Randomly select a GNN model from the search space as the proxy task generator $GNN_p$;
2. Generate the proxy task $F_p \in \mathbb{R}^{1 \times d_p}$;
3. Select a model to be evaluated, $GNN_e$, from the search space;
4. Extract each Message Passing layer's node feature $F_i \in \mathbb{R}^{n \times d_1}$, reshape it into $\hat{F}_i \in \mathbb{R}^{1 \times d_p}$, and compute the MSE loss $L = \sum_{i=1}^{N} E[(\hat{F}_i - F_p)^2]$;
5. Evaluate and rank the model's performance according to the final value of L.

To fully evaluate the performance of ReG-NAS and to analyze the factors that affect GNN ranking stability, we conduct 3 types of experiments: ranking stability analysis, effectiveness evaluation and efficiency evaluation. In these experiments we use ogbg-molhiv (Hu et al., 2020) as the classification-based dataset and ZINC (Gómez-Bombarelli et al., 2018) as the regression-based dataset. The basic information of these datasets and the hyper-parameter configurations are listed in Table 1.
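Steps 4 and 5 of the pipeline summarized earlier in this section can be sketched as follows. This is a simplified illustration rather than the released implementation: it assumes the pooled feature width already equals $d_p$, so no extra projection is needed, and the candidate names `GNN_A` and `GNN_B` with their feature values are made up for the example.

```python
# Sketch of proxy scoring: pool node features per layer, compare with F_p,
# and rank candidates by the summed MSE. Assumes d_1 == d_p (no projection).

def pool(node_features, method="MEAN"):
    """Pool an n x d node-feature matrix into one 1 x d graph-level vector."""
    columns = list(zip(*node_features))
    if method == "SUM":
        return [sum(c) for c in columns]
    if method == "MEAN":
        return [sum(c) / len(c) for c in columns]
    if method == "MAX":
        return [max(c) for c in columns]
    raise ValueError(f"unknown pooling method: {method}")

def proxy_loss(layer_features, proxy_target, method="MEAN"):
    """Sum over message-passing layers of the MSE between pooled F_i and F_p."""
    total = 0.0
    for f_i in layer_features:
        f_hat = pool(f_i, method)
        total += sum((a - b) ** 2 for a, b in zip(f_hat, proxy_target)) / len(proxy_target)
    return total

def rank_models(losses):
    """Order candidate names from lowest (best) to highest proxy loss."""
    return sorted(losses, key=losses.get)

# Hypothetical proxy target and per-layer node features (2 nodes, 1 layer each).
f_p = [1.0, 2.0]
candidates = {
    "GNN_A": [[[1.0, 2.0], [1.0, 2.0]]],
    "GNN_B": [[[0.0, 0.0], [2.0, 2.0]]],
}
losses = {name: proxy_loss(feats, f_p) for name, feats in candidates.items()}
ranking = rank_models(losses)
```

The resulting proxy ranking would then be compared with the groundtruth ranking using Spearman's ρ and Kendall's τ.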
Our search space contains 216 GNN models, as shown in Table 2. All hyper-parameters (base learning rate, optimizer, batch size etc.) are optimized to ensure that the model converges at an optimal rate. In our experiments, the learning rate is annealed to 0 via cosine decay in order to reduce the variance between multiple independent training runs (Loshchilov & Hutter, 2016). For ZINC, we use a subset containing 12,000 graphs (10,000 training graphs, 1,000 test graphs and 1,000 validation graphs) to reduce training cost. We set the number of proxy training epochs to 80 because we find that the loss of all models converges at about 80 epochs, so there is no need to train for an extra 20 epochs.
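For reference, a cosine decay schedule without restarts can be written in a few lines. The base learning rate of 0.01 and the 100-epoch horizon used below are illustrative values, not the settings from Table 1.

```python
import math

# Sketch of cosine learning-rate decay to a minimum value (0 by default).
# base_lr and total_steps below are illustrative, not the paper's settings.

def cosine_decay_lr(step, total_steps, base_lr, min_lr=0.0):
    """Learning rate at `step`, annealed from base_lr to min_lr along a half cosine."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

schedule = [cosine_decay_lr(epoch, 100, 0.01) for epoch in range(101)]
```

The schedule starts at the base rate, passes through half of it at the midpoint, and reaches the minimum at the final step.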

4.1. GNN RANKING STABILITY ANALYSIS

In this section we aim to find the factors that affect GNN ranking stability and to evaluate the stability of our proposed NAS pipeline. We first test GNN ranking stability on two groundtruth tasks (a classification task on ogbg-molhiv and a regression task on ZINC), and then test ranking stability on our NAS pipeline. We also discuss how the choice of proxy task affects the proxy ranking stability. For a single experiment, we repeat the training and evaluation of all architectures 3 times, then compute the Ranking Stability among them. Different repetitions of the same task use different initializations. The Ranking Stability analysis between two repetitions of the same experiment is shown in Fig 3. The x-axis represents the ranking in the i-th experiment, and the y-axis represents the ranking in the j-th experiment. For example, if a model ranks 3rd in the i-th experiment and 5th in the j-th experiment, then the model's coordinate is (3, 5). We perform the Ranking Stability analysis for every pair of experiments and place all points on the same figure, represented as a heat map. The heat map shows the density of the points and therefore reveals the pattern of a task's Ranking Stability for a training pipeline with a given configuration. We also compute ρ and τ from the heat map.
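A minimal sketch of how Ranking Stability could be computed from repeated runs is shown below. It uses the textbook formulas for Spearman's ρ and Kendall's τ, assumes no tied scores, and averages over all pairs of repetitions; the averaging and the model names and scores are assumptions made for this illustration.

```python
from itertools import combinations

# Sketch: pairwise Spearman's rho and Kendall's tau between repeated runs.
# Assumes no tied scores; model names and scores are made up.

def to_ranks(scores, higher_is_better=True):
    """Map {model: score} to {model: rank}, where rank 1 is the best model."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {model: rank for rank, model in enumerate(ordered, start=1)}

def spearman_rho(rank_a, rank_b):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) for tie-free rankings."""
    n = len(rank_a)
    d2 = sum((rank_a[m] - rank_b[m]) ** 2 for m in rank_a)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def kendall_tau(rank_a, rank_b):
    """tau = (concordant - discordant) / number of model pairs."""
    models = list(rank_a)
    concordant = discordant = 0
    for m1, m2 in combinations(models, 2):
        sign = (rank_a[m1] - rank_a[m2]) * (rank_b[m1] - rank_b[m2])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(models)
    return (concordant - discordant) / (n * (n - 1) / 2)

def ranking_stability(repetitions, higher_is_better=True):
    """Average rho and tau over every pair of repetitions of the same task."""
    ranks = [to_ranks(r, higher_is_better) for r in repetitions]
    pairs = list(combinations(ranks, 2))
    rho = sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)
    tau = sum(kendall_tau(a, b) for a, b in pairs) / len(pairs)
    return rho, tau

# Two repetitions of a hypothetical 4-model search space (higher score = better).
run_1 = {"A": 0.80, "B": 0.78, "C": 0.75, "D": 0.70}
run_2 = {"A": 0.81, "B": 0.74, "C": 0.76, "D": 0.69}
rho, tau = ranking_stability([run_1, run_2])
```

In this example only models B and C swap places between the two runs, so both coefficients stay well above zero but below 1.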

4.1.1. GROUNDTRUTH RANKING AND PROXY RANKING STABILITY ANALYSIS

When we keep the default ranking metrics for the groundtruth and proxy rankings and use the proposed method to generate the proxy task, as shown in Fig 3(a)-3(d), regression-based rankings are clearly more stable than classification-based rankings, regardless of whether the groundtruth or the proxy training pipeline is used. This phenomenon may be due to the choice of downstream task as well as the ranking metric. For the classification task, the variable directly optimized during training is the loss value, while the final ranking metric is ROC-AUC (for ogbg-molhiv). For the regression task, the variable directly optimized during training is still the loss value, but the final ranking metric is MAE (for ZINC). An intuitive explanation is that network performance on a regression-based task is evaluated directly on the regression loss, whereas network performance on a classification-based task is evaluated on ROC-AUC, which is an indirect metric. To further validate this hypothesis, on ogbg-molhiv we rank models according to their final loss value and analyze the Ranking Stability, as shown in Fig 3(e). Compared to the "ROC-AUC-based" ranking, the ρ and τ of the "loss-based" ranking are much better, which supports our hypothesis.
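One way to see why an indirect metric can decouple from the optimized loss is a small constructed example. The sketch below computes ROC-AUC and binary cross-entropy for two hypothetical sets of predictions on the same four labels; the numbers are invented for illustration and do not reproduce any experiment in this paper.

```python
import math

# Constructed example: ROC-AUC and cross-entropy loss can order two models
# differently. All labels and predictions below are made up.

def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bce_loss(labels, scores):
    """Mean binary cross-entropy; scores must lie strictly between 0 and 1."""
    total = sum(
        y * math.log(s) + (1 - y) * math.log(1 - s) for y, s in zip(labels, scores)
    )
    return -total / len(labels)

labels = [1, 1, 0, 0]
model_a = [0.9, 0.4, 0.5, 0.1]    # confident, but one positive ranked below a negative
model_b = [0.6, 0.55, 0.5, 0.45]  # perfectly ordered, but barely confident

auc_a, auc_b = roc_auc(labels, model_a), roc_auc(labels, model_b)
loss_a, loss_b = bce_loss(labels, model_a), bce_loss(labels, model_b)
```

Here the second model is better by ROC-AUC but worse by loss, so a ranking based on ROC-AUC need not agree with a ranking based on the quantity that training actually minimizes.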

4.1.2. THE RELATIONSHIP BETWEEN THE CHOICE OF PROXY TASK AND PROXY TASK'S RANKING STABILITY

Although all the experiments above show that regression-based rankings are more stable, this may not hold in all cases. For example, if we use randomly generated vectors as the proxy task and use it to rank models, its ρ is only 0.424, even though it is a regression-based ranking, as Fig 3(f) shows. Many points fall in the upper-left and lower-right corners of the figure, which means that the relative rankings of these models differ greatly between the two repeated experiments. This is an anomaly even compared with the classification cases. To explain this phenomenon, we need to consider how deep learning optimization works. Deep learning optimization tries to globally optimize a function using local gradient information (Bottou & Bousquet, 2007), which means that if a learning problem is characterized by non-informative gradients, no deep learning architecture will be able to learn it. Randomly generated vectors clearly cannot provide informative gradients, so it is not surprising that a model's ranking changes drastically across repetitions. Therefore, the conclusion that regression-based rankings are more stable rests on the premise that the learning problem is informative and reasonable.

4.2. REG-NAS PERFORMANCE EVALUATION

4.2.1. EFFECTIVENESS OF REG-NAS

(3) PM-based pipeline. For each groundtruth task and proxy task, we repeat the experiment three times and compute the average value of Spearman's ρ and Kendall's τ among them as the final results, as shown in Table 3.

(3) The choice of model as proxy task generator has little effect on the final results, regardless of the type of groundtruth task. For ogbg-molhiv, the difference between the ρ of the GM-based pipeline and that of the PM-based pipeline is 0.011, which is within the fluctuation range of ρ (see Section 4.2.2); for ZINC, the PM-based pipeline's ρ is even higher than the GM-based pipeline's, which further supports that we do not have to select the "Golden" model as task generator.

To sum up, ReG-NAS is a downstream-agnostic (applicable to both classification and regression tasks), stable, effective and efficient GNN NAS method. By using ReG-NAS, we can approximate the relative performance of GNNs in a short time, which is especially useful when searching for GNNs in a large search space.

5. CONCLUSION





Figure 1: Structure and Design Space of GNN


Figure 2: Overview of the GenNAS structure (a) and the ReG-NAS structure (b). For ReG-NAS, the figure only shows the case where the groundtruth task is classification. ReG-NAS is also applicable to regression-based groundtruth tasks, for which the architecture search pipeline is almost the same.

Figure 3: Ranking Stability analysis of groundtruth and proxy tasks. (a) Classification-based groundtruth task's Ranking Stability on ogbg-molhiv (ROC-AUC as ranking metric); (b) Regression-based groundtruth task's Ranking Stability on ZINC (MAE as ranking metric); (c) Proxy task's Ranking Stability on ogbg-molhiv (MSE as ranking metric); (d) Proxy task's Ranking Stability on ZINC (MSE as ranking metric); (e) Classification-based groundtruth task's Ranking Stability on ogbg-molhiv (loss as ranking metric); (f) Proxy task's Ranking Stability on ogbg-molhiv (randomly generated vectors as proxy task).

Figure 5: ReG-NAS convergence speed analysis. (a) Spearman's ρ convergence curve, ρ is the Ranking Correlation between proxy ranking at epoch n and final groundtruth ranking at epoch 100. (b) ReG-NAS speedup compared to Groundtruth training time.

Table 1: Basic information and hyper-parameter configurations for ogbg-molhiv and ZINC

Table 2: GNN search space for experiments

As mentioned before, for the proxy training pipeline, the learning problem should be informative and reasonable (Section 4.1.2). Therefore, randomly generated vectors cannot be used as the proxy task. Meanwhile, from Section 3.1 we know that the eigenvectors of the Laplacian Matrix do not contain a graph's non-structural information, which is the key reason for the poor performance of the LE pipeline. (2) The RM/GM/PM-based pipelines reach high Spearman's ρ and Kendall's τ, and can therefore be used to approximate the performance of GNN structures.

In this work, we propose ReG-NAS, a GNN NAS method that uses a regression proxy task. It has several advantages: (1) Stable. The Ranking Stability (Spearman's ρ) between two repetitions can reach up to 0.99; (2) Downstream-agnostic. It is applicable to both classification and regression tasks; (3) Effective. It has high proxy-groundtruth ranking similarity, which can serve as a reference for the relative performance of GNNs; (4) Efficient. Compared to traditional NAS search methods, it can save up to 76.2% of training time. In addition, for the first time, we analyze the factors that affect GNN ranking stability, which provides new insight into designing a stable GNN.

