METAGL: EVALUATION-FREE SELECTION OF GRAPH LEARNING MODELS VIA META-LEARNING

Abstract

Given a graph learning task, such as link prediction, on a new graph, how can we select the best method as well as its hyperparameters (collectively called a model) without having to train or evaluate any model on the new graph? Model selection for graph learning has been largely ad hoc. A typical approach has been to apply popular methods to new datasets, but this is often suboptimal. On the other hand, systematically comparing models on the new graph quickly becomes too costly, or even impractical. In this work, we develop the first meta-learning approach for evaluation-free graph learning model selection, called METAGL, which utilizes the prior performances of existing methods on various benchmark graph datasets to automatically select an effective model for the new graph, without any model training or evaluations. To quantify similarities across a wide variety of graphs, we introduce specialized meta-graph features that capture the structural characteristics of a graph. Then we design G-M network, which represents the relations among graphs and models, and develop a graph-based meta-learner operating on this G-M network, which estimates the relevance of each model to different graphs. Extensive experiments show that using METAGL to select a model for the new graph greatly outperforms several existing meta-learning techniques tailored for graph learning model selection (up to 47% better), while being extremely fast at test time (∼1 sec).

1. INTRODUCTION

Given a graph learning (GL) task, such as link prediction, for a new graph dataset, how can we select the best method as well as its hyperparameters (HPs) (collectively called a model) without performing any model training or evaluations on the new graph? GL has received increasing attention recently (Zhang et al., 2022), achieving successes across various applications, e.g., recommendation and ranking (Fan et al., 2019; Park et al., 2020), traffic forecasting (Jiang & Luo, 2021), bioinformatics (Su et al., 2020), and question answering (Park et al., 2022). However, as GL methods continue to be developed, it becomes increasingly difficult to determine which model to use for the given graph. Model selection (i.e., selecting a method and its configuration, such as HPs) for graph learning has been largely ad hoc to date. A typical approach, called "no model selection", is to simply apply popular methods to new graphs, often with the default HP values. However, it is well known that there is no universal learning algorithm that performs best on all problem instances (Wolpert & Macready, 1997), and such one-size-fits-all model selection is often suboptimal. At the other extreme lies "naive model selection" (Fig. 1b), where all candidate models are trained on the new graph, evaluated on a hold-out validation graph, and then the best-performing model for the new graph is selected. This approach is very costly, as all candidate models are trained whenever a new graph arrives. Recent methods on neural architecture search (NAS) and hyperparameter optimization (HPO) of GL methods, which we review in Section 3, adopt smarter and more efficient strategies, such as Bayesian optimization (Snoek et al., 2012; Tu et al., 2019), which carefully choose a relatively small number of HP settings to evaluate. However, they still need to evaluate multiple configurations of each GL method on the new graph.
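The "naive model selection" baseline above can be sketched as follows. This is a minimal illustration of its cost structure (one full training run per candidate); the `DummyModel` class and the `fit`/`evaluate` interface are hypothetical stand-ins for real GL models, not part of any actual library.

```python
class DummyModel:
    """Hypothetical stand-in for a GL model with an assumed fit/evaluate API."""
    def __init__(self, name, quality):
        self.name, self.quality = name, quality

    def fit(self, graph):
        pass  # a real model would run full training here (the costly step)

    def evaluate(self, graph):
        return self.quality  # a real model would return, e.g., link-prediction AUC

def naive_model_selection(candidates, train_graph, val_graph):
    """Exhaustive baseline: train and evaluate every candidate, keep the best.

    Accurate, but its cost grows linearly with the number of candidate models,
    since each one is trained from scratch whenever a new graph arrives.
    """
    best, best_score = None, float("-inf")
    for model in candidates:
        model.fit(train_graph)            # one full training run per candidate
        score = model.evaluate(val_graph)
        if score > best_score:
            best, best_score = model, score
    return best, best_score
```

Evaluation-free selection, introduced next, avoids exactly this per-candidate training loop.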
Evaluation-free model selection is yet another paradigm, which aims to tackle the limitations of the above approaches by simultaneously achieving the speed of no model selection and the accuracy of exhaustive model selection. The high-level idea of meta-learning based model selection is to estimate a candidate model's performance on the new graph based on its observed performances on similar graphs. Our meta-learning problem for graph data presents a unique challenge: how to model graph similarities, and which characteristic features (i.e., meta-features) of a graph to consider. Note that this step is often not needed for traditional meta-learning problems on non-graph data, as features for non-graph objects (e.g., the location or age of users) are often readily available. Also, the high complexity and irregularity of graphs (e.g., different numbers of nodes and edges, and widely varying connectivity patterns across graphs) make the task even more challenging. To handle these challenges, we design specialized meta-graph features that characterize major structural properties of real-world graphs. Then, to estimate the performance of a candidate model on a given graph, METAGL learns to embed models and graphs in a shared latent space such that their embeddings reflect the graph-to-model affinity. Specifically, we design a multi-relational graph called the G-M network, which captures multiple types of relations among models and graphs, and develop a meta-learner operating on this G-M network, based on an attentive graph neural network that is optimized to leverage meta-graph features and prior model performance to produce model and graph embeddings that can effectively estimate the best-performing model for the new graph. METAGL greatly outperforms existing meta-learners in GL model selection (Fig. 1c). In sum, the key contributions of this work are as follows.
• Problem Formulation. We formulate the problem of selecting effective GL models in an evaluation-free manner (i.e., without ever having to train/evaluate any model on the new graph). To the best of our knowledge, we are the first to study this important problem.
• Meta-Learning Framework and Features. We propose METAGL, the first meta-learning framework for evaluation-free GL model selection. For meta-learning on various graphs, we design meta-graph features that quantify graph similarities by capturing structural characteristics of a graph.
• Effectiveness. Using METAGL for GL model selection greatly outperforms existing meta-learning techniques (up to 47% better, Fig. 1c), with negligible runtime overhead at test time (∼1 sec).
• Benchmark Data/Code. To facilitate further research on this important new problem, we release code and data at https://github.com/NamyongPark/MetaGL, including performances of 400+ models on 300+ graphs, and 300+ meta-graph features.
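To make the high-level idea concrete, here is a deliberately minimal k-nearest-neighbor sketch of meta-learning based model selection: estimate each model's score on the new graph by averaging its observed performance on the benchmark graphs most similar in meta-feature space. This is not METAGL's attentive GNN meta-learner, and the three toy meta-features below are placeholders for the 300+ meta-graph features released with the benchmark.

```python
import math

def meta_graph_features(num_nodes, num_edges, max_degree):
    """Toy meta-features (illustrative only): log-size, edge density,
    and normalized maximum degree of an undirected graph."""
    density = 2 * num_edges / (num_nodes * (num_nodes - 1))
    return [math.log(num_nodes), density, max_degree / num_nodes]

def select_model(new_feats, bench_feats, perf_matrix, k=2):
    """k-NN meta-learner sketch: average each model's observed performance
    over the k benchmark graphs whose meta-features are closest to the
    new graph's, and return the index of the top-scoring model.

    perf_matrix[g][m] is the recorded performance of model m on graph g.
    """
    dists = sorted(
        (math.dist(new_feats, f), i) for i, f in enumerate(bench_feats)
    )
    neighbors = [i for _, i in dists[:k]]
    n_models = len(perf_matrix[0])
    scores = [sum(perf_matrix[g][m] for g in neighbors) / k
              for m in range(n_models)]
    return max(range(n_models), key=scores.__getitem__)
```

Note that nothing is trained or evaluated on the new graph itself: only its meta-features are computed, which is what makes test-time selection nearly instantaneous.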



Figure 1: METAGL infers the best model with no model training/evaluation. (a) Given an unseen graph G and a large space M of models to search over, METAGL efficiently infers the best model M* ∈ M without having to train a single model from M on the new graph G. (b) Existing approaches, by contrast, need to train and evaluate multiple models Mi ∈ M to be able to select the best one. (c) Given observed performances with varying sparsity levels, METAGL consistently outperforms existing meta-learners, with up to 47% better model selection performance.

Recently, a seminal work by Zhao et al. (2021) proposed a technique for outlier detection (OD) model selection, which carries over the observed performance of OD methods on benchmark datasets for selecting OD methods. However, it does not address the unique challenges of GL model selection, and cannot be directly used to solve the problem. Inspired by this work, we systematically tackle the model selection problem for graph learning, especially

