TUNEUP: A TRAINING STRATEGY FOR IMPROVING GENERALIZATION OF GRAPH NEURAL NETWORKS

Anonymous authors
Paper under double-blind review

Abstract

Despite many advances in Graph Neural Networks (GNNs), their training strategies remain simplistic, focusing only on minimizing a loss over nodes in a graph. However, such simplistic training strategies may be sub-optimal, as they neglect that certain nodes are much harder to make accurate predictions on than others. Here we present TUNEUP, a curriculum learning strategy for better training GNNs. Crucially, TUNEUP trains a GNN in two stages. The first stage aims to produce a strong base GNN. Such base GNNs tend to perform well on head nodes (nodes with large degrees) but less so on tail nodes (nodes with small degrees). So, the second stage of TUNEUP specifically focuses on improving prediction on tail nodes. Concretely, TUNEUP synthesizes additional supervised data for tail nodes by dropping edges from head nodes and reusing the supervision on the original head nodes. TUNEUP then minimizes the loss over the synthetic tail nodes to finetune the base GNN. TUNEUP is a general training strategy that can be used with any GNN architecture and any loss, making it applicable to a wide range of prediction tasks. Extensive evaluation of TUNEUP on five diverse GNN architectures, three types of prediction tasks, and both inductive and transductive settings shows that TUNEUP significantly improves the performance of the base GNN on tail nodes, while often even improving the performance on head nodes, which together leads to up to a 58.5% relative improvement in GNN predictive performance. Moreover, TUNEUP significantly outperforms its variants without the two-stage curriculum learning, existing graph data augmentation techniques, as well as other specialized methods for tail nodes.
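The core augmentation described above, synthesizing tail nodes by dropping edges from head nodes while reusing their original labels, can be illustrated with a minimal sketch. The function name, adjacency-list representation, and drop ratio below are illustrative assumptions for exposition, not the paper's actual implementation:

```python
import random

def synthesize_tail_nodes(adj, drop_ratio=0.5, seed=0):
    """Simulate tail nodes by randomly dropping a fraction of each node's
    edges (a DropEdge-style perturbation). Supervision (labels) on the
    original nodes is reused unchanged; only the input graph is sparsified.
    `adj` maps each node id to a list of neighbor ids (hypothetical format).
    """
    rng = random.Random(seed)
    sparse_adj = {}
    for node, neighbors in adj.items():
        # Keep each edge with probability (1 - drop_ratio), so a head node
        # with many neighbors now looks like a tail node to the GNN.
        kept = [n for n in neighbors if rng.random() >= drop_ratio]
        sparse_adj[node] = kept
    return sparse_adj

# Toy graph: node 0 is a head node with many neighbors.
adj = {0: [1, 2, 3, 4, 5, 6, 7, 8], 1: [0], 2: [0]}
sparse = synthesize_tail_nodes(adj, drop_ratio=0.5)
```

In the second stage of TUNEUP, the finetuning loss would be computed on the GNN's predictions over such a sparsified graph, while the prediction targets stay those of the original (un-dropped) nodes; the sketch above shows only the graph-sparsification step, applied per-node for simplicity rather than over a global edge list.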

1. INTRODUCTION

Graph Neural Networks (GNNs) are one of the most successful and widely used paradigms for representation learning on graphs, achieving state-of-the-art performance in a variety of prediction tasks, such as semi-supervised node classification (Kipf & Welling, 2017; Velickovic et al., 2018), link prediction (Hamilton et al., 2017; Kipf & Welling, 2016), and recommender systems (Ying et al., 2018; He et al., 2020). There has been a surge of work on improving GNN model architectures (Velickovic et al., 2018; Xu et al., 2019; 2018; Shi et al., 2020; Klicpera et al., 2019; Wu et al., 2019; Zhao & Akoglu, 2019; Li et al., 2019; Chen et al., 2020; Li et al., 2021) and task-specific losses (Kipf & Welling, 2016; Rendle et al., 2012; Verma et al., 2021; Huang et al., 2021). Despite all these advances, strategies for training a GNN on a given supervised loss remain largely simplistic: existing work has focused on simply minimizing the given loss over the nodes in a graph. While this default strategy already yields strong performance, it may still be sub-optimal, as it neglects that some nodes are much harder to make accurate predictions on than others. Consequently, a GNN trained with the default strategy may significantly under-perform on those hard nodes, resulting in sub-optimal overall predictive performance.

Here we present TUNEUP to better train a GNN on a given supervised loss. The key motivation behind TUNEUP is that GNNs tend to under-perform on tail nodes, i.e., nodes with a small number of neighbors (Liu et al., 2021). In practice, performing well on tail nodes is important since they are prevalent in real-world scale-free graphs (Clauset et al., 2009) and among newly-arriving cold-start nodes (Lika et al., 2014). To better train a GNN on those hard-to-predict tail nodes, the key idea of TUNEUP is to use a curriculum learning strategy (Bengio et al., 2009): TUNEUP first trains a GNN

