TRANSFER LEARNING OF GRAPH NEURAL NETWORKS WITH EGO-GRAPH INFORMATION MAXIMIZATION

Abstract

Graph neural networks (GNNs) have shown superior performance in various applications, but training dedicated GNNs can be costly for large-scale graphs. Some recent work has started to study the pre-training of GNNs. However, none of it provides theoretical insights into the design of their frameworks, or clear requirements and guarantees towards the transferability of GNNs. In this work, we establish a theoretically grounded and practically useful framework for the transfer learning of GNNs. Firstly, we propose a novel view towards the essential graph information and advocate capturing it as the goal of transferable GNN training, which motivates the design of EGI (ego-graph information maximization) to analytically achieve this goal. Secondly, we specify the requirement of structure-respecting node features as the GNN input, and conduct a rigorous analysis of GNN transferability based on the difference between the local graph Laplacians of the source and target graphs. Finally, we conduct controlled synthetic experiments to directly justify our theoretical conclusions. Extensive experiments on real-world networks towards role identification show consistent results in the rigorously analyzed setting of direct-transferring (freezing parameters), while those towards large-scale relation prediction show promising results in the more generalized and practical setting of transferring with fine-tuning.

1. INTRODUCTION

Graph neural networks (GNNs) have been intensively studied recently (Kipf & Welling, 2017; Keriven & Peyré, 2019; Chen et al., 2019; Oono & Suzuki, 2020; Huang et al., 2018), due to their established performance on various real-world tasks (Hamilton et al., 2017; Ying et al., 2018b; Velickovic et al., 2018), as well as their close connections to spectral graph theory (Defferrard et al., 2016; Bruna et al., 2014; Hammond et al., 2011). While most GNN architectures are not very complicated, the training of GNNs can still be costly in terms of both memory and computation on real-world large-scale graphs (Chen et al., 2018; Ying et al., 2018a). Moreover, it is intriguing to transfer learned structural information across different graphs and even domains in settings like few-shot learning (Vinyals et al., 2016; Finn et al., 2017; Ravi & Larochelle, 2017). Therefore, several very recent studies have been conducted on the transferability of GNNs, focusing on the setting of pre-training plus fine-tuning (Hu et al., 2019a;b; 2020; Wu et al., 2020). However, it is unclear in what situations these models will excel or fail, especially when the pre-training and fine-tuning tasks are different. To provide rigorous analysis of and guarantees on the transferability of GNNs, we focus on the setting of direct-transferring between the source and target graphs, under a setting analogous to "domain adaptation" (Ben-David et al., 2007). In this work, we establish a theoretically grounded framework for the transfer learning of GNNs, and leverage it to design a practically transferable GNN model. Figure 1 gives an overview of our framework. It is based on a novel view of a graph as samples from the joint distribution of its k-hop ego-graph structures and node features, which allows us to define graph information and similarity, so as to analyze GNN transferability (§2).
This view motivates us to design EGI, a novel GNN model based on ego-graph information maximization, which is effective in capturing the graph information as we define it (§2.1). We then further specify the requirement on transferable node features and analyze the transferability of EGI as a function of the local graph Laplacians of the source and target graphs (§2.2). All of our theoretical conclusions have been directly validated through controlled synthetic experiments (Table 1), where we use structurally equivalent role identification in a direct-transferring setting to analyze the impacts of different model designs, node features and source-target structure similarities on GNN transferability. In §3, we conduct real-world experiments on multiple publicly available network datasets. On the Airport and Gene graphs (§3.1), we closely follow the settings of our synthetic experiments and observe consistent but more detailed results supporting the design of EGI and the utility of our theoretical analysis. On the YAGO graphs (§3.2), we further evaluate EGI in the more generalized and practical setting of transfer learning with task-specific fine-tuning. We find our theoretical insights still indicative in such scenarios, where EGI consistently outperforms state-of-the-art GNN models and transfer learning frameworks by significant margins.
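To make the ego-graph information maximization idea concrete, the following is a deliberately simplified toy sketch (not the paper's exact objective) of the general InfoMax recipe behind such models: a discriminator scores each node embedding paired with a summary of its own ego-graph (positive pair) against the same embedding paired with a corrupted, shuffled summary (negative pair), trained with a binary cross-entropy loss. All names and the bilinear discriminator form are illustrative assumptions.

```python
import numpy as np

# Simplified InfoMax-style sketch (assumed, not the paper's implementation):
# score true (node embedding, ego-graph summary) pairs against corrupted ones.
rng = np.random.RandomState(0)
n, d = 8, 4
H = rng.randn(n, d)              # node embeddings from some GNN encoder (stand-in)
G = H + 0.1 * rng.randn(n, d)    # stand-in summaries of each node's ego-graph
G_neg = G[rng.permutation(n)]    # corrupted pairing: shuffled ego-graph summaries
W = rng.randn(d, d)              # bilinear discriminator parameters

def score(h, g, W):
    """Sigmoid of a bilinear score h^T W g, i.e. the discriminator output."""
    return 1.0 / (1.0 + np.exp(-h @ W @ g))

pos = np.array([score(H[i], G[i], W) for i in range(n)])
neg = np.array([score(H[i], G_neg[i], W) for i in range(n)])

# Binary cross-entropy: push positive scores toward 1, negative toward 0.
loss = -(np.log(pos + 1e-9).mean() + np.log(1.0 - neg + 1e-9).mean())
print(float(loss) > 0)
```

In an actual model of this kind, `H`, the ego-graph summaries, and `W` would all be produced or trained jointly by gradient descent; the toy version only illustrates the shape of the objective.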

2. TRANSFERABLE GRAPH NEURAL NETWORKS

Based on the connection between GNNs and spectral graph theory (Kipf & Welling, 2017), we describe the output of a GNN as a combination of its input node features, fixed graph Laplacian and learnable graph filters. The goal of training a GNN is then to improve its utility by learning graph filters that are compatible with the other two components towards specific tasks. In the graph transfer learning setting, where downstream tasks are often unknown during pre-training, we argue that the general utility of a GNN should be optimized and quantified w.r.t. its ability to capture the essential graph information in terms of the joint distribution of its link structures and node features, which motivates us to design a novel ego-graph information maximization model (EGI) (§2.1). The general transferability of a GNN is then quantified by the gap between its abilities to model the source and target graphs. Under reasonable requirements such as using structure-respecting node features as the GNN input, we analyze this gap for EGI based on the structural difference between two graphs w.r.t. their local graph Laplacians (§2.2).
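The decomposition above (fixed structure, input features, learnable filter) can be sketched with a single GCN-style layer in the spirit of Kipf & Welling (2017). This is a generic illustration of the spectral view, not the paper's own model; the symmetric normalization and the ReLU are standard GCN choices assumed here.

```python
import numpy as np

def normalized_propagation(A):
    """Fixed structural component: D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(A, X, W):
    """One layer combining fixed structure (A), input features (X),
    and a learnable graph filter (W), followed by ReLU."""
    return np.maximum(normalized_propagation(A) @ X @ W, 0.0)

# Toy 4-node path graph with 2-d node features and a 2x2 filter.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.RandomState(0).randn(4, 2)
W = np.random.RandomState(1).randn(2, 2)
H = gcn_layer(A, X, W)
print(H.shape)  # (4, 2)
```

Training adjusts only `W`: the propagation matrix is fixed by the graph, which is exactly why filters learned on a source graph can be reused on a target graph with compatible structure and features.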

2.1. TRANSFERABLE GNN VIA EGO-GRAPH INFORMATION MAXIMIZATION

In this work, we focus on the direct-transferring setting where a GNN is pre-trained on a source graph G a in an unsupervised fashion and applied to a target graph G b without fine-tuning.¹ Consider a graph G = {V, E}, where the set of nodes V are associated with certain features and the set of links E forms certain structures. Intuitively, the transfer learning will be successful only if both the features and structures of G a and G b are similar in some ways, so that the graph filters of a GNN learned on G a are compatible with the features and structures of G b .
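The k-hop ego-graph of a node v, which serves as the basic structural sample in this view, is simply the subgraph induced by all nodes within k hops of v. A minimal BFS-based extraction can be sketched as follows (the function and variable names are illustrative, not from the paper's code):

```python
from collections import deque

def k_hop_ego_graph(adj, v, k):
    """Return the k-hop ego-graph of center node v as (nodes, edges).

    adj: dict mapping each node to a list of its neighbors (undirected graph).
    """
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == k:          # do not expand beyond k hops
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    nodes = set(dist)
    # Keep all edges of the induced subgraph, each undirected edge once.
    edges = {(a, b) for a in nodes for b in adj[a] if b in nodes and a < b}
    return nodes, edges

# Path graph 0-1-2-3-4; the 1-hop ego-graph of node 2 covers nodes {1, 2, 3}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nodes, edges = k_hop_ego_graph(adj, 2, 1)
print(sorted(nodes))  # [1, 2, 3]
```

Sampling one such ego-graph per node, together with the node features inside it, yields the empirical samples from the joint structure-feature distribution that EGI is trained to capture.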



¹ In the experiments, we show our model to be generalizable to the more practical setting with task-specific pre-training and fine-tuning, while the study of rigorous bounds in such scenarios is left as future work.



Figure 1: Overview of our GNN transfer learning framework: (1) we represent a graph as a combination of its 1-hop ego-graph and node feature distributions; (2) we design a transferable GNN to capture such essential graph information; (3) we establish a rigorous guarantee of GNN transferability based on the requirement on node features and the difference between graph structures.

