EMPOWERING GRAPH REPRESENTATION LEARNING WITH TEST-TIME GRAPH TRANSFORMATION

Abstract

As powerful tools for representation learning on graphs, graph neural networks (GNNs) have facilitated various applications from drug discovery to recommender systems. Nevertheless, the effectiveness of GNNs is immensely challenged by issues related to data quality, such as distribution shift, abnormal features and adversarial attacks. Recent efforts to tackle these issues have focused on the modeling perspective, which incurs the additional cost of changing model architectures or re-training model parameters. In this work, we provide a data-centric view of these issues and propose a graph transformation framework named GTRANS which adapts and refines graph data at test time to achieve better performance. We provide theoretical analysis on the design of the framework and discuss why adapting graph data works better than adapting the model. Extensive experiments have demonstrated the effectiveness of GTRANS in three distinct scenarios across eight benchmark datasets where suboptimal data is present. Remarkably, GTRANS performs the best in most cases, with improvements of up to 2.8%, 8.2% and 3.8% over the best baselines in the three experimental settings. Code is released at https://github.com/ChandlerBang/GTrans.

1. INTRODUCTION

Graph representation learning has been at the center of various real-world applications, such as drug discovery (Duvenaud et al., 2015; Guo et al., 2022), recommender systems (Ying et al., 2018; Fan et al., 2019; Sankar et al., 2021), forecasting (Tang et al., 2020; Derrow-Pinion et al., 2021) and outlier detection (Zhao et al., 2021a; Deng & Hooi, 2021). In recent years, there has been a surge of interest in developing graph neural networks (GNNs) as powerful tools for graph representation learning (Kipf & Welling, 2016a; Veličković et al., 2018; Hamilton et al., 2017; Wu et al., 2019). Remarkably, GNNs have achieved state-of-the-art performance on numerous graph-related tasks including node classification, graph classification and link prediction (Chien et al., 2021; You et al., 2021; Zhao et al., 2022b). Despite the enormous success of GNNs, recent studies have revealed that their generalization and robustness are immensely challenged by data quality (Jin et al., 2021b; Li et al., 2022). In particular, GNNs can behave unreliably in scenarios where sub-optimal data is present: (1) Distribution shift (Wu et al., 2022a; Zhu et al., 2021a): GNNs tend to yield inferior performance when the distributions of training and test data are not aligned (due to corruption or inconsistent collection procedures for test data). (2) Abnormal features (Liu et al., 2021a): GNNs suffer from high classification errors when data contains abnormal features, e.g., incorrect user profile information in social networks. (3) Adversarial structure attacks (Zügner et al., 2018; Li et al., 2021): GNNs are vulnerable to imperceptible perturbations of the graph structure, which can lead to severe performance degradation. To tackle these problems, significant efforts have been made to develop new techniques from the modeling perspective, e.g., designing new architectures and employing adversarial training strategies (Xu et al., 2019; Wu et al., 2022a).
[Figure 1: We study the test-time graph transformation problem, which seeks to learn a refined graph such that pre-trained GNNs perform better on the new graph than on the original. Shown: an illustration of our proposed approach's empirical performance on transforming a noisy graph, where test accuracies on the original graph are GCN: 44.3%, GAT: 21.2%, APPNP: 48.3%, AirGNN: 58.5%.]

However, employing these methods in practice may be infeasible, as they require the additional cost of changing model architectures or re-training model parameters, especially for well-trained large-scale models. The problem is further exacerbated when adopting these techniques for multiple architectures. By contrast, this paper investigates approaches that can be readily used with a wide variety of pre-trained models and test settings to improve model generalization and robustness. Essentially, we provide a data-centric perspective on the aforementioned issues by modifying the graph data presented at test time. Such modification aims to bridge the gap between training data and test data, thereby enabling GNNs to achieve better generalization and robustness on the new graph. Figure 1 visually describes this idea: we are originally given a test graph with abnormal features on which multiple GNN architectures yield poor performance; however, by transforming the graph prior to inference (at test time), we enable these GNNs to achieve much higher accuracy. In this work, we aim to develop a data-centric framework that transforms the test graph to enhance model generalization and robustness, without altering the pre-trained model.
In essence, we face two challenges: (1) how to model and optimize the transformed graph data, and (2) how to formulate an objective that can guide the transformation process. First, we model the graph transformation as injecting perturbations into the node features and graph structure, and optimize them alternately via gradient descent. Second, inspired by recent progress in contrastive learning, we propose a parameter-free surrogate loss which does not affect the pre-training process while effectively guiding the graph adaptation. Our contributions can be summarized as follows: (1) For the first time, we provide a data-centric perspective to improve the generalization and robustness of GNNs with test-time graph transformation. (2) We establish a novel framework, GTRANS, for test-time graph transformation that jointly learns the features and adjacency structure to minimize a proposed surrogate loss. (3) Our theoretical analysis provides insights into which surrogate losses we should use during test-time graph transformation and sheds light on the power of data adaptation over model adaptation. (4) Extensive experimental results in three settings (distribution shift, abnormal features and adversarial structure attacks) demonstrate the superiority of test-time graph transformation. In particular, GTRANS performs the best in most cases, with improvements of up to 2.8%, 8.2% and 3.8% over the best baselines in the three experimental settings. Moreover, we note: (1) GTRANS is flexible and versatile: it can be equipped with any pre-trained GNN, and the outcome (the refined graph data) can be deployed with any model given its favorable transferability. (2) GTRANS provides a degree of interpretability, as visualizing the refined data can reveal which kinds of graph modifications help improve performance.
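The core idea of optimizing the data rather than the model can be sketched in a few lines. The toy example below is ours, not the paper's implementation: the "model" is frozen, a feature perturbation delta is updated by gradient descent, and for simplicity the surrogate objective is graph feature smoothness tr(X'ᵀLX') (whose gradient w.r.t. X' is 2LX'), a closed-form stand-in for the paper's parameter-free contrastive loss; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy test graph: 5 nodes on a path, with noisy 3-d features.
n, d = 5, 3
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(1)) - A          # unnormalized graph Laplacian
X = rng.normal(size=(n, d))        # "abnormal" test-time features

def smoothness(X):
    # Surrogate loss: sum over edges of ||x_i - x_j||^2 = tr(X^T L X).
    return float(np.trace(X.T @ L @ X))

delta = np.zeros_like(X)           # learnable test-time feature perturbation
lr = 0.05
loss0 = smoothness(X + delta)
for _ in range(100):
    grad = 2.0 * L @ (X + delta)   # gradient of tr(X'^T L X') w.r.t. X'
    delta -= lr * grad             # gradient step on the data, not the model

loss1 = smoothness(X + delta)      # surrogate loss drops; model is untouched
```

The refined features X + delta can then be fed to any frozen pre-trained GNN at inference time; the actual framework additionally perturbs the adjacency structure and alternates between the two updates.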

2. RELATED WORK

Distribution shift in GNNs. GNNs have revolutionized graph representation learning and achieved state-of-the-art results on diverse graph-related tasks (Kipf & Welling, 2016a; Veličković et al., 2018; Hamilton et al., 2017; Chien et al., 2021; Klicpera et al., 2018; Wu et al., 2019). However, recent studies have demonstrated that GNNs yield sub-optimal performance on out-of-distribution data for node classification (Zhu et al., 2021a; Wu et al., 2022a; Liu et al., 2022a) and graph classification (Chen et al., 2022; Buffelli et al., 2022; Gui et al., 2022; Wu et al., 2022b; You et al., 2023). These studies introduce solutions that tackle distribution shift by altering model training behavior or model architectures. For a thorough review, we refer readers to a recent survey (Li et al., 2022). Unlike existing works, we target modifying the inputs via test-time adaptation.

