TOWARDS RELIABLE LINK PREDICTION WITH ROBUST GRAPH INFORMATION BOTTLENECK

Abstract

Link prediction on graphs has achieved great success with the rise of deep graph learning. However, its robustness under inherent edge noise is less investigated. We reveal that the inherent edge noise, which naturally perturbs both the input topology and the target labels, leads to severe performance degradation and representation collapse. In this work, we propose an information-theory-guided principle, Robust Graph Information Bottleneck (RGIB), to extract reliable supervision signals and avoid representation collapse. Different from the general information bottleneck, RGIB decouples and balances the mutual dependence among graph topology, target labels, and representation, building new learning objectives toward robust representation. We also provide two instantiations, RGIB-SSL and RGIB-REP, which benefit from different methodologies, i.e., self-supervised learning and data reparameterization, for implicit and explicit data denoising, respectively. Extensive experiments on 6 benchmarks covering various scenarios verify the effectiveness of the proposed RGIB.

1. INTRODUCTION

As a fundamental problem in graph learning, link prediction (Liben-Nowell & Kleinberg, 2007) has attracted growing interest in real-world applications like drug discovery (Ioannidis et al., 2020), knowledge graph completion (Bordes et al., 2013), and question answering (Huang et al., 2019). Recent advances from heuristic designs (Katz, 1953; Page et al., 1999) to graph neural networks (GNNs) (Kipf & Welling, 2016a; Gilmer et al., 2017; Kipf & Welling, 2016b; Zhang & Chen, 2018; Zhu et al., 2021) have achieved superior performance. Nevertheless, poor robustness in imperfect scenarios with inherent edge noise remains a practical bottleneck for current deep graph models (Gallagher et al., 2008; Ferrara et al., 2016; Wu et al., 2022a; Dai et al., 2022).

Early explorations improve the robustness of GNNs for node classification under label noise (Dai et al., 2021; Li et al., 2021) through the smoothing effect of neighboring nodes. Other methods achieve a similar goal by randomly removing edges (Rong et al., 2020) or by actively selecting informative nodes or edges while pruning task-irrelevant ones (Zheng et al., 2020; Luo et al., 2021). However, when applying these noise-robust methods to link prediction with noise, only marginal improvements are achieved (see Section 5). This can be attributed to the fact that edge noise naturally deteriorates both the input topology and the target labels (Figure 1(a)). Previous works that consider noise only in the input space or only in the label space cannot effectively deal with such a coupled scenario. It therefore raises a new challenge to understand and tackle the edge noise for robust link prediction.

In this paper, we dive into the inherent edge noise and empirically show the significant performance degradation it leads to (Section 3.1).
Then, we reveal the negative effect of the edge noise by carefully inspecting the distribution of learned representations, and discover that the graph representation is severely collapsed, reflected by much lower alignment and poorer uniformity (Section 3.2).

To solve this challenging problem, we propose the Robust Graph Information Bottleneck (RGIB) principle, building on the basic GIB for adversarial robustness (Wu et al., 2020) (Section 4.1). Conceptually, the RGIB principle introduces new learning objectives that decouple the mutual information (MI) among the noisy inputs $\tilde{A}$, noisy labels $\tilde{Y}$, and the representation $H$. As illustrated in Figure 1(b), RGIB generalizes the basic GIB to learn a robust representation that is resistant to edge noise. Technically, we provide two instantiations of RGIB based on different methodologies: (1) RGIB-SSL utilizes contrastive pairs with automatically augmented views to form an informative regularization in a self-supervised learning manner (Section 4.2); and (2) RGIB-REP explicitly purifies the graph topology and supervision targets with a reparameterization mechanism (Section 4.3). Both instantiations are equipped with adaptive designs that effectively estimate and balance the corresponding informative terms in a tractable manner, e.g., the hybrid augmentation algorithm and self-adversarial alignment loss for RGIB-SSL, and the relaxed information constraints on the topology and label spaces for RGIB-REP. Empirically, we show that the two instantiations work effectively under extensive noisy scenarios and can be seamlessly integrated with various existing GNNs (Section 5). Our main contributions are summarized as follows.

• To the best of our knowledge, we are the first to study the robustness of link prediction under inherent edge noise. We reveal that this noise can cause severe representation collapse and performance degradation, and that such negative impacts are common across datasets and GNNs.

• We propose a general learning framework, RGIB, with refined representation learning objectives that promote the robustness of GNNs. Two instantiations, RGIB-SSL and RGIB-REP, are built upon different methodologies and equipped with adaptive designs and theoretical guarantees.

• Without modifying the GNN architectures, RGIB achieves state-of-the-art results on 3 GNNs and 6 datasets under various noisy scenarios, obtaining up to 12.9% AUC promotion. The distribution of learned representations is notably recovered and more robust to the inherent noise.
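The representation collapse discussed above is diagnosed via the alignment and uniformity of learned embeddings. As a reference point, the following is a minimal NumPy sketch of the two standard diagnostics in their common formulation (Wang & Isola, 2020); the paper's exact measurement protocol may differ, and the function names here are illustrative:

```python
import numpy as np

def alignment(h1, h2):
    """Mean squared distance between paired (positive) representations.
    Lower values mean better-aligned pairs. Rows are assumed to be
    unit-normalized embeddings."""
    return float(np.mean(np.sum((h1 - h2) ** 2, axis=1)))

def uniformity(h, t=2.0):
    """Log of the mean pairwise Gaussian potential over all distinct pairs.
    Lower (more negative) values mean the embeddings spread more uniformly
    on the hypersphere; a collapsed representation scores close to 0."""
    n = h.shape[0]
    sq_dists = np.sum((h[:, None, :] - h[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(n, k=1)  # each unordered pair counted once
    return float(np.log(np.mean(np.exp(-t * sq_dists[iu]))))
```

Under these diagnostics, "lower alignment" in the text corresponds to a larger `alignment` distance between views of the same edge, and "poorer uniformity" to a `uniformity` value drifting toward zero as embeddings collapse to a few directions.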

2. PRELIMINARIES

Notation. We denote $\mathcal{V} = \{v_i\}_{i=1}^{N}$ as the set of nodes and $\mathcal{E} = \{e_{ij}\}_{ij=1}^{M}$ as the set of edges. With adjacency matrix $A$ and node features $X$, an undirected graph is denoted as $\mathcal{G} = (A, X)$, where $A_{ij} = 1$ means there is an edge $e_{ij}$ between $v_i$ and $v_j$. $X_{[i,:]} \in \mathbb{R}^{D}$ is the $D$-dimensional node feature of $v_i$. Link prediction aims to indicate the existence of query edges with labels $Y$ that are not observed in $A$.

GNNs for Link Prediction. We follow the common link prediction framework, i.e., graph auto-encoders (Kipf & Welling, 2016b), where the GNN architecture can be GCN (Kipf & Welling, 2016a), GAT (Veličković et al., 2018), or SAGE (Hamilton et al., 2017). Given an $L$-layer GNN, the graph representations $H \in \mathbb{R}^{|\mathcal{V}| \times D}$ of the nodes $v_i \in \mathcal{V}$ are obtained by $L$ layers of message propagation as the encoding process. For decoding, the logit $\phi_{e_{ij}}$ of each query edge $e_{ij}$ is computed with a readout function, e.g., the dot product $\phi_{e_{ij}} = h_i^{\top} h_j$. Finally, the optimization objective is to minimize the binary classification loss
$$\min \mathcal{L}_{cls} = \sum_{e_{ij} \in \mathcal{E}_{train}} -y_{ij} \log \sigma(\phi_{e_{ij}}) - (1 - y_{ij}) \log\big(1 - \sigma(\phi_{e_{ij}})\big),$$
where $\sigma(\cdot)$ is the sigmoid function, and $y_{ij} = 1$ for positive edges while $y_{ij} = 0$ for negative ones.

Topological denoising approaches. A natural way to tackle input edge noise is to directly clean the noisy graph. Sampling-based methods, such as DropEdge (Rong et al., 2020), NeuralSparse (Zheng et al., 2020), and PTDNet (Luo et al., 2021), are proposed to remove task-irrelevant edges. Besides, since GNNs can be easily fooled by adversarial attacks with only a few perturbed edges (Chen et al., 2018; Zhu et al., 2019; Entezari et al., 2020), defending methods like GCN-jaccard (Wu et al., 2019) and GIB (Wu et al., 2020) are designed to prune adversarial edges.
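The encode-then-decode pipeline above can be sketched in a few lines. The following is a simplified, self-contained NumPy illustration of a toy GCN-style encoder with a dot-product readout and the binary classification loss; it averages rather than sums the per-edge terms and omits parameter learning, so it is a sketch of the framework, not the paper's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gcn_encode(A, X, weights):
    """Toy L-layer GCN-style encoder: H <- ReLU(A_norm @ H @ W) per layer,
    with self-loops and symmetric degree normalization."""
    A_hat = A + np.eye(A.shape[0])
    deg = A_hat.sum(axis=1)
    A_norm = A_hat / np.sqrt(np.outer(deg, deg))
    H = X
    for l, W in enumerate(weights):
        H = A_norm @ H @ W
        if l < len(weights) - 1:  # no nonlinearity on the last layer
            H = np.maximum(H, 0.0)
    return H

def edge_logits(H, edges):
    """Dot-product readout: phi_{e_ij} = h_i^T h_j for each query edge."""
    return np.array([H[i] @ H[j] for i, j in edges])

def bce_loss(logits, labels):
    """Binary classification loss over query edges (mean instead of sum)."""
    p = sigmoid(logits)
    return float(np.mean(-labels * np.log(p + 1e-12)
                         - (1 - labels) * np.log(1 - p + 1e-12)))
```

In practice the weights would be trained by gradient descent on this loss, with negative query edges sampled from non-adjacent node pairs.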



(a) Illustration of inherent edge noise. The GNN takes the graph topology $A$ as input and predicts the logits of unseen edges with labels $Y$. The noisy $\tilde{A}$ and $\tilde{Y}$ are obtained by adding random edges to simulate the inherent edge noise as in Def. 3.1. (b) The basic GIB (left) and the proposed RGIB (right). $I(\cdot\,;\cdot)$ here denotes mutual information. To address the intrinsic deficiency of the basic GIB in tackling edge noise, RGIB learns the graph representation $H$ via a further balance of the informative signals 1, 3, 4 regarding $H$.

Figure 1: Link prediction with inherent edge noise (a) and the proposed RGIB principle (b).
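The noise model depicted in Figure 1(a), where random edges perturb both the input adjacency and the supervision targets, can be simulated along the following lines. This is a minimal sketch under stated assumptions: the function names and the exact sampling scheme are illustrative, not the protocol of Def. 3.1, and the rejection-sampling loop assumes a sparse graph:

```python
import numpy as np

def add_edge_noise(A, ratio, rng):
    """Inject random spurious edges into a symmetric adjacency matrix.
    `ratio` is the number of noisy edges relative to existing ones.
    Assumes a sparse graph so rejection sampling terminates quickly."""
    n = A.shape[0]
    A_noisy = A.copy()
    num_noise = int((A.sum() // 2) * ratio)
    added = 0
    while added < num_noise:
        i, j = rng.integers(0, n, size=2)
        if i != j and A_noisy[i, j] == 0:
            A_noisy[i, j] = A_noisy[j, i] = 1  # keep symmetry
            added += 1
    return A_noisy

def add_label_noise(edges, labels, n, ratio, rng):
    """Append randomly sampled node pairs as spurious positive query
    edges, perturbing the supervision targets Y."""
    noisy_edges, noisy_labels = list(edges), list(labels)
    for _ in range(int(len(edges) * ratio)):
        i, j = rng.integers(0, n, size=2)
        noisy_edges.append((int(i), int(j)))
        noisy_labels.append(1.0)
    return noisy_edges, noisy_labels
```

Applying the first function to $A$ and the second to the training queries yields the coupled $\tilde{A}$ and $\tilde{Y}$ setting that the paper studies, where neither input-space nor label-space denoising alone suffices.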

