Fast Yet Effective Graph Unlearning through Influence Analysis

Abstract

Recent evolving data privacy policies and regulations have led to increasing interest in the problem of removing information from a machine learning model. In this paper, we consider Graph Neural Networks (GNNs) as the target model and study the problem of edge unlearning in GNNs, i.e., learning a new GNN model as if a specified set of edges never existed in the training graph. Despite its practical importance, the problem remains elusive due to the non-convex nature of GNNs and the large scale of the input graph. Our main technical contribution is three-fold: 1) we cast the problem of fast edge unlearning as estimating the influence of the edges to be removed and eliminating the estimated influence from the original model in one shot; 2) we design a computationally and memory-efficient algorithm named EraEdge for edge influence estimation and unlearning; 3) under standard regularity conditions, we prove that EraEdge converges to the desired model. A comprehensive set of experiments on four prominent GNN models and three benchmark graph datasets demonstrates that EraEdge achieves significant speedups over retraining from scratch without substantial loss of model accuracy. The speedup is even more pronounced on large graphs. Furthermore, EraEdge achieves significantly higher model accuracy than existing GNN unlearning approaches.

1. INTRODUCTION

Recent legislation such as the General Data Protection Regulation (GDPR) (Regulation, 2018), the California Consumer Privacy Act (CCPA) (Pardau, 2018), and the Personal Information Protection and Electronic Documents Act (PIPEDA) (Parliament, 2000) requires companies to remove private user data upon request. This has prompted the discussion of the "right to be forgotten" (Kwak et al., 2017), which gives users more control over their data by entitling them to have it deleted from learned models. If a company has already used data collected from users to train its machine learning (ML) models, these models must be modified accordingly to reflect data deletion requests. In this paper, we consider Graph Neural Networks (GNNs) that receive frequent edge removal requests as our target ML model. For example, consider a social network graph collected from an online social network platform that witnesses frequent insertion/deletion of users (nodes) and/or changes in social relations between users (edges). Some of these structural changes can be accompanied by users' requests to withdraw their data. In this paper, we only consider requests to remove social relations (edges). The owner of the platform is then obligated by law to remove the effect of the requested edges, so that GNN models trained on the graph do not "remember" the corresponding social interactions.

A naive solution to deleting user data from a trained ML model is to retrain the model on the training data excluding the samples to be removed. However, retraining a model from scratch can be prohibitively expensive, especially for complex ML models and large training data. To address this issue, numerous efforts (Mahadevan & Mathioudakis, 2021; Brophy & Lowd, 2021; Cauwenberghs & Poggio, 2000; Cao & Yang, 2015) have been devoted to designing efficient unlearning methods that remove the effect of particular data samples without retraining.
One of the main challenges is how to estimate the effect of a given training sample on model parameters (Golatkar et al., 2021), which has led to research focusing on simpler learning problems such as linear/logistic regression (Mahadevan & Mathioudakis, 2021), random forests (Brophy & Lowd, 2021), support vector machines (Cauwenberghs & Poggio, 2000), and k-means clustering (Ginart et al., 2019), for which a theoretical analysis has been established. Although there have been some works on unlearning in deep neural networks (Golatkar et al., 2020a;b; 2021; Guo et al., 2020), very few (Chen et al., 2022; Chien et al., 2022) have investigated efficient unlearning in GNNs. These works fall into two categories: exact and approximate GNN unlearning. GraphEraser (Chen et al., 2022) is an exact unlearning method that efficiently retrains the GNN model on the graph excluding the to-be-removed edges. It follows the basic idea of the Sharded, Isolated, Sliced, and Aggregated (SISA) method (Bourtoule et al., 2021): it splits the training graph into several disjoint shards and trains each shard model separately. Upon receiving an unlearning request, the model provider retrains only the affected shard model. Despite its efficiency, partitioning the training data into disjoint shards severely damages the graph structure and thus incurs a significant loss of target model accuracy (as shown in our empirical evaluation). Approximate GNN unlearning, on the other hand, returns a sanitized GNN model that is statistically indistinguishable from the retrained model. Certified graph unlearning (Chien et al., 2022) can provide a theoretical privacy guarantee for approximate GNN unlearning. However, it only considers simplified GNN architectures such as simple graph convolutions (SGC) and their generalized PageRank (GPR) extensions.
We aim to design efficient approximate unlearning solutions that are model-agnostic, i.e., that make no assumption about the nature and complexity of the model. In this paper, we design an efficient edge unlearning algorithm named EraEdge which directly modifies the parameters of the pre-trained model in one shot to remove the influence of the requested edges. Adapting the idea of treating the removal of data points as upweighting them (Koh & Liang, 2017), we compute the influence of the requested edges on the model as the change in model parameters due to upweighting these edges. However, due to the aggregation function of GNN models, it is non-trivial to estimate the change in GNN parameters, as removing an edge e(v_i, v_j) can affect not only the immediate neighbors of v_i and v_j but also nodes multiple hops away. We therefore design a new influence derivation method that takes the aggregation effect of GNN models into consideration when estimating the change in parameters, and we address several theoretical and practical challenges of influence derivation arising from the non-convex nature of GNNs. To demonstrate the efficiency and effectiveness of EraEdge, we systematically explore the empirical trade-off space between unlearning efficiency (i.e., the time performance of unlearning), model accuracy (i.e., the quality of the unlearned model), and unlearning efficacy (i.e., the extent to which the unlearned model has forgotten the removed edges). Our results show that, first, while achieving similar model accuracy and unlearning efficacy as the retrained model, EraEdge is significantly faster than retraining. For example, it speeds up training by 5.03× for the GCN model on the Cora dataset. The speedup is even more pronounced on larger graphs; it can reach two orders of magnitude on the CS graph, which contains around 160K edges. Second, EraEdge outperforms GraphEraser (Chen et al., 2022) considerably in model accuracy.
For example, EraEdge achieves a 50% increase in model accuracy on the Cora dataset compared to GraphEraser. Furthermore, EraEdge is much faster than GraphEraser, especially on large graphs. For instance, EraEdge is 5.8× faster than GraphEraser on the CS dataset. Additionally, EraEdge significantly outperforms certified graph unlearning (CGU) (Chien et al., 2022) in terms of target model accuracy and unlearning efficiency, while demonstrating edge forgetting ability comparable to that of CGU. In summary, we make the following four main contributions: 1) we cast the problem of edge unlearning as estimating the influence of a set of edges on GNNs while taking the aggregation effects of GNN models into consideration; 2) we design EraEdge, a computationally and memory-efficient algorithm that applies a one-shot update to the original model by removing the estimated influence of the removed edges; 3) we address several theoretical and practical challenges of deriving edge influence, and prove that EraEdge converges to the desired model under standard regularity conditions; 4) we perform an extensive set of experiments on four prominent GNN models and three benchmark graph datasets, demonstrating the efficiency and effectiveness of EraEdge.
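The influence-function idea underlying the one-shot update can be illustrated on a toy convex problem where the exactly retrained model has a closed form. The sketch below is ours, not EraEdge itself: it shows the classic Koh & Liang (2017) removal update w' = w + (1/n) H^{-1} ∇ℓ_k(w) on a tiny ridge regression, whereas EraEdge adapts this idea to edges in GNNs with the additional machinery for aggregation effects and non-convexity described in the paper.

```python
import numpy as np

# Toy data for ridge regression: minimize (1/n)||Xw - y||^2 + lam*||w||^2.
rng = np.random.default_rng(0)
n, d, lam = 50, 3, 0.1
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

def fit(X_tr, y_tr):
    # Closed-form minimizer; the global n keeps the objective's 1/n
    # normalization fixed, matching the eps = -1/n upweighting view.
    return np.linalg.solve(X_tr.T @ X_tr / n + lam * np.eye(d),
                           X_tr.T @ y_tr / n)

w = fit(X, y)                                       # original model
k = 0                                               # sample to "unlearn"
w_retrain = fit(np.delete(X, k, 0), np.delete(y, k, 0))  # ground truth

# One-shot influence update: w_unlearn = w + (1/n) H^{-1} grad_k,
# where H is the Hessian of the full objective and grad_k is the
# gradient of the removed sample's loss at the original parameters.
H = 2.0 * (X.T @ X / n + lam * np.eye(d))
grad_k = 2.0 * X[k] * (X[k] @ w - y[k])
w_unlearn = w + np.linalg.solve(H, grad_k) / n
```

On this convex problem the one-shot update lands much closer to the retrained solution than the original parameters, without touching the remaining training data; for large non-convex models, the explicit Hessian solve is typically replaced by Hessian-vector-product techniques.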

2. GRAPH NEURAL NETWORK

Given a graph G(V, E) that consists of a set of nodes V and edges E, the goal of a Graph Neural Network (GNN) model is to learn a representation vector h (embedding) for each node v in G that can be used in downstream tasks (e.g., node classification, link prediction). A GNN model updates the node embeddings by aggregating each node's neighbors' representations. The embedding of each node v_i ∈ V at layer l is updated according to v_i's graph
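The layer-wise neighborhood aggregation described here can be sketched for a GCN-style layer; this is an illustrative instance of the aggregation scheme (the function and variable names are ours), not the exact architectures evaluated in the paper.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN-style layer: H_next = ReLU(D^{-1/2} (A+I) D^{-1/2} H W).

    A: (n, n) adjacency matrix, H: (n, d_in) node embeddings,
    W: (d_in, d_out) learnable weights.
    """
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # D^{-1/2} as a vector
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)          # ReLU activation

# Toy path graph 0 - 1 - 2: after one layer, each node's embedding
# mixes the input features of itself and its 1-hop neighbors.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H0 = np.eye(3)            # one-hot input features
W0 = np.ones((3, 2))      # toy weights
H1 = gcn_layer(A, H0, W0)
```

Note that because the normalization depends on node degrees, removing a single edge perturbs the normalized adjacency entries of its endpoints and, through stacked layers, the embeddings of nodes multiple hops away, which is exactly why edge influence estimation in GNNs must account for the aggregation structure.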

