SUFFICIENT SUBGRAPH EMBEDDING MEMORY FOR CONTINUAL GRAPH REPRESENTATION LEARNING

Anonymous

Abstract

Memory replay, which constructs a buffer of representative samples and retrains the model over the buffer to maintain its performance on existing tasks, has shown great success for continual learning with Euclidean data. Directly applying it to graph data, however, can lead to a memory explosion problem due to the need to store the explicit topological connections of representative nodes. To this end, we present Parameter Decoupled Graph Neural Networks (PDGNNs) with Sufficient Subgraph Embedding Memory (SSEM), which fully utilize the explicit topological information for memory replay while reducing the memory space complexity from O(nd^L) to O(n), where n is the memory buffer size, d is the average node degree, and L is the range of neighborhood aggregation. Specifically, PDGNNs decouple the trainable parameters from the computation subgraphs via Sufficient Subgraph Embeddings (SSEs), which compress computation subgraphs into vectors (i.e., SSEs) to reduce memory consumption. In addition, we discover a pseudo-training effect in memory-based continual graph learning that does not exist in continual learning on Euclidean data without topological connections (e.g., individual images). Based on this discovery, we develop a novel coverage maximization sampling strategy that enhances performance when the memory budget is tight. Thorough empirical studies demonstrate that PDGNNs with SSEM outperform state-of-the-art techniques in both class-incremental and task-incremental settings.

1. INTRODUCTION

Continual graph representation learning (Liu et al., 2021; Zhou & Cao, 2021; Zhang et al., 2021), which aims to accommodate new types of emerging nodes in a graph and their associated edges without degrading the model's performance on existing nodes, is an emerging area that has attracted increasing attention. It is of enormous value in practical applications, especially when graphs are relatively large and retraining a new model over the entire graph is computationally infeasible. For instance, in a social network, a community detection model has to keep adapting its parameters to nodes from newly emerged communities; in a citation network, a document classifier needs to continuously update its parameters to distinguish documents from newly emerged research fields. Memory replay (Rebuffi et al., 2017; Lopez-Paz & Ranzato, 2017; Aljundi et al., 2019; Shin et al., 2017), which stores representative samples in a buffer for retraining the model to maintain its performance on existing tasks, has shown great success in preventing catastrophic forgetting across various continual learning tasks, e.g., in computer vision and reinforcement learning (Kirkpatrick et al., 2017; Li & Hoiem, 2017; Aljundi et al., 2018; Rusu et al., 2016). Directly applying memory replay to graph data with message-passing graph neural networks (GNNs) (Gilmer et al., 2017; Kipf & Welling, 2016; Veličković et al., 2017), however, can give rise to a memory explosion problem. Specifically, due to message passing over the topological connections of the graph, retraining an L-layer GNN (Figure 1a) with n buffered nodes requires storing O(nd^L) nodes (Chiang et al., 2019; Chen et al., 2017) in the buffer (not even counting the edges), where d is the average node degree.
Take the Reddit dataset (Hamilton et al., 2017) as an example: with an average node degree of 492, the buffer size easily becomes intractable even for a 2-layer GNN. To sidestep this issue, Experience Replay based GNN (ER-GNN) (Zhou & Cao, 2021) stores representative nodes in the buffer but completely ignores the topological information (Figure 1b). Feature graph network (FGN) (Wang et al., 2020a) implicitly encodes node proximity via the inner products between the features of a target node and those of its neighbors. However, the explicit topological connections are still discarded, and message passing over the graph is no longer feasible.
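The O(nd^L) blow-up above can be made concrete with a quick back-of-the-envelope calculation (the function below is ours, for illustration only; it treats d^L as an upper bound on the size of each node's L-hop computation subgraph and ignores edges and neighbor overlap):

```python
def replay_buffer_nodes(n: int, d: int, L: int) -> int:
    """Upper bound on the number of nodes stored when naively replaying
    n buffered target nodes through an L-layer message-passing GNN,
    where d is the average node degree: n * d^L."""
    return n * d ** L

# Reddit-like statistics from the text: average degree 492, 2-layer GNN.
print(replay_buffer_nodes(1, 492, 2))    # 242064 nodes for a single buffered node
print(replay_buffer_nodes(100, 492, 2))  # 24206400 nodes for a 100-node buffer
```

By contrast, storing one SSE vector per buffered node keeps the memory at O(n), independent of both d and L.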

