GRAPH CONTRASTIVE LEARNING WITH MODEL PERTURBATION

Abstract

Graph contrastive learning (GCL) has achieved great success in pre-training graph neural networks (GNNs) without ground-truth labels. The performance of GCL mainly relies on designing high-quality contrastive views via data augmentation. However, finding desirable augmentations is difficult and requires cumbersome effort due to the diverse modalities of graph data. In this work, we study model perturbation to perform efficient contrastive learning on graphs without using data augmentation. Instead of searching for the optimal combination of perturbing nodes, edges, or attributes, we propose to perturb the model architecture (i.e., the GNN) itself. However, it is non-trivial to perturb GNN models effectively without a performance drop relative to their data-augmentation counterparts. This is because data augmentation 1) makes complex perturbations in the graph space, so its effect is hard to mimic in the model parameter space with a fixed noise distribution, and 2) disturbs even the same node differently across the two views owing to its randomness. Motivated by this, we propose a novel model perturbation framework, PERTURBGCL, to pre-train GNN encoders. We focus on perturbing two key operations in a GNN: message propagation and transformation. Specifically, we propose weightPrune, which creates a dynamic perturbed model to contrast with the target one by pruning its transformation weights according to their magnitudes. Contrasting the two models leads to adaptive mining of the perturbation distribution from the data. Furthermore, we present randMP, which disturbs the number of message-propagation steps in the two contrastive models. By randomly choosing the propagation steps during training, it increases the local variance of nodes between the contrastive views. Despite their simplicity, coupling the two strategies enables effective contrastive learning on graphs with model perturbation.
We conduct extensive experiments on 15 benchmarks. The results demonstrate the superiority of PERTURBGCL: it achieves competitive results against strong baselines on both node-level and graph-level tasks while requiring shorter computation time. The code is available at https://anonymous.4open.science/r/PerturbGCL-F17D.
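As a concrete illustration of the two strategies described above, here is a minimal plain-Python sketch. The names weight_prune and rand_mp and the flat weight list are illustrative assumptions, not the authors' implementation, which operates on the weight matrices of a GNN encoder.

```python
import random

def weight_prune(weights, ratio):
    """Magnitude-based pruning: zero out the `ratio` fraction of entries
    with the smallest absolute value, yielding the perturbed model's
    transformation weights (hypothetical helper)."""
    k = int(len(weights) * ratio)
    # indices of the k smallest-magnitude entries
    idx = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    pruned = list(weights)
    for i in idx:
        pruned[i] = 0.0
    return pruned

def rand_mp(max_steps):
    """randMP: sample the number of message-propagation steps uniformly
    from {1, ..., max_steps} for a contrastive branch at each training step."""
    return random.randint(1, max_steps)

w = [0.9, -0.05, 0.4, 0.01, -0.7]
print(weight_prune(w, 0.4))  # -> [0.9, 0.0, 0.4, 0.0, -0.7]
```

Because the pruning mask is recomputed from the current weights, the perturbation adapts to the data as training proceeds, rather than following a fixed noise distribution.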

1. INTRODUCTION

Graph neural networks (GNNs) (Kipf & Welling, 2016a; Hamilton et al., 2017; Gilmer et al., 2017) have become the de facto standard for modeling graph-structured data, such as social networks (Li & Goldwasser, 2019), molecules (Duvenaud et al., 2015), and knowledge graphs (Arora, 2020). Nevertheless, GNNs require task-specific labels to supervise training, which is impractical in many scenarios where annotating graphs is challenging and expensive (Sun et al., 2019). Therefore, increasing efforts (Hou et al., 2022; Veličković et al., 2018; Hassani & Khasahmadi, 2020; Thakoor et al., 2022) have been made to train GNNs in an unsupervised fashion, so that the pre-trained model or learned representations can be directly applied to different downstream tasks. Recently, graph contrastive learning (GCL) has become the state-of-the-art approach for both graph-level (You et al., 2020; 2021; Suresh et al., 2021; Xu et al., 2021) and node-level (Qiu et al., 2020; Zhu et al., 2021b; Bielak et al., 2021; Thakoor et al., 2022) tasks. The general idea of GCL is to create two views of the original input using data augmentation (Jin et al., 2020), and then encode them with two GNN branches that share the same architecture and weights (You et al., 2020). Then,
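The two-branch recipe above typically ends with an InfoNCE-style objective that pulls matching nodes across the two views together and pushes non-matching ones apart. A minimal plain-Python sketch, assuming per-node embeddings are given as lists of vectors (cosine and info_nce are illustrative helpers; practical GCL implementations batch this with tensor operations):

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(z1, z2, tau=0.5):
    """InfoNCE-style loss over the embeddings of two views: node i in
    view 1 is positive with node i in view 2; all other view-2 nodes
    serve as negatives. Illustrative sketch, not any paper's exact loss."""
    n = len(z1)
    loss = 0.0
    for i in range(n):
        sims = [math.exp(cosine(z1[i], z2[j]) / tau) for j in range(n)]
        loss += -math.log(sims[i] / sum(sims))
    return loss / n
```

When the two views agree on corresponding nodes, the positive similarity dominates the denominator and the loss is small; mismatched views yield a larger loss.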

