LIGHTGCL: SIMPLE YET EFFECTIVE GRAPH CONTRASTIVE LEARNING FOR RECOMMENDATION

Abstract

Graph neural networks (GNNs) are a powerful learning approach for graph-based recommender systems. Recently, GNNs integrated with contrastive learning have shown superior performance in recommendation, using data augmentation schemes to deal with highly sparse data. Despite their success, most existing graph contrastive learning methods either perform stochastic augmentation (e.g., node/edge perturbation) on the user-item interaction graph or rely on heuristic-based augmentation techniques (e.g., user clustering) to generate contrastive views. We argue that these methods cannot preserve the intrinsic semantic structures well and are easily biased by noise perturbation. In this paper, we propose LightGCL, a simple yet effective graph contrastive learning paradigm that mitigates these issues, which impair the generality and robustness of CL-based recommenders. Our model exclusively utilizes singular value decomposition for contrastive augmentation, which enables unconstrained structural refinement with global collaborative relation modeling. Experiments conducted on several benchmark datasets demonstrate that our model significantly improves performance over state-of-the-art methods. Further analyses demonstrate the superior robustness of LightGCL against data sparsity and popularity bias. The source code of our model is available at https://github.com/HKUDS/LightGCL.

1. INTRODUCTION

Graph neural networks (GNNs) have proven effective in graph-based recommender systems by extracting local collaborative signals via neighborhood representation aggregation (Wang et al., 2019; Chen et al., 2020b). In general, to learn user and item representations, GNN-based recommenders perform embedding propagation on the user-item interaction graph, stacking multiple message-passing layers to explore high-order connectivity (He et al., 2020; Zhang et al., 2019; Liu et al., 2021a). Most GNN-based collaborative filtering models adhere to the supervised learning paradigm, which requires a sufficient amount of high-quality labeled data for model training. However, many practical recommendation scenarios suffer from data sparsity, which makes it difficult to learn high-quality user and item representations from limited interaction data (Liu et al., 2021b; Lin et al., 2021). To address this label scarcity, contrastive learning has been introduced into recommendation as a form of data augmentation (Wu et al., 2021). The main idea of contrastive learning for enhancing user and item representations is to reach agreement between the generated embedding views by contrasting the defined positive pairs with their negative counterparts (Xie et al., 2022).

While contrastive learning has been shown to be effective in improving the performance of graph-based recommendation methods, the view generators serve as the core of data augmentation by identifying accurate contrasting samples. Most current graph contrastive learning (GCL) approaches employ heuristic-based contrastive view generators to maximize the mutual information between the input positive pairs and push apart negative instances (Wu et al., 2021; Yu et al., 2022a; Xia et al., 2022b). To construct perturbed views, SGL (Wu et al., 2021) generates positive node pairs by corrupting the structural information of the user-item interaction graph with stochastic augmentation strategies, e.g., node dropping and edge perturbation. To improve graph contrastive learning in recommendation, SimGCL (Yu et al., 2022a) augments embeddings with random noise perturbation. To identify semantic neighbors of nodes (users and items), HCCF (Xia et al., 2022b) and NCL (Lin et al., 2022) pursue consistent representations between structurally adjacent nodes and semantic neighbors.

Despite their effectiveness, state-of-the-art contrastive recommender systems suffer from several inherent limitations: i) Graph augmentation with random perturbation may lose useful structural information, which misleads representation learning. ii) The success of heuristic-guided representation contrasting schemes depends heavily on the view generator, which limits model generality and leaves the model vulnerable to noisy user behaviors. iii) Most current GNN-based contrastive recommenders are limited by the over-smoothing issue, which leads to indistinguishable representations.

In light of the above limitations and challenges, we revisit the graph contrastive learning paradigm for recommendation and propose a simple yet effective augmentation method, LightGCL. In our model, graph augmentation is guided by singular value decomposition (SVD), which not only distills the useful information in user-item interactions but also injects the global collaborative context into the representation alignment of contrastive learning. Instead of generating two handcrafted augmented views, our robust graph contrastive learning paradigm preserves the important semantics of user-item interactions. This enables our self-augmented representations to reflect both user-specific preferences and cross-user global dependencies.

Our contributions are highlighted as follows:
• We enhance recommender systems by designing a lightweight and robust graph contrastive learning framework that addresses the identified key challenges of this task.
• We propose an effective and efficient contrastive learning paradigm, LightGCL, for graph augmentation. With the injection of global collaborative relations, our model mitigates the issues brought by inaccurate contrastive signals.
• Our method exhibits improved training efficiency compared to existing GCL-based approaches.
• Extensive experiments on several real-world datasets justify the performance superiority of LightGCL. In-depth analyses demonstrate the rationality and robustness of LightGCL.
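
To make the idea of SVD-guided augmentation concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a toy binary interaction matrix `A`, a hypothetical rank `q = 2`, and random item embeddings; all names (`truncated_svd`, `low_rank_view`, `item_emb`, and so on) are illustrative. It builds a rank-q reconstruction of the interaction matrix from its truncated SVD and uses it for one propagation step, next to ordinary local propagation over observed edges.

```python
import numpy as np

def truncated_svd(A, q):
    """Return the top-q singular triplets of A (assumes q <= min(A.shape))."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :q], s[:q], Vt[:q, :]

def low_rank_view(A, q):
    """Rank-q reconstruction of A built from its largest singular values."""
    U_q, s_q, Vt_q = truncated_svd(A, q)
    return (U_q * s_q) @ Vt_q

# Toy user-item interaction matrix (4 users x 5 items); purely illustrative.
A = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

q = 2  # hypothetical rank, chosen only for this example
A_hat = low_rank_view(A, q)

# Hypothetical item embeddings; in a real model these would be learned.
rng = np.random.default_rng(0)
item_emb = rng.normal(size=(A.shape[1], 8))

# Local view: each user aggregates only the items they interacted with.
local_users = A @ item_emb

# Global view: aggregate through the low-rank reconstruction. Applying the
# factors in sequence avoids forming the dense matrix A_hat explicitly.
U_q, s_q, Vt_q = truncated_svd(A, q)
global_users = (U_q * s_q) @ (Vt_q @ item_emb)

print("observed pairs:", int(A.sum()), "of", A.size)
print("nonzero entries in low-rank view:", int((np.abs(A_hat) > 1e-8).sum()))
```

Because the reconstruction is generally dense, a user can receive signal from items they never interacted with, which is one way to read the phrase "global collaborative context". The two user representations (`local_users` and `global_users`) could then serve as the pair of views to be contrasted; how the paper actually forms and aligns its views is specified in its methodology, not by this sketch.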

2. RELATED WORK

Graph Contrastive Learning for Recommendation. A promising line of recent studies has incorporated contrastive learning (CL) into graph-based recommenders to address the label sparsity issue with self-supervision signals. In particular, SGL (Wu et al., 2021) and SimGCL (Yu et al., 2022a) perform data augmentation over graph structure and embeddings with random perturbation operations (e.g., dropout and noise). However, such stochastic augmentation may drop important information, which can further aggravate the sparsity issue for inactive users. Furthermore, some recent CL-based recommenders, such as HCCF (Xia et al., 2022b) and NCL (Lin et al., 2022), design heuristic-based strategies to construct views for embedding contrasting. Despite their effectiveness, their success relies heavily on the incorporated heuristics (e.g., the number of hyperedges or user clusters) for contrastive view generation, which can hardly adapt to different recommendation tasks.

Self-Supervised Learning on Graphs. Recently, self-supervised learning (SSL) has advanced the graph learning paradigm by enhancing node representations learned from unlabeled graph data (Zhu et al., 2021a;b; Velickovic et al., 2019; Hassani & Khasahmadi, 2020; Peng et al., 2020; Zhu et al., 2020; Wu et al., 2022). For example, to improve the predictive SSL paradigm, AutoSSL (Jin et al., 2022) automatically combines multiple pretext tasks for augmentation. Along the line of contrastive SSL over graph structures, recent efforts focus on designing various graph contrastive learning methods (Yu et al., 2022b; Yin et al., 2022; Zhang et al., 2022; Xia et al., 2022a; Suresh et al., 2021). For instance, SimGRACE (Xia et al., 2022a) generates contrastive views by perturbing the GNN encoder. In AutoGCL (Yin et al., 2022), graph view generators are trained jointly with the graph encoder in an end-to-end way. Additionally, GCA (Zhu et al., 2021b) performs both topology-level and attribute-level data augmentation for contrastive view generation, identifying important edges and features for adaptive augmentation. GraphCL (You et al., 2020) generates correlated graph representation views using various augmentation strategies, such as node/edge perturbation and attribute masking.
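
The stochastic augmentation and contrastive objective discussed in this section can be sketched as follows. This is an illustrative NumPy example, not code from any of the cited methods: `drop_edges` stands in for edge perturbation, `info_nce` is a commonly used contrastive loss chosen here as an assumption, and the drop probability, temperature, and toy data are hypothetical.

```python
import numpy as np

def drop_edges(A, drop_prob, rng):
    """Stochastic structural augmentation: remove each edge with prob. drop_prob."""
    keep = rng.random(A.shape) >= drop_prob
    return A * keep

def info_nce(z1, z2, temperature=0.2):
    """InfoNCE loss: row i of z1 and row i of z2 form the positive pair;
    all other rows of z2 act as negatives for row i of z1."""
    # Guard against zero vectors (e.g., users left with no edges after dropping).
    z1 = z1 / np.maximum(np.linalg.norm(z1, axis=1, keepdims=True), 1e-12)
    z2 = z2 / np.maximum(np.linalg.norm(z2, axis=1, keepdims=True), 1e-12)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy user-item interaction matrix (4 users x 5 items); purely illustrative.
A = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0],  # an inactive user with a single interaction
], dtype=float)

rng = np.random.default_rng(0)
item_emb = rng.normal(size=(A.shape[1], 8))  # hypothetical item embeddings

# Two randomly perturbed views of the same graph.
view_a = drop_edges(A, drop_prob=0.3, rng=rng)
view_b = drop_edges(A, drop_prob=0.3, rng=rng)

# One aggregation step per view, then contrast the resulting user embeddings.
loss = info_nce(view_a @ item_emb, view_b @ item_emb)

# Users whose interactions were all removed in the first view.
isolated = int((view_a.sum(axis=1) == 0).sum())
print("contrastive loss:", loss)
print("users with no remaining edges in view_a:", isolated)
```

In this toy setting, a user with few interactions can lose all of them in a sampled view, which illustrates the concern that random dropping may worsen sparsity for inactive users.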



* Chao Huang is the corresponding author.

