MULTI-TASK SELF-SUPERVISED GRAPH NEURAL NETWORKS ENABLE STRONGER TASK GENERALIZATION

Abstract

Self-supervised learning (SSL) for graph neural networks (GNNs) has attracted increasing attention from the graph machine learning community in recent years, owing to its capability to learn performant node embeddings without costly label information. One weakness of conventional SSL frameworks for GNNs is that they learn through a single philosophy, such as mutual information maximization or generative reconstruction. When applied to various downstream tasks, these frameworks rarely perform equally well on every task, because one philosophy may not span the extensive knowledge required for all tasks. To enhance generalization across tasks, as an important first step toward exploring fundamental graph models, we introduce PARETOGNN, a multi-task SSL framework for node representation learning over graphs. Specifically, PARETOGNN is self-supervised by manifold pretext tasks observing multiple philosophies. To reconcile different philosophies, we explore a multiple-gradient descent algorithm, such that PARETOGNN actively learns from every pretext task while minimizing potential conflicts. We conduct comprehensive experiments over four downstream tasks (i.e., node classification, node clustering, link prediction, and partition prediction), and our proposal achieves the best overall performance across tasks on 11 widely adopted benchmark datasets. Besides, we observe that learning from multiple philosophies enhances not only task generalization but also single-task performance, demonstrating that PARETOGNN achieves better task generalization via the disjoint yet complementary knowledge learned from different philosophies.

1. INTRODUCTION

Graph-structured data is ubiquitous in the real world (McAuley et al., 2015; Hu et al., 2020). To model the rich underlying knowledge in graphs, graph neural networks (GNNs) have been proposed and have achieved outstanding performance on various tasks, such as node classification (Kipf & Welling, 2016a; Hamilton et al., 2017), link prediction (Zhang & Chen, 2018; Zhao et al., 2022b), node clustering (Bianchi et al., 2020; You et al., 2020b), etc. These tasks form the archetypes of many practical real-world applications, such as recommendation systems (Ying et al., 2018; Fan et al., 2019) and predictive user behavior models (Pal et al., 2020; Zhao et al., 2021a; Zhang et al., 2021a). Existing graph learning frameworks largely serve as narrow experts, guaranteeing their effectiveness on only one or two tasks. However, given a graph learning framework, its promising performance on one task may not (and usually does not) translate to competitive results on other tasks. Consistent task generalization across various tasks and datasets is a significant and well-studied research topic in other domains (Wang et al., 2018; Yu et al., 2020). Results from Natural Language Processing (Radford et al., 2019; Sanh et al., 2021) and Computer Vision (Doersch & Zisserman, 2017; Ni et al., 2021) have shown that models enhanced by self-supervised learning (SSL) over multiple pretext tasks observing diverse philosophies can achieve strong task generalization and learn intrinsic patterns that are transferable to multiple downstream tasks. Intuitively, SSL over multiple pretext tasks greatly reduces the risk of overfitting (Baxter, 1997; Ruder, 2017).
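To make the gradient-reconciliation idea concrete, the following is a minimal sketch of the two-task case of multiple-gradient descent, where the min-norm convex combination of two task gradients has a closed form. The function name and the use of plain NumPy vectors (rather than model parameters) are illustrative assumptions, not the paper's implementation; the full method operates over many pretext tasks.

```python
import numpy as np

def min_norm_combination(g1, g2):
    """Min-norm point on the segment between two task gradients.

    Solves min_a ||a*g1 + (1-a)*g2||^2 for a in [0, 1], the two-task
    special case of the multiple-gradient descent subproblem. The
    resulting direction is a descent direction for both task losses
    whenever one exists (a Pareto-improving update).
    """
    diff = g2 - g1
    denom = np.dot(diff, diff)
    if denom == 0.0:
        # Identical gradients: any convex combination is the same vector.
        return g1.copy()
    a = np.clip(np.dot(g2, diff) / denom, 0.0, 1.0)
    return a * g1 + (1.0 - a) * g2

# Two conflicting pretext-task gradients (orthogonal directions).
g1 = np.array([1.0, 0.0])
g2 = np.array([0.0, 1.0])
g = min_norm_combination(g1, g2)
print(g)  # [0.5 0.5] -- balances both tasks instead of favoring either
```

In practice the per-task gradients would be taken with respect to the shared encoder parameters, and the combined direction replaces the naive sum of task gradients in the optimizer step.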

Code Availability

Our code is publicly available at https://github.com/jumxglhf/ParetoGNN.

