GRAPH REPRESENTATION LEARNING FOR MULTI-TASK SETTINGS: A META-LEARNING APPROACH

Abstract

Graph Neural Networks (GNNs) have become the state-of-the-art method for many applications on graph-structured data. GNNs are a framework for graph representation learning, in which a model learns to generate low-dimensional node embeddings that encapsulate structural and feature-related information. GNNs are usually trained in an end-to-end fashion, leading to highly specialized node embeddings. While this approach achieves great results in the single-task setting, generating node embeddings that can be used to perform multiple tasks (with performance comparable to single-task models) is an open problem. We propose a novel representation learning strategy, based on meta-learning, capable of producing multi-task node embeddings. Our method avoids the difficulties arising when learning to perform multiple tasks concurrently by instead learning to quickly (i.e., with a few steps of gradient descent) adapt to each task individually. We show that the embeddings produced by our method can be used to perform multiple tasks with comparable or higher performance than both single-task and multi-task end-to-end models. Our method is model-agnostic and task-agnostic, and can hence be applied to a wide variety of multi-task domains.
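The core mechanism the abstract describes, learning an initialization that adapts to each task with a few gradient steps rather than fitting all tasks at once, can be illustrated with a first-order MAML-style toy example. Everything here (the scalar parameter, the quadratic per-task losses, the learning rates) is an illustrative assumption, not the paper's actual model:

```python
# Hypothetical toy setup: each "task" is a quadratic loss pulling a shared
# scalar parameter theta toward a task-specific target.

def loss_grad(theta, target):
    # Gradient of the per-task loss L(theta) = (theta - target)^2.
    return 2.0 * (theta - target)

def adapt(theta, target, inner_lr=0.3, steps=5):
    # Inner loop: a few gradient-descent steps specializing to one task.
    for _ in range(steps):
        theta = theta - inner_lr * loss_grad(theta, target)
    return theta

tasks = [-1.0, 0.0, 1.0]  # task-specific targets
theta = 5.0               # shared initialization being meta-learned

# Outer loop (first-order approximation): update the shared initialization
# using the post-adaptation gradient of each task, averaged over tasks.
for _ in range(200):
    grads = [loss_grad(adapt(theta, t), t) for t in tasks]
    theta -= 0.05 * sum(grads) / len(grads)
```

After meta-training, `adapt(theta, t)` lands close to each task's optimum even though no single `theta` fits all tasks, which is the sense in which the method sidesteps the interference of concurrent multi-task training.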


Introduction

Graph Neural Networks (GNNs) are deep learning models that operate on graph-structured data, and they have become one of the main topics of the deep learning research community. Part of their success stems from strong empirical performance on many graph-related tasks. Three tasks in particular, with many practical applications, have received the most attention: graph classification, node classification, and link prediction. GNNs are centered around the concept of node representation learning, and typically follow the same architectural pattern, with an encoder-decoder structure (Hamilton et al., 2017; Chami et al., 2020; Wu et al., 2020). The encoder produces node embeddings (low-dimensional vectors capturing relevant structural and feature-related information about each node), while the decoder uses the embeddings to carry out the desired downstream task. The model is then trained in an end-to-end manner, giving rise to highly specialized node embeddings. While this can lead to state-of-the-art performance, it also limits the generalization and reusability of the embeddings. In fact, taking the encoder from a GNN trained on a given task and using its node embeddings to train a decoder for a different task leads to a substantial performance loss, as shown in Figure 1.
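The encoder-decoder pattern described above can be sketched in a few lines. This is a minimal illustration, not the paper's architecture: a single GCN-style message-passing layer as the encoder, and two hypothetical task heads (node classification and link prediction) that consume the same embeddings:

```python
import numpy as np

def gcn_encoder(A, X, W):
    """Encoder: one message-passing layer, Z = ReLU(D^-1 (A + I) X W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # row-normalize the adjacency
    return np.maximum(D_inv @ A_hat @ X @ W, 0.0)

def node_cls_decoder(Z, W_out):
    """Node-classification head: per-node class scores from embeddings."""
    return Z @ W_out

def link_pred_decoder(Z, i, j):
    """Link-prediction head: edge probability from embedding similarity."""
    return 1.0 / (1.0 + np.exp(-Z[i] @ Z[j]))

# Toy graph: a 3-node path with random features (all shapes illustrative).
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(3, 4))                     # node features
Z = gcn_encoder(A, X, rng.normal(size=(4, 8)))  # shared node embeddings
scores = node_cls_decoder(Z, rng.normal(size=(8, 2)))
p_edge = link_pred_decoder(Z, 0, 1)
```

Training the encoder and one decoder end-to-end specializes `Z` for that decoder's task; the transfer experiment in Figure 1 corresponds to freezing the trained encoder and fitting only a new head for a different task.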



Figure 1: Performance drop when transferring node embeddings between tasks on (a) Node Classification (NC), (b) Graph Classification (GC), and (c) Link Prediction (LP) on the ENZYMES dataset. On the horizontal axis, "x -> y" indicates that the embeddings obtained from a model trained on task x are used to train a network for task y.

