ROBUST GRAPH REPRESENTATION LEARNING VIA PREDICTIVE CODING

Abstract

Graph neural networks have recently shown outstanding results in diverse types of tasks in machine learning, providing interdisciplinary state-of-the-art performance on structured data. However, they have been shown to be vulnerable to imperceptible adversarial attacks and unfit for out-of-distribution generalisation. Here, we address this problem by introducing a novel message-passing scheme based on the theory of predictive coding, an energy-based alternative to back-propagation that has its roots in neuroscience. As both graph convolution and predictive coding can be seen as low-pass filtering mechanisms, we postulate that predictive coding adds a second efficient filter to the message-passing process, which enhances the robustness of the learned representations. Through an extensive set of experiments, we show that the proposed model attains performance comparable to its graph convolutional network counterpart, delivering strictly better performance on inductive tasks. Most importantly, we show that the energy minimisation enhances the robustness of the produced representations and can be leveraged to further calibrate our models, providing representations that are more robust against advanced graph adversarial attacks.

1. INTRODUCTION

Extracting information from structured data has always been an active area of research in machine learning. This, combined with the rise of deep neural networks as the main model class of the field, has led to the development of graph neural networks (GNNs). These models have achieved outstanding results in diverse types of tasks, providing interdisciplinary state-of-the-art performance in areas such as e-commerce and financial fraud detection (Zhang et al., 2022; Wang et al., 2019), drug and advanced material discovery (Bongini et al., 2021; Zhao et al., 2021; Xiong et al., 2019), recommender systems (Wu et al., 2021), and social networks (Liao et al., 2018). Their power lies in a message-passing mechanism among the vertices of a graph, performed iteratively at different levels of the hierarchy of a deep network. Popular examples of these models are graph convolutional networks (GCNs) (Welling & Kipf, 2016) and graph attention networks (Veličković et al., 2017). Despite the results obtained in recent years, these models have been shown to lack robustness: they are vulnerable to carefully crafted, imperceptible adversarial attacks (Dai et al., 2018; Zügner et al., 2018; Zügner & Günnemann, 2019; Günnemann, 2022) and unfit for out-of-distribution generalisation (Hu et al., 2020). This prevents GNNs from being used in critical tasks, where misleading predictions may lead to serious consequences, or where maliciously manipulated signals may cause the loss of large amounts of money. More generally, robustness has always been a problem for deep learning models, highlighted by the famous example of a panda picture being classified as a gibbon with almost perfect confidence after the addition of a small amount of adversarial noise (Akhtar & Mian, 2018).
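To make the message-passing mechanism concrete, the following is a minimal sketch of a single GCN propagation step in the style of Welling & Kipf (2016): the adjacency matrix is augmented with self-loops, symmetrically normalised, and used to average neighbour features before a linear transform and nonlinearity. The function name, toy graph, and weights are illustrative, not part of the original work.

```python
import numpy as np

def gcn_layer(adj, features, weights):
    """One graph-convolution step: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W).

    This low-pass filtering view (neighbour averaging) is what the text
    refers to when calling graph convolution a smoothing mechanism.
    """
    n = adj.shape[0]
    adj_hat = adj + np.eye(n)                 # add self-loops
    deg = adj_hat.sum(axis=1)                 # degrees of A + I
    d_inv_sqrt = np.diag(deg ** -0.5)         # D^{-1/2}
    norm_adj = d_inv_sqrt @ adj_hat @ d_inv_sqrt
    return np.maximum(norm_adj @ features @ weights, 0.0)  # ReLU

# Toy 3-node path graph with one-hot features (hypothetical example)
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.eye(3)
W = np.ones((3, 2))
H = gcn_layer(A, X, W)
print(H.shape)  # (3, 2): one 2-dimensional embedding per node
```

Stacking several such layers, each with its own weight matrix, yields the iterative, hierarchical message passing described above.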
To address this problem, an influential work has shown that it is possible to treat a classifier as an energy-based generative model and train the joint distribution of a data point and its label to improve robustness and calibration (Grathwohl et al., 2019). Motivated by this result, this work studies the robustness of GNNs trained using an energy-based training algorithm called predictive coding (PC), originally developed to model information processing in hierarchical generative networks in the neocortex (Rao & Ballard, 1999). Despite not being initially developed to perform machine learning tasks, recent works have analysed possible applications of PC in deep learning. This is motivated by inter-

