AGENT-BASED GRAPH NEURAL NETWORKS

Abstract

We present AgentNet, a novel graph neural network designed specifically for graph-level tasks. AgentNet is inspired by sublinear algorithms, featuring a computational complexity that is independent of the graph size. The architecture of AgentNet differs fundamentally from that of traditional graph neural networks: a collection of trained neural agents intelligently walks the graph and then collectively decides on the output. We provide an extensive theoretical analysis of AgentNet: we show that the agents can learn to systematically explore their neighborhood and that AgentNet can distinguish some structures that are indistinguishable even by 2-WL. Moreover, AgentNet is able to separate any two graphs that are sufficiently different in terms of subgraphs. We confirm these theoretical results with synthetic experiments on hard-to-distinguish graphs and with real-world graph classification tasks. In both cases, we compare favorably not only to standard GNNs but also to computationally more expensive GNN extensions.

1. INTRODUCTION

Graphs and networks are prominent tools for modeling various kinds of data in almost every branch of science. As a result, graph classification problems play a crucial role in a wide range of applications, from biology to social science. In many of these applications, the success of algorithms is often attributed to recognizing the presence or absence of specific substructures, e.g. atomic groups in the case of molecule and protein functions, or cliques in the case of social networks [10; 77; 21; 23; 66; 5]. This suggests that some parts of the graph are "more important" than others, and hence an essential aspect of any successful classification algorithm is to find and focus on these parts.

In recent years, Graph Neural Networks (GNNs) have been established as one of the most prominent tools for graph classification tasks. Traditionally, all successful GNNs are based on some variant of the message passing framework [3; 69]. In these GNNs, all nodes in the graph exchange messages with their neighbors for a fixed number of rounds, and then the outputs of all nodes are combined, usually by summing them [27; 52], to make the final graph-level decision. It is natural to wonder whether all of this computation is actually necessary. Furthermore, since traditional GNNs are also known to have strong limitations in terms of expressiveness, recent works have developed a range of more expressive GNN variants; these usually come with an even higher computational complexity, while often still failing to recognize some simple substructures. This complexity makes such expressive GNNs problematic even for graphs with hundreds of nodes, and potentially infeasible when we need to process graphs with thousands of nodes or more. However, graphs of this size are common in many applications, e.g. proteins [65; 72], large molecules [79], or social graphs [7; 5].

In light of all this, we propose to move away from traditional message passing and approach graph-level tasks differently. We introduce AgentNet, a novel GNN architecture specifically focused on these tasks. AgentNet is based on a collection of trained neural agents that intelligently walk the graph and then collectively classify it (see Figure 1). These agents can retrieve information from the node they currently occupy, from its neighboring nodes, and from other agents occupying the same node. This information is used to update both the agent's state and the state of the occupied node. Finally, the agent chooses a neighboring node to transition to, based on its own state and the states of the neighboring nodes. As we will show later, even with a very naive policy, an agent can already recognize cliques and cycles, which is impossible with traditional GNNs.

One of the main advantages of AgentNet is that its computational complexity depends only on the node degree, the number of agents, and the number of steps. This means that if a specific graph problem does not require the entire graph to be observed, our model can often solve it using fewer than n operations, where n is the number of nodes. The study of such sublinear algorithms is a popular topic in graph mining [35; 26]; it is known that many relevant tasks can be solved in a sublinear manner. For example, our approach can recognize whether one graph has more triangles than another, or estimate the frequency of certain substructures in the graph, in sublinear time.

AgentNet also has a strong advantage in settings where, for example, the relevant nodes for our task can be easily recognized based on their node features. In these cases, an agent can learn to walk only along these nodes, collecting only information that is relevant to the task at hand. The amount of collected information increases linearly with the number of steps. In contrast, a standard message-passing GNN always (indirectly) processes the entire multi-hop neighborhood around each node, and it is therefore often difficult to identify the useful part of this neighborhood's information due to oversmoothing or oversquashing effects [46; 2] caused by the exponential growth of aggregated information with the number of steps. One popular approach that can partially combat this is attention [70], as it allows for soft gating of node interactions. While our approach also uses attention for agent transition sampling, the transitions are hard. More importantly, these agent transitions reduce the computational complexity and increase model expressiveness, neither of which standard attention models provide.
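To make the read-update-transition step of an agent concrete, the following is a minimal PyTorch sketch of one step for a single agent. The module and tensor names (AgentStep, update_agent, update_node, score) are hypothetical, and the actual AgentNet parameterization may combine states and score transitions differently; this is only an illustration of the mechanism described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AgentStep(nn.Module):
    """Illustrative sketch of one AgentNet-style step for a single agent.
    Hypothetical layer choices; not the paper's exact parameterization."""

    def __init__(self, dim):
        super().__init__()
        self.update_agent = nn.GRUCell(dim, dim)  # agent state updated from the occupied node
        self.update_node = nn.GRUCell(dim, dim)   # occupied node updated (marked) by the agent
        self.score = nn.Linear(2 * dim, 1)        # attention logit per neighbor

    def forward(self, agent_state, node_states, pos, adj_list):
        # 1) Read the occupied node and update the agent's state.
        agent_state = self.update_agent(node_states[pos].unsqueeze(0),
                                        agent_state.unsqueeze(0)).squeeze(0)
        # 2) Write back to (mark) the occupied node (in-place for clarity of the sketch).
        node_states[pos] = self.update_node(agent_state.unsqueeze(0),
                                            node_states[pos].unsqueeze(0)).squeeze(0)
        # 3) Score the neighbors against the agent state and sample a hard transition.
        neighbors = adj_list[pos]                                  # list of neighbor indices
        pairs = torch.cat([agent_state.expand(len(neighbors), -1),
                           node_states[neighbors]], dim=-1)
        probs = F.softmax(self.score(pairs).squeeze(-1), dim=0)
        new_pos = neighbors[torch.multinomial(probs, 1).item()]
        return agent_state, node_states, new_pos
```

In this sketch the transition is a hard, sampled move to a single neighbor, so the per-step cost depends only on the degree of the occupied node rather than on the graph size.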



Figure 1: AgentNet architecture. We have many neural agents walking the graph (a). At every step, each agent records information on the node, investigates its neighborhood, and makes a probabilistic transition to another neighbor (b). If the agent has walked a cycle (c) or a clique (d), it can notice this.
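As a toy illustration of panels (c) and (d), and of the earlier claim that even a very naive policy suffices for cycles and cliques, the sketch below assumes an agent that simply remembers which nodes it has visited: returning to a previously visited node exposes a closed walk, and pairwise adjacency among the visited nodes exposes a clique. The helper name walk_and_check and the uniform random policy are assumptions for illustration only, not part of AgentNet.

```python
import random

def walk_and_check(adj, start, steps, seed=0):
    """Toy naive-policy walk: adj maps each node to a set of neighbors.
    Returns what the agent can 'notice' about cycles and cliques, if anything."""
    rng = random.Random(seed)
    visited = [start]
    pos, prev = start, None
    for _ in range(steps):
        options = sorted(adj[pos] - {prev}) or sorted(adj[pos])  # avoid backtracking when possible
        prev, pos = pos, rng.choice(options)                     # naive uniform transition
        if pos in visited:                                       # walked back into its own trail
            cycle_len = len(visited) - visited.index(pos)        # edges in the closed walk
            clique = all(v in adj[u]
                         for u in visited for v in visited if u != v)
            return {"cycle_length": cycle_len, "visited_is_clique": clique}
        visited.append(pos)
    return None

# Example: on a triangle the agent closes a 3-cycle and sees that the visited nodes form a clique.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(walk_and_check(triangle, start=0, steps=5))
```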

Xu et al. [75] and Morris et al. [53] established the equivalence of message passing GNNs and the first-order Weisfeiler-Lehman (1-WL) test. This spurred research into more expressive GNN architectures. Sato et al. [64] and Abboud et al. [1] proposed to use random node features for unique node identification. As pointed out by Loukas [48], message passing GNNs with truly unique identifiers are universal. Unfortunately, such methods generalize poorly to new graphs [59]. Vignac et al. [71] propose propagating matrices of order equal to the graph size as messages instead of vectors, to achieve a permutation-equivariant unique identification scheme. Other possible expressiveness-improving node feature augmentations include distance encoding [45], spectral features [4; 25], or orbit counts [11]. However, such methods require domain knowledge to choose which structural information to encode, and pre-computing the required information can also be quite expensive [11]. An alternative is to work directly with higher-order graph representations [53; 51], which can bring k-WL expressiveness at the cost of operating on k-th order graph representations. To improve this, methods that consider only a part of the higher-order interactions have been proposed [55; 56].

