

Abstract

Graph neural networks (GNNs) have shown broad applicability in a variety of domains. Some of these domains, such as social networks and product recommendations, are fertile ground for malicious users and behavior. In this paper, we show that GNNs are vulnerable to the extremely limited scenario of a single-node adversarial example, where the attacker cannot even choose which node to control. That is, an attacker can force the GNN to classify any target node as a chosen label by only slightly perturbing a single, arbitrary node in the graph. When the adversary is allowed to pick a specific attacker node, the attack is even more effective. We show that this attack is effective across various GNN types (e.g., GraphSAGE, GCN, GAT, and GIN), across a variety of real-world datasets, and as both a targeted and a non-targeted attack. Our code is available anonymously at https://github.com/gnnattack/SINGLE.

1. Introduction

Graph neural networks (GNNs) (Scarselli et al., 2008; Micheli, 2009) have recently shown sharply increasing popularity due to their generality and computational efficiency (Duvenaud et al., 2015; Li et al., 2016; Kipf & Welling, 2017; Hamilton et al., 2017; Veličković et al., 2018; Xu et al., 2019b). Graph-structured data underlie a plethora of domains such as citation networks (Sen et al., 2008), social networks (Leskovec & Mcauley, 2012; Ribeiro et al., 2017; 2018), knowledge graphs (Wang et al., 2018; Trivedi et al., 2017; Schlichtkrull et al., 2018), and product recommendations (Shchur et al., 2018). GNNs are therefore applicable to a wide variety of real-world structured data. While most work in this field has focused on improving the accuracy of GNNs and applying them to a growing number of domains, only a few past works have explored the vulnerability of GNNs to adversarial examples.

Consider the following scenario: a malicious user joins a social network such as Twitter or Facebook. The malicious user mimics the behavior of a benign user, establishes connections with other users, and submits benign posts. After some time, the user submits a new, adversarially crafted post, which might seem irregular but overall benign. Since the GNN represents every user according to all of the user's posts, this new post perturbs the representation of the user as seen by the GNN. As a result, another, specific benign user gets blocked from the network; alternatively, another malicious user submits a hateful post, but does not get blocked. This scenario is illustrated in Figure 1.

In this paper, we show the feasibility of such a troublesome scenario: a single attacker node can perturb its own representation such that another node will be misclassified as a label of the attacker's choice. Most previous work on adversarial examples in GNNs required the perturbation to span multiple nodes, which in reality requires the cooperation of multiple attackers.
For example, the pioneering work of Zügner et al. (2018) perturbed a set of attacker nodes; Bojchevski & Günnemann (2019a) perturb edges that are covered by a set of nodes. Further, and in contrast with existing work, we show that perturbing a single node is more harmful than perturbing a single edge. In this paper, we present the first single-node adversarial attack on graph neural networks. If the adversary is allowed to choose the attacker node, for example by hacking into an existing account, the effectiveness of the attack increases significantly. We present two approaches for choosing the attacker node: a white-box, gradient-based approach, and a black-box, model-free approach that relies on graph topology. Finally, we perform a comprehensive experimental evaluation of our approach on multiple datasets and GNN architectures.
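As a concrete illustration of the white-box setting, the sketch below shows a generic FGSM-style step that perturbs only the attacker node's row of the feature matrix. This is a simplified stand-in for a single-node feature attack, not the paper's exact SINGLE algorithm; the function name is ours, and the `grad` argument (the gradient of the target node's classification loss with respect to the feature matrix) is assumed to be supplied by white-box access to the victim model.

```python
import numpy as np

def single_node_fgsm(X, attacker_idx, grad, epsilon=0.1):
    """Perturb only the attacker node's feature row by an FGSM-style step.

    X: (N, D) node feature matrix.
    attacker_idx: index of the single node the adversary controls.
    grad: (N, D) gradient of the target node's loss w.r.t. X,
          assumed to come from the victim model (white-box access).
    Only row `attacker_idx` is modified, matching the single-node
    threat model; all other nodes' features stay untouched.
    """
    X_adv = X.copy()
    X_adv[attacker_idx] += epsilon * np.sign(grad[attacker_idx])
    return X_adv
```

In an untargeted attack the step would ascend the gradient of the correct-label loss (as above); for a targeted attack, one would instead descend the gradient of the loss toward the attacker's chosen label.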

2. Preliminaries

Let 𝒢 = {G_i}_{i=1}^{N_𝒢} be a set of graphs. Each graph G = (V, E, X) ∈ 𝒢 has a set of nodes V and a set of edges E ⊆ V × V, where (u, v) ∈ E denotes an edge from a node u ∈ V to a node v ∈ V. X ∈ R^{N×D} is a matrix of D-dimensional node features. The i-th row of X is the feature vector of the node v_i ∈ V and is denoted as x_i.

Graph neural networks. GNNs operate by iteratively propagating neural messages between neighboring nodes. Every GNN layer updates the representation of every node by aggregating its current representation with the current representations of its neighbors. Formally, each node v is associated with an initial representation h_v^(0) = x_v ∈ R^{d_0}. This representation is considered as the given features of the node. Then, a GNN layer updates each node's representation given its neighbors, yielding h_v^(1) ∈ R^{d_1} for every v ∈ V. In general, the ℓ-th layer of a GNN is a function that updates a node's representation by combining it with its neighbors' representations:

h_v^(ℓ) = COMBINE^(ℓ)( h_v^(ℓ-1), { h_u^(ℓ-1) | u ∈ N_v } ),

where N_v is the set of direct neighbors of v: N_v = { u ∈ V | (u, v) ∈ E }. The COMBINE function is what mostly distinguishes GNN types. For example, graph convolutional networks (GCN) (Kipf & Welling, 2017) define a layer as:

h_v^(ℓ) = σ( Σ_{u ∈ N_v ∪ {v}} c_{u,v} · W^(ℓ) h_u^(ℓ-1) ),

where σ is a non-linearity such as ReLU, W^(ℓ) is a learned weight matrix, and c_{u,v} is a normalization factor usually set to 1 / √(|N_u| · |N_v|). After ℓ such aggregation iterations, every node representation captures information from all nodes within its ℓ-hop neighborhood. The total number of layers L is usually determined empirically as a hyperparameter. In the node classification scenario, we use the final representation h_v^(L) to classify v.

For brevity, we focus our definitions on the semi-supervised transductive node classification goal, where the dataset contains a single graph G, and the split into training and test sets is across nodes in the same graph. Nonetheless, these definitions can be trivially generalized to the inductive setting, where the dataset contains multiple graphs, the split into training and test sets is between graphs, and the test nodes are unseen during training.
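The GCN propagation rule above can be sketched in a few lines. This is a minimal NumPy illustration of the symmetrically normalized aggregation, not an implementation from the paper; the function name is ours, and self-loops are added so that a node's own representation participates in the sum, as in the formula.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W).

    A: (N, N) binary adjacency matrix.
    H: (N, d_in) current node representations h^(l-1).
    W: (d_in, d_out) learned weight matrix W^(l).
    """
    A_hat = A + np.eye(A.shape[0])           # add self-loops: N_v ∪ {v}
    deg = A_hat.sum(axis=1)                  # node degrees incl. self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # entry (v, u) of A_norm is c_{u,v} = 1 / sqrt(|N_u| * |N_v|)
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)   # ReLU non-linearity
```

Stacking L such calls (with per-layer weight matrices) yields the L-hop receptive field described above.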
Given the training set, the goal is to learn a model f_θ : (G, V) → Y that will classify the rest of the nodes correctly. During training, the model f_θ thus minimizes the loss over the given labels, using a per-node loss J(·, ·), which typically is the cross-entropy loss:

L(θ) = Σ_{v ∈ V_train} J( f_θ(G, v), y_v ),

where V_train ⊆ V is the set of labeled training nodes and y_v is the ground-truth label of node v.
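The transductive training loss can be sketched as follows: the model produces logits for every node in the single graph, but J is averaged only over the labeled training nodes. This is a minimal NumPy illustration with names of our own choosing, assuming the logits are class scores computed from the final representations h_v^(L).

```python
import numpy as np

def masked_cross_entropy(logits, labels, train_mask):
    """Average J(f_theta(G, v), y_v) over the training nodes only.

    logits: (N, C) class scores for all N nodes in the graph.
    labels: (N,) integer class ids y_v.
    train_mask: (N,) boolean array selecting the labeled nodes V_train.
    """
    z = logits - logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]      # per-node cross-entropy
    return nll[train_mask].mean()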

