MOTIF-INDUCED GRAPH NORMALIZATION

Abstract

Graph Neural Networks (GNNs) have emerged as a powerful class of learning architectures for handling graph-structured data. However, existing GNNs typically follow the neighborhood aggregation scheme and ignore the structural characteristics of node-induced subgraphs, which limits their expressiveness for downstream tasks at both the graph and node levels. In this paper, we strive to strengthen the general discriminative capability of GNNs by devising a dedicated plug-and-play normalization scheme, termed Motif-induced Normalization (MotifNorm), that explicitly considers the intra-connection information within each node-induced subgraph. To this end, we embed motif-induced structural weights at the beginning and the end of the standard BatchNorm, and incorporate graph instance-specific statistics for improved discriminability. We also provide a theoretical analysis showing that MotifNorm helps alleviate the over-smoothing issue, which is conducive to designing deeper GNNs. Experimental results on eight popular benchmarks, covering graph-, node-, and link-level property prediction, demonstrate the effectiveness of the proposed method. Our code is made available in the supplementary material.

1. INTRODUCTION

In recent years, Graph Neural Networks (GNNs) have emerged as the mainstream deep learning architectures for analyzing irregular samples in which information is present in the form of graphs; they usually employ a message-passing aggregation mechanism to encode node features from local neighborhood representations (Kipf & Welling, 2017; Veličković et al., 2018; Xu et al., 2019; Yang et al., 2020b; Hao et al., 2021; Dwivedi et al., 2022b). As a powerful class of graph-relevant networks, these architectures have shown encouraging performance in various domains such as cell clustering (Li et al., 2022; Alghamdi et al., 2021), chemical prediction (Tavakoli et al., 2022; Zhong et al., 2022), social networks (Bouritsas et al., 2022; Dwivedi et al., 2022b), traffic networks (Bui et al., 2021; Li & Zhu, 2021), combinatorial optimization (Schuetz et al., 2022; Cappart et al., 2021), and power grids (Boyaci et al., 2021; Chen et al., 2022a). However, the commonly used message-passing mechanism, i.e., aggregating representations from neighborhoods, limits the expressive capability of GNNs in the presence of the subtree-isomorphic phenomenon prevalent in the real world (Wijesinghe & Wang, 2022). As shown in Figure 1(a), the subgraphs S_{v1} and S_{v2} induced by v_1 and v_2 are subtree-isomorphic, which decreases the GNNs' expressivity in both graph-level and node-level prediction: (1) Graph-level: straightforward neighborhood aggregation, which ignores the characteristics of node-induced subgraphs, renders subtree-isomorphic cases completely indistinguishable, so the expressivity of such GNNs is bottlenecked by the Weisfeiler-Leman (WL) test (Weisfeiler & Leman, 1968). (2) Node-level: under over-smoothing (as illustrated in Figure 1(b)), the smoothing among the root representations of subtree-isomorphic substructures becomes worse when similar representations are aggregated from their neighborhoods without the structural characteristics being considered.

In this paper, we strive to develop a general framework that compensates for the ignored characteristics of node-induced structures and thereby improves graph expressivity over prevalent message-passing GNNs on various downstream tasks, e.g., graph, node, and link prediction. Motivated by the fact that deep models usually follow the CNA architecture, i.e., a stack of convolution, normalization, and activation, where the normalization module generally follows the GNN convolution operation, we focus on developing a generalized normalization scheme of higher expressivity to enhance the discriminative abilities of various GNN architectures. The question is thus: "how can we design such a powerful and general normalization module with the characteristics of node-induced substructures embedded?"

To address this challenge, this paper devises an innovative normalization mechanism, termed Motif-induced Normalization (MotifNorm), that explicitly considers the intra-connection information in each node-induced subgraph (i.e., motif (Leskovec, 2021)) and embeds the resulting structural factors into the normalization module to improve the expressive power of GNNs. Specifically, we start by empirically disentangling the standard normalization into two stages, i.e., centering & scaling (CS) and affine transformation (AT) operations.
We then concentrate on mining the intra-connection information in the node-induced subgraphs, and develop two elaborate strategies, termed representation calibration and representation enhancement, to embed the obtained structural information into the CS and AT operations (a minimal sketch of this decomposition is given after the list below). Eventually, we demonstrate through extensive experimental analysis that MotifNorm generally improves the GNNs' expressivity on graph, node, and link prediction tasks. In sum, the contributions of this work can be summarized as follows:

• Driven by the conjecture that a higher-expressivity normalization with rich graph-structural power can generally strengthen the GNNs' performance, we develop a novel normalization scheme, termed MotifNorm, to embed structural information into GNN aggregation.

• We develop two elaborate strategies, i.e., representation calibration and representation enhancement, tailored to embed the motif-induced structural factor into the CS and AT operations that constitute MotifNorm in GNNs.

• We provide extensive experimental analysis on eight popular benchmarks across various domains, covering graph, node, and link property prediction, demonstrating that the proposed model is efficient and consistently yields encouraging results. It is worth mentioning that MotifNorm maintains computational simplicity, which is beneficial to model training for highly complicated tasks.
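To make the CS/AT disentanglement concrete, the following PyTorch-style sketch shows where the two stages sit inside a standard BatchNorm and where structural weights could enter, i.e., before CS (representation calibration) and after AT (representation enhancement), as stated in the abstract. The weight w is a placeholder: the actual motif-induced factor, and the graph instance-specific statistics MotifNorm additionally incorporates, are defined in the paper's method section and are not reproduced here.

```python
import torch
import torch.nn as nn

class MotifNormSketch(nn.Module):
    """Illustrative sketch only: BatchNorm disentangled into centering &
    scaling (CS) and affine transformation (AT), with placeholder
    structural weights applied before CS and after AT. The real
    MotifNorm computes motif-induced weights and also uses graph
    instance-specific statistics, both omitted here."""

    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        # AT parameters, exactly as in standard BatchNorm.
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))

    def forward(self, x, w):
        # x: [num_nodes, num_features] node representations in a batch
        # w: [num_nodes, 1] placeholder structural weights derived from
        #    each node's induced subgraph
        x = w * x                                        # calibration, before CS
        mu = x.mean(dim=0, keepdim=True)
        var = x.var(dim=0, unbiased=False, keepdim=True)
        x_hat = (x - mu) / torch.sqrt(var + self.eps)    # CS stage
        out = self.gamma * x_hat + self.beta             # AT stage
        return w * out                                   # enhancement, after AT
```

Setting w to all-ones recovers plain BatchNorm, which is what makes the scheme a plug-and-play replacement in the CNA stack.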

2. RELATED WORKS

In this section, we briefly review existing normalization architectures for GNNs, which are commonly specific to the type of downstream task, i.e., graph-level or node-level tasks.



Figure 1: (a) Illustration of the subtree-isomorphic phenomenon, where the two subgraphs S_{v1} and S_{v2} are induced by the root nodes v_1 and v_2 with the same degree k = 4, but the connection information among their neighborhoods differs. (b) t-SNE illustration of the over-smoothing issue on the Cora dataset when the number of GraphSAGE layers reaches 20. Here, we show the first three categories for visualization.
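A classic example of the WL bottleneck behind Figure 1(a) can be checked in a few lines; the sketch below is our own illustration, not code from the paper. The 6-cycle and the disjoint union of two triangles are both 2-regular, so 1-WL color refinement, and hence any message-passing GNN whose expressivity is bounded by the 1-WL test, assigns the two non-isomorphic graphs identical color histograms.

```python
# Minimal 1-WL color-refinement demo (illustrative only).
from collections import Counter

def wl_histogram(adj, rounds=3):
    """Run 1-WL color refinement and return the final color histogram."""
    colors = {v: 0 for v in adj}  # uniform initial colors (no node features)
    for _ in range(rounds):
        # New color = (own color, multiset of neighbor colors), relabeled.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        relabel = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: relabel[sigs[v]] for v in adj}
    return Counter(colors.values())

cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}  # C6
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],           # 2 x C3
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}

print(wl_histogram(cycle6) == wl_histogram(two_triangles))  # True
```

Because the refinement never separates the two graphs, no amount of plain neighborhood aggregation can either; breaking such ties requires injecting information about intra-neighborhood connectivity, which is precisely the gap MotifNorm targets.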

Graph-level Normalization. To address graph-level representation learning, Xu et al. (2019) adopt the standard BatchNorm (Ioffe & Szegedy, 2015) module in GIN as a plug-in component to stabilize model training. Building on BatchNorm, Dwivedi et al. (2022a) normalize node features with respect to the graph size to resize the feature space, proposing ExpreNorm. To address the expressiveness degradation of GNNs on highly regular graphs, Cai et al. (2021) propose GraphNorm, which adds a learnable parameter for each feature dimension based on instance normalization. To leverage the advances of different normalizations, Chen et al. (2022c) propose UnityNorm by op-

