GRETO: REMEDYING DYNAMIC GRAPH TOPOLOGY-TASK DISCORDANCE VIA TARGET HOMOPHILY

Abstract

Dynamic graphs are ubiquitous across disciplines where observations change over time. Regression on dynamic graphs underpins diverse critical tasks, such as climate early warning and traffic control. Existing homophily Graph Neural Networks (GNNs) adopt physical connections or feature similarity as the adjacency matrix to perform node-level aggregation. However, on dynamic graphs with diverse node-wise relations, exploiting a pre-defined fixed topology for message passing inevitably leads to the aggregation of target-deviated neighbors. We designate this phenomenon as topology-task discordance, which naturally challenges the homophily assumption. In this work, we revisit node-wise relationships and explore novel homophily measurements on dynamic graphs with both signs and distances, capturing multiple node-level spatial relations and temporal evolutions. We discover that advancing homophily aggregation to signed target-oriented message passing can effectively resolve the discordance and promote aggregation capacity. We therefore propose GReTo, which performs signed message passing in the immediate neighborhood and exploits both local environments and target awareness to realize high-order message propagation. Empirically, our solution achieves significant improvements over the best baselines, notably 24.79% on KnowAir and 3.60% on Metr-LA.

1. INTRODUCTION

Graph-structured data mining has become a popular technique in numerous disciplines, such as social networks (You et al., 2022), road networks (Chen et al., 2020), and molecule analysis (Abu-El-Haija et al., 2019). However, existing solutions to graph mining usually assume homophily, i.e., that connected nodes tend to share similar features or have the same labels (targets). In real-world graphs, the homophily assumption does not always hold (Zhu et al., 2020). Thus, Graph Neural Networks (GNNs) that consider heterophily have been proposed to break the homophily assumption. They disentangle the complex neighborhood components (Ma et al., 2019; Du et al., 2022) and model edge diversity (Zhu et al., 2021a; Wang et al., 2022a) by separately aggregating similar and dissimilar signals (Bo et al., 2021; Yan et al., 2021). Despite these achievements, heterophily GNNs have mostly been investigated on classification tasks over static graphs and are less explored for node-level regression over dynamic graphs. This leaves an opportunity to dissect how edge-type disentanglement boosts regression capacity on dynamic graphs.

Regression tasks are more challenging than classification, as the latter only considers discrete labels and tolerates larger errors (Wang et al., 2022b). Moreover, nodes in dynamic graphs are more prone to complex neighborhood distributions (Ma et al., 2022) due to time-varying values and different edge types, which incurs misleading message passing when target-deviated neighbors are aggregated. We formally designate this misleading message passing as topology-task discordance (see Fig. 1(a)). We take traffic volumes on road networks as an intuitive example of edge diversity in Fig. 1(b)-(c).
The neighboring intersections sharing an upstream-downstream connectivity can be positively correlated, while intersections located in parallel on the same Origin-Destination transition tend to be negatively correlated because they compete for the same flow. These two edge types respectively account for the homophily and heterophily components, and their correlations also change over time with tidal patterns. Consequently, uniform aggregation over both types of neighbors introduces interfering noise and deteriorates the performance of GNNs, as not all neighbors evolve in a direction consistent with the targets. Empirically, on four real-world dynamic graphs, low homophily ratios both within graph frames (intra-graph) and across temporally adjacent frames (Fig. 1(d)) imply that physically connected nodes do not necessarily have close observations (footnote 0) or the same variation directions. This not only supports the argument of topology-task discordance but also demonstrates the universality of this phenomenon across different dynamic graphs (footnote 1). Furthermore, due to heterogeneous local structures and neighborhood distributions, the topology-task discordance can propagate to high-order neighborhoods. Hence, the optimal receptive fields should be adaptively and efficiently constructed to realize controllable neighborhood aggregation and avoid introducing noise. Therefore, for dynamic graphs, remedying topology-task discordance in both immediate and high-order neighborhoods is urgently desired.

Challenges. Given the spatial-temporal property of node-wise relationships, adapting existing homophily theory to address such topology-task discordance is still challenging.
The key obstacles can be summarized as 1) how to determine which pairs of nodes belong to homophily components without categorical labels, 2) how to involve targets in reconstructing node-wise correlations, thus materializing remedied and powerful target-oriented aggregation, and 3) how to exploit the dynamic local neighborhood environments to achieve personalized high-order propagation.

Present work. Our work empirically and theoretically elucidates the existence of topology-task discordance and explains the failure of homophily GNNs on dynamic graphs. To resolve this dilemma, we propose a novel GNN to Remedy Topology-task discordance via Target-homophily (GReTo). Firstly, we extend the node-wise relations to a triple based on signed-distance proximity, and construct two measures, intra-graph and inter-graph homophily, to capture diverse spatial-temporal relations and overcome the lack of categorical labels. Secondly, by introducing target awareness with a transition homophily predictor, we incorporate the two signed homophily measures to facilitate target-oriented message passing, restoring the effectiveness of GNNs. Finally, instead of imposing an inefficient nested or bi-level optimization (Xiao et al., 2021), we devise an adaptive layer-wise importance measurement that extends immediate neighborhood aggregation to high-order propagation by identifying the informativeness of each propagation step relative to the expected targets. We evaluate our solution on four dynamic graphs and achieve 3.20% to 24.79% improvements over baselines on MAPE, where KnowAir (↑ 24.79%), which has higher intra-graph negative heterophily ratios (Tab. 4), especially benefits from flexible signed message passing.

Contributions. (1) We formalize a dynamic graph homophily theory that jointly characterizes multi-type node-wise relations and accounts for their spatial-temporal property.
(2) On dynamic graphs, we analyze the topology-task discordance, from establishing its existence to resolving it through personalized high-order propagation. (3) We propose GReTo, consisting of signed target-oriented message passing and layer-importance-based high-order propagation, to refine the topology so that it adapts to downstream regression tasks.
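
The traffic example and the two kinds of homophily ratio discussed above can be illustrated with a small sketch. It is not the formal definition used by GReTo: the function names, the tolerance `tol`, the toy volumes, and the hand-assigned edge signs are all assumptions made for illustration. The sketch counts how often connected nodes have close observations within a frame and how often they move in the same direction between frames, and then contrasts uniform aggregation with signed aggregation for one node.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def sign(v: float) -> int:
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def intra_frame_ratio(frame: List[float], edges: List[Edge], tol: float) -> float:
    """Fraction of edges whose endpoints have observations within `tol` (assumed measure)."""
    close = sum(1 for i, j in edges if abs(frame[i] - frame[j]) <= tol)
    return close / len(edges)


def inter_frame_ratio(prev: List[float], curr: List[float], edges: List[Edge]) -> float:
    """Fraction of edges whose endpoints changed in the same direction (assumed measure).

    Two unchanged endpoints (both signs 0) also count as agreeing.
    """
    same = sum(
        1 for i, j in edges
        if sign(curr[i] - prev[i]) == sign(curr[j] - prev[j])
    )
    return same / len(edges)


def aggregate(deltas: List[float], neighbors: List[int], signs: List[int]) -> float:
    """Mean of neighbor changes, each multiplied by the sign of its edge."""
    return sum(s * deltas[j] for j, s in zip(neighbors, signs)) / len(neighbors)


# Hypothetical toy road network: node 0 is upstream of node 1, while
# nodes 1 and 2 are parallel routes competing for the same flow.
edges = [(0, 1), (1, 2)]
prev = [100.0, 90.0, 40.0]   # traffic volumes at frame t-1
curr = [120.0, 108.0, 25.0]  # traffic volumes at frame t
deltas = [c - p for c, p in zip(curr, prev)]  # [20.0, 18.0, -15.0]

print(intra_frame_ratio(curr, edges, tol=15.0))  # share of edges with close values
print(inter_frame_ratio(prev, curr, edges))      # share of edges moving together

# Estimate the change at node 1 (true change: 18.0) from its neighbors 0 and 2.
uniform = aggregate(deltas, [0, 2], [1, 1])      # treats both edges as homophilous
signed = aggregate(deltas, [0, 2], [1, -1])      # flips the competing neighbor
print(uniform, signed)
```

In this toy case the uniform mean of the neighbor changes lands far from the true change at node 1, while flipping the sign of the competing neighbor brings the estimate close to it. This is the intuition behind signed message passing, although the edge signs here are assigned by hand rather than learned.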

2. RELATED WORK

GNNs have become a prominent tool for mining diverse graph-structured data (Song et al., 2022; Kipf & Welling, 2016; Abu-El-Haija et al., 2019). To boost the representation capacity of traditional



Footnote 0: Observations refer to the observed node features in the graph.
Footnote 1: Following the graph construction in existing literature (Yu et al., 2018; Guo et al., 2019; Li et al., 2018), we establish the edges between two nodes by selecting the top-5% geographically proximal nodes as neighbors.
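
One possible reading of the edge construction described in the footnote is sketched below. Several points are assumptions rather than details confirmed by the text: the selection is applied per node, proximity is measured with planar Euclidean distance instead of a geodesic distance, at least one neighbor is always kept, and the name `build_edges` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Edge = Tuple[int, int]


def build_edges(coords: List[Point], ratio: float = 0.05) -> List[Edge]:
    """Connect every node to its nearest `ratio` share of the other nodes.

    Assumption: the top-5% rule is applied per node, and at least one
    neighbor is kept even on very small graphs.
    """
    n = len(coords)
    k = max(1, math.ceil(ratio * (n - 1)))
    edges: List[Edge] = []
    for i in range(n):
        # Sort all other nodes by distance to node i (planar Euclidean).
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        edges.extend((i, j) for j in others[:k])
    return edges


# Hypothetical sensor locations on a line; with five nodes, 5% rounds up to
# a single nearest neighbor per node.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0), (10.0, 0.0)]
print(build_edges(coords))
```

The resulting edge list is directed, since node i choosing node j does not imply the reverse. A symmetric adjacency matrix would need an extra symmetrization step, which the footnote does not specify.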



Figure 1: Illustration of edge-type diversity and statistical graph homophily.

