INDUCTIVE REPRESENTATION LEARNING IN TEMPORAL NETWORKS VIA CAUSAL ANONYMOUS WALKS

Abstract

Temporal networks serve as abstractions of many real-world dynamic systems. These networks typically evolve according to certain laws, such as the law of triadic closure, which is universal in social networks. Inductive representation learning of temporal networks should be able to capture such laws and further be applied to systems that follow the same laws but have not been seen during the training stage. Previous works in this area depend on either network node identities or rich edge attributes and typically fail to extract these laws. Here, we propose Causal Anonymous Walks (CAWs) to inductively represent a temporal network. CAWs are extracted by temporal random walks and work as automatic retrieval of temporal network motifs to represent network dynamics, while avoiding the time-consuming selection and counting of those motifs. CAWs adopt a novel anonymization strategy that replaces node identities with the hitting counts of the nodes over a set of sampled walks, which keeps the method inductive while simultaneously establishing the correlation between motifs. We further propose a neural-network model, CAW-N, to encode CAWs, and pair it with a CAW sampling strategy with constant memory and time cost to support online training and inference. Evaluated on link prediction over 6 real temporal networks, CAW-N uniformly outperforms previous SOTA methods by an average of 15% AUC in the inductive setting; it also outperforms previous methods on 5 out of the 6 networks in the transductive setting.

1. INTRODUCTION

Temporal networks model dynamically interacting elements as nodes and interactions as temporal links, labeled with the times at which those interactions happen. Such temporal networks provide abstractions to study many real-world dynamic systems (Holme & Saramäki, 2012). Researchers have investigated temporal networks over recent decades and distilled many insightful laws that essentially reflect how these real-world systems evolve over time (Kovanen et al., 2011; Benson et al., 2016; Paranjape et al., 2017; Zitnik et al., 2019). For example, the law of triadic closure in social networks, which states that two nodes with common neighbors tend to interact with each other later, reflects how people establish social connections (Simmel, 1950). A more elaborate law, on the correlation between the interaction frequency of two individuals and the degree to which they share social connections, was demonstrated later (Granovetter, 1973; Toivonen et al., 2007). Feed-forward control loops, which consist of a direct interaction (from node w to node u) and an indirect interaction (from w through another node v to u), work as a law in the modulation of gene regulatory systems (Mangan & Alon, 2003) and also as a control principle of many engineering systems (Gorochowski et al., 2018). Although research on temporal networks has achieved the above successes, it can hardly be generalized to study more complicated laws: researchers would have to investigate an exponentially increasing number of patterns when incorporating more interacting elements, let alone their time-evolving aspects. Recently, representation learning, which learns vector representations of data with neural networks, has offered unprecedented possibilities to extract, albeit implicitly, more complex structural patterns (Hamilton et al., 2017b; Battaglia et al., 2018). However, as opposed to the study of static networks, representation learning of temporal networks is far from mature.
Two challenges concerning temporal networks have been frequently discussed. First, the entanglement of structural and temporal patterns requires an elegant model to digest information from both sides. Second, model scalability becomes even more crucial for temporal networks, as newly arriving links need to be processed in a timely manner while a huge link set, inflated by repeated links between the same node pairs, must be digested simultaneously.

In contrast to the above two challenges, a third challenge, the inductive capability of the temporal-network representation, is often ignored. However, it is equally important, if not more so, as inductive capability indicates whether a model indeed captures the dynamic laws of the systems and can be further generalized to systems that share the same laws but have not been used to train the model. These laws may depend only on structures, such as the triadic closure or feed-forward control loops mentioned above. They may also correlate with node attributes, such as interactions between people being affected by their gender and age (Kovanen et al., 2013). In both cases, however, the laws should be independent of network node identities. Although previous works attempt to learn inductive models by removing node identities (Trivedi et al., 2019; Xu et al., 2020), they run into other issues when inductively representing the dynamic laws, for which we leave a more detailed discussion to Sec. 2.
Here we propose Causal Anonymous Walks (CAWs) for modeling temporal networks. Our idea for inductive learning is inspired by recent investigations of temporal network motifs, i.e., connected subgraphs whose links appear within a restricted time range (Kovanen et al., 2011; Paranjape et al., 2017). Temporal network motifs essentially reflect network dynamics: both triadic closure and feed-forward control can be viewed as evolving temporal network motifs (Fig. 1). An inductive model should predict the third link in both cases once it captures the correlation of the first two links, which share a common node, while remaining agnostic to the node identities of these motifs. Our CAW model has two important properties (Fig. 2): (1) Causality extraction: a CAW starts from a link of interest and backtracks several adjacent links over time to encode the underlying causality of network dynamics; each walk essentially yields a temporal network motif. (2) Set-based anonymization: CAWs remove node identities from the walks to guarantee inductive learning, while encoding relative node identities as the counts of each node's appearances at every position across a set of sampled walks. Relative node identities guarantee that the structures of motifs and their correlations are preserved after node identities are removed. To predict temporal links between two nodes of interest, we propose a model, CAW-Network (CAW-N), that samples a few CAWs related to the two nodes, encodes them via RNNs (Rumelhart et al., 1986), and aggregates them via set-pooling to make the prediction.
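The causality-extraction step can be illustrated with a minimal sketch of backward temporal-walk sampling. This is our own simplified illustration, not the paper's implementation: all function and variable names are ours, edges are treated as undirected, and each step chooses uniformly among admissible links, i.e., links that occurred strictly earlier than the current time, so the walk respects causality.

```python
import random
from collections import defaultdict

def sample_causal_walk(links, start_node, start_time, length, seed=None):
    """Sample one backward temporal walk from (start_node, start_time).

    `links` is a list of (u, v, t) temporal edges, treated as undirected.
    Each step moves to a neighbor along a link with a strictly earlier
    timestamp, so timestamps along the walk are strictly decreasing.
    Returns a list of (node, time) pairs; the walk is cut short if no
    earlier link is available.
    """
    rng = random.Random(seed)
    # Build an adjacency list: node -> list of (neighbor, timestamp).
    adj = defaultdict(list)
    for u, v, t in links:
        adj[u].append((v, t))
        adj[v].append((u, t))

    walk = [(start_node, start_time)]
    node, time = start_node, start_time
    for _ in range(length):
        # Only links strictly earlier than the current time are admissible.
        earlier = [(nbr, t) for nbr, t in adj[node] if t < time]
        if not earlier:
            break
        node, time = rng.choice(earlier)
        walk.append((node, time))
    return walk
```

In practice, several such walks are sampled per endpoint of the link of interest, forming the walk set that the anonymization step operates on.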



Figure 1: Triadic closure and feed-forward loops: Causal anonymous walks (CAW) capture the laws.

Figure 2: Causal anonymous walks (CAW): causality extraction and set-based anonymization.
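The set-based anonymization in Fig. 2 can likewise be sketched. Below is a simplified illustration in our own notation (names are ours): given a set of sampled walks, each node identity is replaced by a vector of hitting counts recording how often that node appears at each walk position, which drops raw identities while preserving nodes' relative roles across the walk set.

```python
def position_counts(walks, walk_length):
    """Anonymize a set of walks by positional hitting counts.

    `walks` is a list of walks, each a list of node ids of length
    `walk_length`. Returns a dict mapping each node id to a vector whose
    i-th entry counts the walks in which that node occupies position i.
    Downstream, the count vectors replace the node ids themselves.
    """
    counts = {}
    for walk in walks:
        for i, node in enumerate(walk):
            counts.setdefault(node, [0] * walk_length)[i] += 1
    return counts
```

For example, two walks ["v", "w", "u"] and ["v", "u", "w"] assign v the vector [2, 0, 0] (it starts both walks), while w and u both receive [0, 1, 1]; two different graphs with the same relative structure would thus produce the same encodings, which is what makes the representation inductive.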

