DIRECT EMBEDDING OF TEMPORAL NETWORK EDGES VIA TIME-DECAYED LINE GRAPHS

Abstract

Temporal networks model a variety of important phenomena involving timed interactions between entities. Existing methods for machine learning on temporal networks generally exhibit at least one of two limitations. First, many methods assume time to be discretized, so if the time data is continuous, the user must choose a discretization and discard precise time information. Second, edge representations can often only be calculated indirectly from the nodes, which may be suboptimal for tasks like edge classification. We present a simple method that avoids both shortcomings: construct the line graph of the network, which includes a node for each interaction, and weight the edges of this graph based on the difference in time between interactions. From this derived graph, edge representations for the original network can be computed with efficient classical methods. The simplicity of this approach facilitates explicit theoretical analysis: we constructively show the effectiveness of our method's representations for a natural synthetic model of temporal networks. Empirical results on real-world networks demonstrate our method's efficacy and efficiency on both edge classification and link prediction.
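The construction in the abstract can be sketched in a few lines. This is an illustrative implementation, not the paper's exact algorithm: the choice of an exponential decay kernel `exp(-alpha * |t_i - t_j|)` and the all-pairs loop are assumptions for clarity (the paper only states that edge weights depend on the time difference between interactions).

```python
import itertools
import math

def time_decayed_line_graph(edges, alpha=1.0):
    """Sketch of a time-decayed line graph.

    Each temporal edge (u, v, t) in `edges` becomes a node of the derived
    graph, indexed by its position in the list.  Two derived nodes are
    linked whenever the original edges share an endpoint, with a weight
    that decays in the time gap between the two interactions (the
    exponential kernel here is an illustrative choice).
    Returns a dict {(i, j): weight} with i < j.
    """
    weights = {}
    for i, j in itertools.combinations(range(len(edges)), 2):
        u1, v1, t1 = edges[i]
        u2, v2, t2 = edges[j]
        if {u1, v1} & {u2, v2}:  # interactions share an entity
            weights[(i, j)] = math.exp(-alpha * abs(t1 - t2))
    return weights

# Three interactions; the first two share entity "a" and are close in time,
# so their derived edge gets a larger weight than the temporally distant pairs.
edges = [("a", "b", 0.0), ("a", "c", 0.5), ("b", "c", 5.0)]
line_weights = time_decayed_line_graph(edges)
```

Any classical weighted-graph embedding method (e.g., spectral or random-walk based) can then be run on the derived graph, and its node embeddings are directly edge embeddings of the original network.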

1. INTRODUCTION

Temporal networks, which are graphs augmented with a time value for each edge, model a variety of important phenomena involving timed interactions between entities, including financial transactions, flights, and web browsing. Common tasks for machine learning on temporal networks include classification of the temporal edges, as well as temporal link prediction, which involves predicting future links given links observed in the past. These tasks have various applications, such as recommendation systems (Zhou et al., 2021) and detection of illicit financial transactions (Pareja et al., 2020). As with most machine learning for graphs, the key to learning for temporal networks is creating effective vector representations, also called embeddings, for the network's components, namely the nodes and edges. These embeddings can either be made as part of an end-to-end framework, or created and then passed to off-the-shelf classifiers for downstream tasks; for example, for the edge classification task, a logistic regression classifier can be trained using the training edges' embedding vectors and class labels, then applied at inference time on the test edges' vectors. The node embedding task, in particular, has seen great interest, and many methods for 'static' (i.e., non-temporal) networks have been proposed over the years (Belkin & Niyogi, 2001; Perozzi et al., 2014; Grover & Leskovec, 2016). Edge embedding has seen less interest; there are some exceptions (Li et al., 2017b; Bandyopadhyay et al., 2019), but generally, edge embeddings are created by first making node embeddings, then aggregating them, e.g., by taking the entrywise product of the two endpoint nodes' embeddings.
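The indirect edge-embedding recipe mentioned above is simple to state concretely. In this sketch the node embedding values are made up for illustration (in practice they would come from a method such as DeepWalk or node2vec), and the entrywise (Hadamard) product is just one common aggregator.

```python
# Hypothetical node embeddings; the values are illustrative only.
node_emb = {
    "u": [0.2, -1.0, 0.5],
    "v": [1.5, 0.3, -0.4],
}

def edge_embedding(u, v, emb):
    """Aggregate two endpoint node embeddings into an edge embedding
    via the entrywise (Hadamard) product.  The result is symmetric in
    u and v, matching an undirected edge."""
    return [a * b for a, b in zip(emb[u], emb[v])]

e_uv = edge_embedding("u", "v", node_emb)  # [0.3, -0.3, -0.2]
```

Because such edge representations are only derived from node-level information, they may miss edge-specific signal; this is the second limitation the proposed method avoids by embedding the interactions directly.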
There are a wide variety of methods for temporal network embedding, including ones based on matrix/tensor factorization (Dunlavy et al., 2011; Li et al., 2017a; Zhang et al., 2018), random walks (Yu et al., 2018; Nguyen et al., 2018), graph convolutional networks (Pareja et al., 2020), and deep autoencoders (Goyal et al., 2018; Rahman et al., 2018; Goyal et al., 2020), but generally, these methods can be seen as variants of those for static networks. These prior methods overall present some limitations, which we now summarize. With some exceptions, especially in more recent work (Nguyen et al., 2018; Trivedi et al., 2019; Rossi et al.,

