COMBINING PHYSICS AND MACHINE LEARNING FOR NETWORK FLOW ESTIMATION

Abstract

The flow estimation problem consists of predicting missing edge flows in a network (e.g., traffic, power, or water) based on partial observations. These missing flows depend both on the underlying physics (edge features and a flow conservation law) and on the observed edge flows. This paper introduces an optimization framework for computing missing edge flows and solves the problem using bilevel optimization and deep learning. More specifically, we use neural networks to learn regularizers that depend on edge features (e.g., number of lanes in a road, resistance of a power line). Empirical results show that our method accurately predicts missing flows, outperforming the best baseline, and is able to capture relevant physical properties in traffic and power networks.

1. INTRODUCTION

In many applications, ranging from road traffic to supply chains to power networks, the dynamics of flows on the edges of a graph are governed by physical laws or models (Bressan et al., 2014; Garavello & Piccoli, 2006). For instance, the LWR model describes equilibrium equations for road traffic (Lighthill & Whitham, 1955; Richards, 1956). However, it is often difficult to fully observe flows in these applications and, as a result, practitioners rely on off-the-shelf machine learning models to make predictions about missing flows (Li et al., 2017; Yu et al., 2018). A key limitation of these machine learning models is that they disregard the physics governing the flows. So, the question arises: can we combine physics and machine learning to make better flow predictions?

This paper investigates the problem of predicting missing edge flows based on partial observations and the underlying domain-specific physics defined by flow conservation and edge features (Jia et al., 2019). Edge flows depend on the graph topology due to a flow conservation law, i.e., the total inflow at every vertex is approximately equal to its total outflow. Moreover, the flow at an edge also depends on its features, which might regularize the space of possible flow distributions in the graph. Here, we propose a model that learns how to predict missing flows from data using bilevel optimization (Franceschi et al., 2017) and neural networks. More specifically, features are given as inputs to a neural network that produces edge flow regularizers. Weights of the network are then optimized via reverse-mode differentiation based on a flow estimation loss from multiple train-validation pairs.

Our work falls under a broader effort towards incorporating physics knowledge into machine learning, which is relevant for natural sciences and engineering applications where data availability is limited (Rackauckas et al., 2020). Conservation laws (of energy, mass, momentum, charge, etc.) are essential to our understanding of the physical world. The classical Noether's theorem shows that such laws arise from symmetries in nature (Hanc et al., 2004). However, flow estimation, which is an inverse problem (Tarantola, 2005; Arridge et al., 2019), is ill-posed under conservation alone. Regularization enables us to apply domain knowledge in the solution of inverse problems.

We motivate our problem and evaluate its solutions using two application scenarios. The first is road traffic networks (Coclite et al., 2005), where vertices represent locations, edges are road segments, flows are counts of vehicles that traverse a segment, and features include numbers of lanes and speed limits. The second scenario is electric power networks (Dörfler et al., 2018), where vertices represent power buses, edges are power lines, flows are amounts of power transmitted, and edge features include resistances and lengths of lines. Irrigation channels, gas pipelines, blood circulation, supply chains, air traffic, and telecommunication networks are other examples of flow graphs.

Our contributions can be summarized as follows: (1) we introduce a missing flow estimation problem with applications in a broad class of flow graphs; (2) we propose a model for flow estimation that is able to learn the physics of flows by combining reverse-mode differentiation and neural networks; (3) we show that our model outperforms the best baseline by up to 18%; and (4) we provide evidence that our model learns interpretable physical properties, such as the role played by resistance in a power transmission network and by the number of lanes in a road traffic network.

2. FLOW ESTIMATION PROBLEM

We introduce the flow estimation problem, which consists of inferring missing flows in a network based on a flow conservation law and edge features. We provide a list of symbols in the Appendix.

Flow Graph. Let $G(V, E, X)$ be a flow graph with vertices $V$ ($n = |V|$), edges $E$ ($m = |E|$), and edge feature matrix $X \in \mathbb{R}^{m \times d}$, where $X[e]$ are the features of edge $e$. A flow vector $f \in \mathbb{R}^m$ contains the (possibly noisy) flow $f_e$ for each edge $e \in E$. If $G$ is directed, $f \in \mathbb{R}^m_+$; otherwise, a flow is negative if it goes against the arbitrary orientation of its edge. We assume that flows are induced by the graph, and thus the total flow (in plus out) at each vertex is approximately conserved:

$$\sum_{(v_i, u) \in E} f_{(v_i, u)} \approx \sum_{(u, v_o) \in E} f_{(u, v_o)}, \quad \forall u \in V$$

In the case of a road network, flow conservation implies that vehicles mostly remain on the road.

Flow Estimation Problem. Given a graph $G(V, E, X)$ with partial flow observations $\hat{f} \in \mathbb{R}^{m'}$ for a subset $E' \subseteq E$ of edges ($\hat{f}_e$ is the flow for $e \in E'$, $m' = |E'| < m$), predict flows for edges in $E \setminus E'$.

In our road network example, partial vehicle counts $\hat{f}$ might be measured by sensors placed at a few segments, and the goal is to estimate counts at the remaining segments. One would expect flows not to be fully conserved in most applications due to the existence of inputs and outputs, such as parking lots and power generators/consumers. If these input and output values are known exactly, they can be easily incorporated into our problem as flow observations. Moreover, if they are known approximately, we can apply them as priors (as will be detailed in the next section). For the remainder of this paper, we assume that inputs and outputs are unknown and employ flow conservation as an approximation of the system. Thus, different from classical flow optimization problems, such as min-cost flow (Ahuja et al., 1988), we assume that flows are conserved only approximately.

Notice that our problem is similar to the one studied in Jia et al. (2019). However, while their definition also assumes flow conservation, it does not take into account edge features. We claim that these features play an important role in capturing the physics of flows. Our main contribution is a new model that is able to learn how to regularize flows based on edge features using neural networks.
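
As an illustration of the problem statement above, the following minimal sketch encodes flow conservation with a signed vertex-edge incidence matrix and fills in missing flows by minimizing the conservation residual. The function names, the sign convention for the incidence matrix, and the single uniform penalty `lam` are assumptions made for this example only; in particular, the uniform penalty merely stands in for the feature-dependent regularizers that the proposed model learns, and the sketch is not the method of this paper.

```python
import numpy as np


def incidence_matrix(n_vertices, edges):
    """Signed incidence matrix B (n x m), assumed convention:
    for edge e = (u, v), B[u, e] = -1 (flow leaves u) and B[v, e] = +1."""
    B = np.zeros((n_vertices, len(edges)))
    for e, (u, v) in enumerate(edges):
        B[u, e] = -1.0
        B[v, e] = 1.0
    return B


def estimate_missing_flows(B, observed_idx, observed_flows, lam=1e-2):
    """Fill in missing flows x by minimizing
        ||B_obs f_obs + B_mis x||^2 + lam * ||x||^2,
    i.e. the squared conservation residual at every vertex plus a
    uniform ridge penalty (a hypothetical stand-in for a learned regularizer)."""
    m = B.shape[1]
    observed_idx = np.asarray(observed_idx)
    observed_flows = np.asarray(observed_flows, dtype=float)
    missing_idx = np.setdiff1d(np.arange(m), observed_idx)

    B_obs = B[:, observed_idx]
    B_mis = B[:, missing_idx]

    # Normal equations of the regularized least-squares problem.
    A = B_mis.T @ B_mis + lam * np.eye(len(missing_idx))
    b = -B_mis.T @ (B_obs @ observed_flows)
    x = np.linalg.solve(A, b)

    f = np.zeros(m)
    f[observed_idx] = observed_flows
    f[missing_idx] = x
    return f


# Toy example: a directed 4-cycle where two of the four edge flows are observed.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
B = incidence_matrix(4, edges)
f_est = estimate_missing_flows(B, observed_idx=[0, 2], observed_flows=[5.0, 5.0])

# B @ f_est is the net inflow (in minus out) at each vertex; values near zero
# mean that conservation approximately holds.
residual = B @ f_est
print("estimated flows:", f_est)
print("max conservation residual:", np.abs(residual).max())
```

With `lam > 0` the linear system always has a unique solution, even when the conservation constraints alone leave the missing flows underdetermined. The penalty also shrinks the estimates slightly toward zero, so conservation holds only approximately in the result.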

3. OUR APPROACH: PHYSICS+LEARNING

In this section, we introduce our approach for the flow estimation problem, which is summarized in Figure 1. We formulate flow estimation as an optimization problem (Section 3.1), where the interplay between the flow network topology and edge features is defined by the physics of flow graphs. Flow estimation is shown to be equivalent to a regularized least-squares problem (Section

