MASTERING SPATIAL GRAPH PREDICTION OF ROAD NETWORKS

ABSTRACT

Accurately predicting road networks from satellite images requires a global understanding of the network topology. We propose to capture such high-level information by introducing a graph-based framework that simulates the addition of sequences of graph edges using a reinforcement learning (RL) approach. In particular, given a partially generated graph associated with a satellite image, an RL agent nominates modifications that maximize a cumulative reward. As opposed to standard supervised techniques that tend to be more restricted to commonly used surrogate losses, these rewards can be based on various complex, potentially noncontinuous, metrics of interest. This yields more power and flexibility to encode problem-dependent knowledge. Empirical results on several benchmark datasets demonstrate enhanced performance and increased high-level reasoning about the graph topology when using a tree-based search. We further highlight the superiority of our approach under substantial occlusions by introducing a new synthetic benchmark dataset for this task.

1. INTRODUCTION

Road layout modelling from satellite images constitutes an important task of remote sensing, with numerous applications in and navigation. The vast amounts of data available from the commercialization of geospatial data, in addition to the need for accurately establishing the connectivity of roads in remote areas, have led to an increased interest in the precise representation of existing road networks. By nature, these applications require structured data types that provide efficient representations to encode geometry, in this case, graphs, a de facto choice in domains such as computer graphics, virtual reality, gaming, and the film industry. These structured-graph representations are also commonly used to label recent road network datasets (Van Etten et al., 2018) and map repositories (OpenStreetMap contributors, 2017) . Based on these observations, we propose a new method for generating predictions directly as spatial graphs, allowing us to explicitly incorporate geometric constraints in the learning process, encouraging predictions that better capture higher-level dataset statistics. In contrast, existing methods for road layout detection, mostly rely on pixel-based segmentation models that are trained on masks produced by rasterizing ground truth graphs. Performing pixelwise segmentation, though, ignores structural features and geometric constraints inherent to the



Figure 1: Our agent interacts with the currently generated spatial graph by proposing new edges to be added. A tree-based search produces a sequence of actions that maximizes a reward function based on complex geometrics priors.

