LEARNING A LATENT SEARCH SPACE FOR ROUTING PROBLEMS USING VARIATIONAL AUTOENCODERS

Abstract

Methods for automatically learning to solve routing problems are rapidly improving in performance. While most of these methods excel at generating solutions quickly, they are unable to effectively utilize longer run times because they lack a sophisticated search component. We present a learning-based optimization approach that allows a guided search in the distribution of high-quality solutions for a problem instance. More precisely, our method uses a conditional variational autoencoder that learns to map points in a continuous (latent) search space to high-quality, instance-specific routing problem solutions. The learned space can then be searched by any unconstrained continuous optimization method. We show that, even using a standard differential evolution search strategy, our approach is able to outperform existing purely machine-learning-based approaches.

1. INTRODUCTION

Significant progress has been made in learning to solve optimization problems via machine learning (ML). Learning-based approaches are of particular interest for practical applications because of the high labor costs associated with developing completely hand-crafted solution approaches. For routing problems such as the traveling salesperson problem (TSP) and the capacitated vehicle routing problem (CVRP), recent ML-based approaches are able to generate good solutions for small problem instances in a fraction of a second (e.g., Kool et al. (2019)). However, in many real-world applications of these problems, users gladly accept more computation time in exchange for solutions of even higher quality. Recently proposed approaches (e.g., Hottung & Tierney (2020)) address this demand and integrate learning-based components with high-level search procedures. While these approaches offer improved performance over non-search-based methods, they rely on domain knowledge encapsulated in the high-level search procedures.

In this work, we present a learning-based optimization approach for routing problems that is able to perform an extensive search for high-quality solutions. In contrast to other approaches, our method does not rely on domain-specific high-level search procedures. Our approach learns an instance-specific mapping of solutions to a continuous search space that can then be searched via any existing continuous optimization method. We use a conditional variational autoencoder (CVAE) that learns to encode a solution to a given instance as a numerical vector and vice versa. Some genetic algorithm variants (e.g., Gonçalves & Resende (2012)) use numerical vectors to represent solutions to combinatorial optimization problems. However, these approaches rely on decoding schemes that are carefully hand-crafted by domain experts.
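To illustrate what such a hand-crafted decoding scheme looks like, the following sketch shows the classic "random keys" rule used by these genetic algorithm variants for the TSP: any real-valued vector is mapped to a valid tour by sorting its entries. This is an illustrative example of the fixed, expert-designed rules in question, not code from this paper.

```python
import numpy as np

def random_key_decode(keys: np.ndarray) -> np.ndarray:
    """Hand-crafted 'random key' decoding for the TSP: city i is visited in
    the position given by the rank of keys[i], so every real-valued vector
    decodes to a valid permutation (tour)."""
    return np.argsort(keys)

# A 5-city example: the continuous vector below decodes to a tour.
keys = np.array([0.7, 0.1, 0.9, 0.3, 0.5])
tour = random_key_decode(keys)
print(tour.tolist())  # → [1, 3, 4, 0, 2]
```

Because the mapping is fixed by hand, its usefulness depends entirely on the designer's insight into the problem; the approach in this paper instead learns the mapping from data.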
In contrast, our approach learns the problem-specific decoding schema on its own, requiring no domain or optimization knowledge on the side of the user.

The performance of an optimization algorithm depends heavily on the structure of the fitness landscape of the search space, such as its smoothness. If solutions close to each other in the search space are semantically similar, resulting in a smooth landscape, the employed search algorithm can iteratively move towards the more promising areas of the search space. It has been observed for some problems that variational autoencoders (VAEs) are capable of learning a latent space in which semantically similar inputs are placed in the same region. This allows, for example, a semantically meaningful interpolation between two points in the latent space (see, e.g., Berthelot et al. (2018)). However, it is unclear whether this property holds for a conditional latent space that encodes routing problem solutions. We show experimentally that our CVAE-based approach is indeed capable of learning a latent search space in which neighboring solutions have similar objective function values. Furthermore, we introduce a novel technique that addresses the issue of symmetries in the latent space and show that it enables our method to match and surpass state-of-the-art ML-based methods.

We train our method using high-quality solutions because we aim to learn a latent search space that contains mostly high-quality solutions. Hence, our method usually requires a long offline phase (e.g., to generate training solutions using a slow, domain-independent, generic solver). However, this offline phase is offset by fast, online solution generation.

We focus on the TSP and the CVRP, two of the most well-researched problems in the optimization literature. The TSP is concerned with finding the shortest tour that visits each city in a given set exactly once and returns to the starting city.
The CVRP describes a routing problem in which routes for multiple vehicles serving a set of customers must be planned. Each customer has a certain demand for goods, and each vehicle has a maximum capacity that it can carry. All routes must start and end at the depot. The task is to find a set of routes with minimal cost such that the demand of every customer is fulfilled and each customer is visited by exactly one vehicle. We consider the versions of the TSP and CVRP where the distance matrix obeys the triangle inequality.

The contributions of this work are as follows:

• We propose a novel approach that learns a continuous, latent search space for routing problems based on CVAEs.
• We show that our approach is able to learn a well-structured latent search space.
• We show that the learned search space enables a standard differential evolution search strategy to outperform state-of-the-art ML methods.
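The overall search pipeline described above can be made concrete with a small, self-contained sketch. This is not the paper's implementation: the `decode` function below is a stand-in (a random-key argsort rule) for the trained CVAE decoder, the instance is a toy Euclidean TSP whose distances satisfy the triangle inequality, and a minimal DE/rand/1/bin loop plays the role of the standard differential evolution strategy searching the continuous space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance: random city coordinates; Euclidean distances obey the
# triangle inequality, as assumed in the problem setting.
coords = rng.random((8, 2))

def tour_length(tour: np.ndarray) -> float:
    """Length of the closed tour visiting the cities in the given order."""
    edges = coords[tour] - coords[np.roll(tour, -1)]
    return float(np.linalg.norm(edges, axis=1).sum())

def decode(z: np.ndarray) -> np.ndarray:
    """Stand-in for the learned decoder: maps a continuous point z in the
    search space to a valid tour (here via a simple argsort rule)."""
    return np.argsort(z)

def objective(z: np.ndarray) -> float:
    return tour_length(decode(z))

# Minimal DE/rand/1/bin over the continuous space (simplified: the base
# vectors a, b, c may coincide with the target individual i).
pop_size, dim, F, CR = 20, 8, 0.8, 0.9
pop = rng.uniform(-1.0, 1.0, (pop_size, dim))
fit = np.array([objective(z) for z in pop])
init_best = fit.min()
for _ in range(200):
    for i in range(pop_size):
        a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
        mutant = a + F * (b - c)
        trial = np.where(rng.random(dim) < CR, mutant, pop[i])
        f = objective(trial)
        if f < fit[i]:  # greedy selection
            pop[i], fit[i] = trial, f
best = decode(pop[np.argmin(fit)])
print(f"best tour length: {fit.min():.3f}")
```

The key design point is that the search operates only on real-valued vectors; all combinatorial structure is hidden inside `decode`, which is exactly the component the paper proposes to learn with a CVAE instead of specifying by hand.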

2. RELATED WORK

In Hopfield & Tank (1985), it was first proposed to use an ML-based method to solve a routing problem. The authors use a Hopfield network to solve small TSP instances with up to 30 cities. In Vinyals et al. (2015), pointer networks are proposed and trained to solve TSP instances with up to 50 cities using supervised learning. Bello et al. (2016) extend this idea and train a pointer network via actor-critic reinforcement learning. More recently, graph neural networks have been used to solve the TSP, e.g., a graph embedding network in Khalil et al. (2017), a graph attention network in Deudon et al. (2018), and a graph convolutional network in Joshi et al. (2019). The significantly more complex CVRP was first addressed in Nazari et al. (2018) and Kool et al. (2019), in which a recurrent neural network decoder coupled with an attention mechanism and a graph attention network are used, respectively. While some of these methods use a high-level search procedure (such as beam search), all of them are focused on finding solutions quickly (in under one second). In contrast, our approach is able to exploit a longer runtime (more than one minute for larger instances) to find solutions of better quality.

A couple of approaches combine local-search-like algorithms with ML techniques to solve routing problems. Chen & Tian (2019) propose to learn an improvement operator that makes small changes to an existing solution. The operator is applied to a solution iteratively to find high-quality solutions for the CVRP. However, with a reported runtime of under half a second for the CVRP with 100 nodes, the method is not focused on performing an extensive search. In Hottung & Tierney (2020), another iterative improvement method for the CVRP is proposed that integrates learned heuristics into a large neighborhood search framework. The method is used to perform an extensive search, with reported runtimes of over one minute for larger instances. In contrast to our method, the high-level large neighborhood search framework contains domain-specific components and is known to perform exceptionally well on routing problems (Ropke & Pisinger, 2006).

Perhaps most similar to our work is the line of research based on Gómez-Bombarelli et al. (2018), in which the authors use a VAE to learn a continuous latent search space for discovering molecules.

