GEOVEX: GEOSPATIAL VECTORS WITH HEXAGONAL CONVOLUTIONAL AUTOENCODERS

Abstract

We introduce GeoVeX, a new geospatial representation model that learns global vectors for all geographical locations on the Earth's land cover. GeoVeX is built on a novel model architecture, Hexagonal Convolutional Autoencoders (HCAE), combined with a Zero-Inflated Poisson (ZIP) reconstruction layer, applied to a grid of Uber's H3 hexagons, each described by a histogram of OpenStreetMap (OSM) geographical tag occurrences. GeoVeX is novel in two respects: first, it produces pre-trained, task-agnostic geospatial vectors from H3 and OSM data that are, for the first time, contextualized on the features of neighboring hexagons, by applying a hexagonal convolutional autoencoder to an H3/OSM grid centered on the location to embed; second, it introduces a zero-inflated Poisson reconstruction layer that adapts a standard autoencoder network to train on sparse geographical count data distributed over a hexagonal grid. Experiments demonstrate that GeoVeX embeddings improve upon two state-of-the-art geospatial location representation models, Hex2Vec and Space2Vec, on two different downstream tasks: worldwide listing price prediction in the travel industry, and hyperlocal interpolation of climate data from weather stations. A qualitative analysis of the latent representations shows that the geographically contextualized embeddings learnt by GeoVeX capture higher-quality geographical structures.

1. INTRODUCTION

Entity embedding is ubiquitous across a variety of Machine Learning tasks thanks to its many advantages: it captures the semantics of each entity in the context of a given domain; it enables transfer learning to related tasks; and it reduces the sparsity of the entity representation while compressing the feature space. In the NLP domain, global word embedding models such as Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014) and BERT (Devlin et al., 2019) have been successful at capturing the word semantics of large open-source corpora (e.g. Wikipedia, Gigaword) and are used to transfer learning to multiple downstream tasks, such as sentiment analysis (Tang et al., 2014; Deho et al., 2018; Alamoudi & Alghamdi, 2021), question retrieval (Zhou et al., 2015), and medical semantics (Wang et al., 2018). Similar NLP-inspired approaches have since proved useful in many industrial domains, where multiple models have been proposed for learning latent representations of industry-specific entities, such as Product2Vec (Biswas et al., 2017) and User2Vec (Hallac et al., 2019) in e-commerce, or Wav2Vec (Baevski et al., 2020) in speech representation, just to name a few.

In comparison, in the field of Geographic Information Science (GIS), a global set of task-agnostic embeddings for geographical space representation can benefit multiple domains and use cases, such as: price prediction for houses (Wang et al., 2021), hotel rooms (Kisilevich et al., 2013), and vacation homes (Islam et al., 2022; Pradip & Suthar, 2022); interpolation of climate variables such as temperature and pressure (Wu & Li, 2013); and computer vision tasks with geo-located images (Berg et al., 2014). These tasks have in common the application of some transformation to the spatial coordinates, but they do not leverage the spatial distribution of geographical entities (such as parks, water, beaches, buildings, streets, and bars), which conveys much richer information about the geographical context.

Moreover, in terms of modelling, previous approaches to learning geospatial embeddings have a set of limitations, such as being non-contextual, task-specific, and/or region-specific (Sec. 2), which we address with a novel model architecture and loss function formulation (Sec. 3.6).
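To make the zero-inflated Poisson idea concrete: a ZIP distribution models sparse count data (such as OSM tag histograms, where most hexagons contain zero occurrences of most tags) as a mixture of a point mass at zero and a Poisson component. The sketch below computes the ZIP negative log-likelihood that such a reconstruction layer would minimize; it is a minimal illustration of the standard ZIP formula, not the paper's exact layer, and the names `zip_nll`, `lam`, and `pi` are ours.

```python
import math

def zip_nll(x, lam, pi):
    """Negative log-likelihood of a count x under a Zero-Inflated Poisson.

    The ZIP mixture assigns:
        P(x = 0)     = pi + (1 - pi) * exp(-lam)        # extra mass at zero
        P(x = k > 0) = (1 - pi) * lam**k * exp(-lam) / k!
    where pi is the zero-inflation probability and lam the Poisson rate.
    """
    if x == 0:
        p = pi + (1.0 - pi) * math.exp(-lam)
    else:
        # log-space Poisson pmf for numerical stability: k*log(lam) - lam - log(k!)
        p = (1.0 - pi) * math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))
    return -math.log(p)
```

With `pi = 0` this reduces to an ordinary Poisson loss; increasing `pi` lowers the loss on zero counts, which is what lets an autoencoder with a ZIP output head fit a grid where most tag counts are zero without drowning out the informative non-zero cells.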

