SCALED NEURAL MULTIPLICATIVE MODEL FOR TRACTABLE OPTIMIZATION

Abstract

Challenging decision problems in retail and beyond are often solved using the predict-then-optimize paradigm. An initial effort to develop and parameterize a model of an uncertain environment is followed by a separate effort to identify the best possible solution of an optimization problem. Linear models are often used to ensure that optimization problems are tractable. Remarkably accurate Deep Neural Network (DNN) models have recently been developed for various prediction tasks. Such models have been shown to scale to large datasets without loss of accuracy and with good computational performance. It can, however, be challenging to formulate tractable optimization problems based on DNN models. In this work, we consider the problem of shelf space allocation for retail stores using DNN models. We highlight the trade-off between predictive performance and the tractability of optimization problems. We introduce a Scaled Neural Multiplicative Model (SNMM) with shape constraints for demand learning that leads to a tractable optimization formulation. Although this work focuses on a specific application, the formulation of the models is general enough to be extended to many real-world applications.

1. INTRODUCTION

The predict-then-optimize framework is ubiquitous in applied research. A predictive model is first developed to approximate the true dynamics of a system under consideration to a given level of accuracy. A mathematical programming formulation based on the predictive model is then used to help researchers identify optimal policies for challenging real-world decision problems. Although predictive models estimated from historical data need not be causal, recent work sheds some light on the principles behind this approach (Bertsimas & Kallus, 2020). An alternative to the predict-then-optimize framework is to integrate the two stages, predicting and optimizing at the same time, by using a specific loss function in the prediction model (Elmachtoub & Grigas, 2022). For example, model-free reinforcement learning (RL) approaches explore their environment while also exploiting it to make optimal decisions.

Big-box retailers have thousands of stores located across the country. They use data-driven decision-making to optimize operations, e.g., assortment planning, pricing, and supply chain optimization. They also work on problems related to shelf space allocation, that is, making the best use of limited space in stores. Space planning for apparel is particularly challenging because of the different shapes and sizes of the merchandise and the temporal shifts in brand importance. The problem that motivated our work in this area involves deciding how much space, in terms of fixtures, to assign to each of several brands within a category of products for sale. These problems are typically solved at the department or store level. Products are arranged into departments and categories; for example, 'activewear tops' is a category within the women's apparel department. In this work, we introduce a novel approach based on the predict-then-optimize framework for solving a shelf space allocation problem.
In particular, our contributions are as follows:
• We begin by identifying certain key characteristics of relevant real-world data that make modeling challenging.
• We discuss predictive model selection in light of the tractability of the resulting optimization problem formulations, focusing on the convexity of the formulations.
• We propose multiplicative models and establish conditions for tractability of the optimization. We also discuss why the family of linear models is not suitable for the given application.
• We hypothesize that it is possible to convert an intractable optimization problem into a tractable one without loss of prediction accuracy. We achieve this via a Scaled Neural Multiplicative Model (SNMM) and demonstrate that our proposed model performs well relative to alternative models.
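To make the link between model shape and tractability concrete, the following is a minimal sketch of the predict-then-optimize pattern for fixture allocation. It is not the SNMM or the formulation developed in this paper. It assumes, purely for illustration, that the prediction stage has already produced a concave demand curve for each brand of the form scale * fixtures ** elasticity with 0 < elasticity < 1; the function names, brand labels, and parameter values are all hypothetical. Under that concavity assumption and a single total-fixture budget, assigning one fixture at a time to the brand with the largest marginal gain yields an optimal integer allocation, which is one simple way in which concavity keeps the optimization stage easy.

```python
import heapq


def predicted_sales(scale, elasticity, fixtures):
    """Hypothetical concave demand curve (assumed here, not taken from the paper)."""
    return scale * fixtures ** elasticity


def allocate_fixtures(brands, total_fixtures):
    """Greedy allocation of whole fixtures across brands.

    `brands` maps a brand name to (scale, elasticity) and must be non-empty.
    Because each curve is concave, marginal gains shrink as a brand receives
    more fixtures, so always taking the largest current gain is optimal for a
    single budget constraint.
    """
    allocation = {name: 0 for name in brands}
    heap = []  # max-heap on marginal gain, stored as negated values
    for name, (scale, elasticity) in brands.items():
        gain = predicted_sales(scale, elasticity, 1) - predicted_sales(scale, elasticity, 0)
        heapq.heappush(heap, (-gain, name))

    for _ in range(total_fixtures):
        _, name = heapq.heappop(heap)
        allocation[name] += 1
        scale, elasticity = brands[name]
        k = allocation[name]
        # Marginal gain of giving this brand one more fixture.
        next_gain = predicted_sales(scale, elasticity, k + 1) - predicted_sales(scale, elasticity, k)
        heapq.heappush(heap, (-next_gain, name))
    return allocation


def total_sales(brands, allocation):
    """Total predicted sales for a given allocation."""
    return sum(predicted_sales(*brands[name], allocation[name]) for name in brands)


if __name__ == "__main__":
    # Hypothetical (scale, elasticity) pairs for three brands.
    example_brands = {"A": (10.0, 0.5), "B": (6.0, 0.7), "C": (3.0, 0.4)}
    plan = allocate_fixtures(example_brands, 6)
    print(plan, total_sales(example_brands, plan))
```

If the predicted curves were not concave, this greedy rule would lose its optimality guarantee, which illustrates the trade-off between predictive flexibility and tractability that motivates shape-constrained models.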

2. RELATED WORK

The relationship between the shelf space allocated to a product and that product's sales has been studied extensively (Bianchi-Aguiar et al., 2021; Hübner & Kuhn, 2012; Karampatsa et al., 2017). Analyses often focus on incremental returns, estimating the space elasticity. Under the assumption of diminishing returns to scale, the relationship between shelf space and sales can be modeled using a concave function (Curhan, 1972; Eisend, 2014). The assumption of diminishing returns makes intuitive sense and has been widely used in production functions (Gopalswamy & Uzsoy, 2019; Aigner & Chu, 1968): the incremental gain in sales from adding more space to showcase a product decreases as the space allocated increases. This assumption also has the side benefit of making shelf space allocation optimization problems easier to solve.

Optimization: A number of scientific articles have focused on the assortment optimization problem, which is related to the one we consider in this work. Retailers solve this problem to determine which products to sell. Shelf space in stores is often the most prominent constraint. Kök & Fisher (2007), Yücel et al. (2009), Lo (2019), and numerous others focus on developing optimization frameworks that include complicated consumer choice models. This allows the authors to select the optimal product mix while accounting for the fact that some consumers will substitute one product for another depending on what is and is not sold by a retailer. Our problem is somewhat different in that we are not selecting specific products to sell, but rather deciding which brands to carry and how much space to allocate to these brands. These brands will offer products that are relatively unique within a store. Hübner & Kuhn (2011) point out the relationships between space allocated, consumer demand, and inventory costs.
Assuming a fixed amount of space available, a retailer can choose to offer fewer facings of a more diverse set of products to increase consumer interest. This will, however, also increase inventory holding and replenishment costs and place increasing demands on store labor. Ryzin & Mahajan (1999) came to a similar conclusion earlier, looking specifically at apparel. Our problem is, again, somewhat different. Brand managers are responsible for selecting the product mix within brands sold in specific stores, managing inventory costs and the trade-off between such costs and expected sales or profits.

Shape-Constrained Models: Shape constraints are an integral part of our prediction model and are what make the optimization tractable. Shape-constrained models are restricted to be, for example, monotonic, convex, concave, or non-negative. Shape constraints provide effective regularization, reducing the chance that noisy training data or adversarial examples produce a model that does not behave as expected. They impose strong priors on the data and can be used effectively to produce well-behaved structured prediction models. It is this property that we enforce in our prediction model to yield a tractable optimization. Shape constraints on neural networks have been studied in different applications (Gupta et al., 2018). Most prior works consider GAMs (Generalized Additive Models) and neural networks separately. In our work, we combine the strength of neural networks with the simplicity of GAMs and propose a Scaled Neural Multiplicative Model (SNMM) that can model concave, convex, or monotone constraints effectively.

Differentiable Optimization: Recent works have introduced classes of deep learning models in which certain layers involve solving optimization problems (Donti et al., 2017; Agrawal et al., 2019; Wilder et al., 2019). In these differentiable optimization architectures, backpropagation is based on a loss function that reflects the decision problem(s).
The goal is to directly learn a policy that performs

