OFFER PERSONALIZATION USING TEMPORAL CONVOLUTION NETWORK AND OPTIMIZATION

Abstract

Lately, personalized marketing has become important for retail/e-retail firms due to the significant rise in online shopping and market competition. This rise has led to an increase in promotional expenditure for online retailers; hence, rolling out optimal offers has become imperative to maintain the balance between the number of transactions and profit. In this paper, we propose our approach to solve the offer optimization problem at the intersection of consumer, item and time in a retail setting. To optimize offers, we first build a generalized non-linear model using a Temporal Convolutional Network to predict the item purchase probability at the consumer level for a given time period. Secondly, we establish the functional relationship between historical offer values and the purchase probabilities obtained from the model, which is then used to estimate the offer-elasticity of purchase probability at consumer-item granularity. Finally, using the estimated elasticities, we optimize offer values with a constraint-based optimization technique. This paper describes our methodology in detail and presents the results of modelling and optimization across categories.

1. INTRODUCTION

In most retail settings, promotions play an important role in boosting the sales and traffic of the organisation. Promotions aim to enhance awareness when introducing new items, clear leftover inventory, bolster customer loyalty, and improve competitiveness. Promotions are also used on a daily basis in most retail environments, including online retailers, supermarkets, fashion retailers, etc. A typical retail firm sells thousands of items in a week and needs to design offers for all of them for a given time period. Offer design decisions are of primary importance for most retail firms, as an optimal offer roll-out can significantly enhance the business' bottom line. Most retailers still employ a manual process based on the intuition and past experience of category managers to decide the depth and timing of promotions. The category manager has to manually solve the promotion optimization problem at consumer-item granularity, i.e., how to select an optimal offer for each period in a finite horizon so as to maximize the retailer's profit. It is a difficult problem to solve, given that the promotion planning process typically involves a large number of decision variables and needs to ensure that the relevant business constraints or offer rules are satisfied. The high volume of data that is now available to retailers presents an opportunity to develop machine learning based solutions that can help category managers improve promotion decisions. In this paper, we propose a deep learning and multi-objective optimization based approach to solve the promotion optimization problem, which can help retailers decide promotions for multiple items while accounting for many important modelling aspects observed in retail data. The ultimate goal is to maximize net revenue and consumer retention rate by promoting the right items at the right time using the right offer discounts at the consumer-item level.
Our contributions in this paper include a) a Temporal Convolutional Network architecture with hyperparameter configurations to predict the item purchase probability at the consumer level for a given time period; b) the design and implementation of an F1-maximization algorithm which optimises the purchase-probability cut-off at the consumer level; c) a methodology to estimate the offer-elasticity of purchase probability at consumer-item granularity; and d) a constraint-based optimization technique to estimate optimal offers at consumer-item granularity.
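Contribution (b), selecting the purchase-probability cut-off that maximizes F1, can be sketched as a simple threshold sweep over validation predictions. The function names and the threshold grid below are illustrative, not the paper's exact algorithm:

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_f1_cutoff(y_true, probs, grid=None):
    """Sweep candidate probability cut-offs and keep the F1-maximizing one."""
    grid = grid or [i / 100 for i in range(1, 100)]
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        preds = [1 if p >= t else 0 for p in probs]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

In practice the sweep would run per consumer on held-out validation predictions, so that each consumer gets an individual cut-off.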

2. RELATED WORK

There has been a significant amount of research on offer-based revenue management over the past few decades, and several notable works address the promotion optimization problem. One such work is Cohen et al. (2017), where the authors propose general classes of demand functions (including multiplicative and additive) that incorporate the post-promotion dip effect, and use integer linear programming to solve the promotion optimization problem. In another work, Cohen & Perakis (2018) lay out the different types of promotions used in retail and formulate the promotion optimization problem for multiple items; the same authors also show the application of a discrete linearization method for solving promotion optimization. Building on the learnings from these papers, we create our framework for offer optimization. The distinguishing features of our work in this field include (i) the use of a nonparametric neural network based approach to estimate the item purchase probability at the consumer level, (ii) the establishment of the functional relationship between historical offer values and purchase probabilities, and (iii) the creation of a new model and efficient algorithm to set offers by solving a multi-consumer-item promotion optimization that incorporates the offer-elasticity of purchase probability at a reference offer value.
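The offer-elasticity of purchase probability at a reference offer value mentioned in (iii) can be illustrated numerically. The sketch below uses a central finite difference around the reference offer and a hypothetical linear probability-offer relation purely for illustration; the paper's actual functional relationship is estimated from the model's predictions:

```python
def offer_elasticity(prob_fn, offer_ref, eps=1e-4):
    """Point elasticity of purchase probability w.r.t. offer value:
    (dP/P) / (d_offer/offer), estimated by central finite differences."""
    p = prob_fn(offer_ref)
    dp = (prob_fn(offer_ref + eps) - prob_fn(offer_ref - eps)) / (2 * eps)
    return dp * offer_ref / p

# Hypothetical relation: probability rises linearly with offer discount.
prob = lambda offer: 0.1 + 0.5 * offer
e = offer_elasticity(prob, offer_ref=0.2)  # elasticity at the reference offer
```

A positive elasticity of this kind is what the downstream constrained optimization trades off against margin when setting offer values.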

3. METHODOLOGY

We build separate models for each category, as consumer purchase patterns and personalized marketing strategies may vary across categories.

3.1. MODELLING

In our model setup, we treat each relevant consumer-item pair as an individual object and shape it into a bi-weekly time series based on historical transactions, where the target at each time step (2 weeks) is binary: 1 (purchased) or 0 (not purchased). Relevance of a consumer-item pair is defined by the items transacted by the consumer during the training time window. Our positive samples (purchased/1) are time steps at which the consumer did transact the item, whereas negative samples (not purchased/0) are time steps at which the consumer did not buy that item. We apply a sliding-window testing routine to generate out-of-time results. The time series is split into 3 parts: train (48 weeks), validation (2 weeks) and test (2 weeks). All our models are built in a multi-object fashion for an individual category, which allows the gradient movement to happen across all consumer-item combinations split in batches. This enables cross-learning across consumers/items. A row in the time series is represented by

y_cit = h(i_t, c_t, ..., c_{t-n}, ic_t, ..., ic_{t-n}, d_t, ..., d_{t-n})    (1)

where y_cit is the purchase prediction for consumer c for item i at time t, and n is the number of time lags. i_t denotes attributes of item i, such as category, department, brand, color and size, at time t. c_t denotes attributes of consumer c, such as age, sex and transactional attributes, at time t; c_{t-n} denotes the transactional attributes of consumer c at a lag of n time steps. ic_t denotes transactional attributes, such as basket size, price and offer, of consumer c towards item i at time t. d_t is derived from the datetime to capture trend and seasonality at time t.

We use transactional metrics at various temporal cuts like week, month, etc. Datetime-related features capturing seasonality and trend are also generated. Consumer-Item Profile: we use transactional metrics at different granularities like consumer, item, and consumer-item. We also create features such as time since first order, time since last order, time gap between orders, reorder rates, reorder frequency, streak (user purchased the item in a row), average position in the cart, and total number of orders. Price/Promotions: we use relative price and historical offer discount percentage to capture purchase propensity at varying price and discount values. Lagged Offsets: we use statistical rolling operations like mean, median, variance, kurtosis and skewness over temporal regressors for different lag periods to generate offsets.
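The lagged rolling operations described above might be computed with pandas as follows. The window and lag values, and the column naming scheme, are illustrative assumptions; shifting before rolling ensures each row only sees information available `lag` periods earlier, avoiding leakage into the prediction window:

```python
import pandas as pd

def lagged_rolling_features(series: pd.Series, window: int, lag: int) -> pd.DataFrame:
    """Rolling statistics over values shifted back by `lag` time steps."""
    shifted = series.shift(lag)          # only past information is visible
    roll = shifted.rolling(window)
    return pd.DataFrame({
        f"mean_w{window}_l{lag}": roll.mean(),
        f"median_w{window}_l{lag}": roll.median(),
        f"var_w{window}_l{lag}": roll.var(),
        f"kurtosis_w{window}_l{lag}": roll.kurt(),
        f"skew_w{window}_l{lag}": roll.skew(),
    })
```

Applied per consumer-item series (e.g. on the bi-weekly offer or spend regressors), this yields one offset feature column per statistic, window and lag combination.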

