MULTI-BEHAVIOR DYNAMIC CONTRASTIVE LEARNING FOR RECOMMENDATION

Abstract

Dynamic behavior modeling has become an essential task in personalized recommender systems for learning time-evolving user preferences on online platforms. However, most next-item recommendation methods learn from a single type of behavior, which notably limits the quality of their user representations in practice, since user-item relationships are often multi-typed in real-life applications (e.g., click, tag-as-favorite, review and purchase). To offer better recommendations, this work proposes the Evolving Graph Contrastive Memory Network (EGCM) to model dynamic interaction heterogeneity for multi-behavior sequential recommendation. Specifically, we first develop a multi-behavior graph encoder to capture short-term preference heterogeneity and preserve the dedicated relation semantics of different types of user-item interactions. In addition, we design a dynamic cross-relational memory network, which enables EGCM to distill the long-term multi-behavior preferences of users and the underlying cross-type behavior dependencies as they evolve over time. To obtain robust and informative user representations that reflect both multi-behavior commonality and diversity, we design a multi-behavior contrastive learning paradigm with heterogeneous short- and long-term interest modeling, and provide theoretical analyses to support the modeling of commonality and diversity. Experiments on several real-world datasets show the superiority of our recommender system over various state-of-the-art baselines.

1. INTRODUCTION

Learning users' dynamic preferences plays a vital role in recommender systems for predicting the next items that users may be interested in Wang et al. (2019). For example, a family may buy chicken and bread on an online platform over a long period because of their daily needs, and also buy turkeys close to Christmas. Recent advances in neural network architectures have inspired many efforts to model the transitions between temporally-ordered items, owing to the strong representation capability of deep learning techniques, e.g., recurrent neural encoder Hidasi et al. (2016), convolution-based model Tang & Wang (2018) and attention mechanism Kang & McAuley (2018). More recent sequential recommender systems are built upon the Transformer Sun et al. (2019); Liu et al. (2021b) or Graph Neural Networks (GNNs) Wu et al. (2019); Ma et al. (2020); Wang et al. (2020c) to provide state-of-the-art recommendation performance. Despite their effectiveness, most existing next-item recommendation approaches rely on only a single type of user-item interaction (e.g., click or purchase data), and are thus limited in their ability to capture item-level multi-behavior interaction patterns.

In real-life recommendation scenarios, users often interact with items in various ways, based on their interests, which are intrinsically time-evolving and diverse. For instance, different types of user behaviors (e.g., page view, add-to-favorite, purchase) in online retailers may reflect diverse user intentions and heterogeneous user-item relationships Guo et al. (2019); Jin et al. (2020b). Because they ignore this fact, previous chronological user embedding functions that model a single type of behavior are insufficient to comprehensively capture diverse user intents with behavior heterogeneity Xia et al. (2021a). In contrast, time-evolving multi-behavior representations can characterize the various latent factors behind user-item interactions, and maintain a dedicated embedding space for each type of dynamic user behavior in recommender systems.

Although the importance of modeling behavior-aware time-evolving user-item relationships in recommendation has been recognized, some key challenges remain to be carefully tackled. Specifically, (1) How to explicitly preserve the dynamic behavior-specific semantics pertinent to each type of user-item

