IEDR: A CONTEXT-AWARE INTRINSIC AND EXTRINSIC DISENTANGLED RECOMMENDER SYSTEM

Abstract

Intrinsic and extrinsic factors jointly affect users' decisions in item selection (e.g., click, purchase). Intrinsic factors reveal users' real interests and are invariant across contexts (e.g., time, weather), whereas extrinsic factors change with the context. Analyzing these two factors is an essential yet challenging task in recommender systems. However, in existing studies, factor analysis is either largely neglected or designed for a specific context (e.g., the time context in sequential recommendation), which limits the applicability of such models. In this paper, we propose a generic model, IEDR, to learn intrinsic and extrinsic factors from various contexts for recommendation. IEDR contains two key components: a contrastive learning component and a disentangling component. The two components collaboratively enable our model to learn context-invariant intrinsic factors and context-based extrinsic factors from all available contexts. Experimental results on real-world datasets demonstrate the effectiveness of our model in factor learning and show a significant improvement in recommendation accuracy over state-of-the-art methods.

1. INTRODUCTION

Recommender systems aim to predict the probability of a user selecting a given item (e.g., click, purchase). This prediction is challenging because each decision is jointly affected by multiple factors (Ma et al., 2019). Psychological research has revealed that users' decision making is mainly influenced by two types of factors: intrinsic and extrinsic (Bénabou & Tirole, 2003; Vallerand, 1997). An intrinsic factor is an internal motivation for inherent satisfaction, which is often stable for an individual. In contrast, an extrinsic factor is a contextual motivation triggered by the environment (external stimulation), and it often varies across contexts (e.g., weather, time) (Ryan & Deci, 2000). For example, on a day with heavy rain, a user decides to take an Uber (a taxi-calling app) to work. In this case, the choice of Uber over other taxi-calling apps is because the user is more comfortable with this app's user interface (intrinsic factor), while the choice of taking a ride to work is motivated by the weather condition (extrinsic factor). Although the importance of capturing these factors in recommender systems has been recognized, existing works have not explored their full potential. (1) Some studies neglect intrinsic-extrinsic factor disentangling, and their final predictions rely on entangled representations (Barkan & Koenigstein, 2016; Covington et al., 2016; Wu et al., 2019). With the intrinsic and extrinsic factors entangled behind each decision, the real factors that drive the decision may be incorrectly inferred, resulting in suboptimal recommendations (Wang et al., 2020). (2) Some studies learn intrinsic and extrinsic factors, but only under a specific context.
For example, some sequential recommendation models leverage the time context (ordered sequences) to learn intrinsic and extrinsic factors (which they call long- and short-term interests) (Hidasi et al., 2016; Yu et al., 2019b); some point-of-interest recommendation models leverage the spatial context (geographical distance) to learn the two factors (Li et al., 2017; Wu et al., 2020). In such models, the factor learning approaches are domain-specific, so it would be difficult to generalize them to other contexts. Meanwhile, the factors may be influenced by multiple contexts, so focusing on a single context may result in inferior factor learning. Therefore, how to effectively incorporate various context information for learning intrinsic and extrinsic factors in recommender systems remains an open question.

Focusing on this question, we propose a generic recommendation framework that can learn intrinsic and extrinsic factors from various contexts. We first formally define context-agnostic intrinsic and extrinsic factors for recommendation tasks. Following these definitions, we propose an Intrinsic-Extrinsic Disentangled Recommendation (IEDR) model, which contains two modules: a recommendation prediction (RP) module and a contrastive intrinsic-extrinsic disentangling (CIED) module. For each user-item interaction, the RP module constructs all the contexts as a graph (the context graph), and the context representation is obtained by learning over this graph. The same procedure is applied to obtain the user and item representations from their respective attributes (e.g., user gender, item category). Then, the intrinsic and extrinsic factors are learned from these representations from the user and item perspectives. Meanwhile, the CIED module contains two components: a contrastive learning component that learns a context-invariant intrinsic factor, and a disentangling component that disentangles the intrinsic and extrinsic factors via mutual information minimization.
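To make the two CIED objectives concrete, here is a minimal NumPy sketch under our own simplifying assumptions (the function names are hypothetical, and a simple cross-correlation penalty stands in for the mutual information estimator the model would actually minimize): a contrastive (InfoNCE-style) loss pulls together intrinsic representations of the same user-item pair computed under different contexts, while the penalty pushes intrinsic and extrinsic representations apart.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    """Contrastive (InfoNCE-style) loss: row i of z_a should match row i of z_b.
    Here z_a and z_b would be intrinsic representations of the same user-item
    pair under two different contexts, so minimizing this loss encourages the
    intrinsic factor to be context-invariant."""
    # Cosine-similarity logits between the two batches of representations.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = (z_a @ z_b.T) / temperature           # (B, B)
    # Cross-entropy with the diagonal entries as the positives.
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def decorrelation_penalty(z_int, z_ext):
    """Crude stand-in for mutual information minimization between the intrinsic
    and extrinsic factors: penalize their empirical cross-correlation.
    (This proxy only illustrates the idea of driving the two factors apart.)"""
    z_int = z_int - z_int.mean(axis=0)
    z_ext = z_ext - z_ext.mean(axis=0)
    cross = (z_int.T @ z_ext) / len(z_int)         # (d_int, d_ext)
    return np.sum(cross ** 2)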
The two components jointly ensure that IEDR learns intrinsic and extrinsic factors. In this paper, we make the following contributions:

• To better analyze the factors influencing users' decisions, we formalize the context-agnostic intrinsic and extrinsic factors for recommender systems. Following these definitions, we propose IEDR to learn disentangled intrinsic and extrinsic factors from various contexts for recommendation.
• IEDR comprises a context-invariant contrastive learning component and a mutual information minimization-based disentangling component to effectively disentangle the learned factors.
• Extensive experiments on real-world datasets show that (1) IEDR significantly outperforms state-of-the-art baselines when various contexts are available; (2) our proposed CIED module can successfully learn intrinsic and extrinsic factors.

2. RELATED WORK

This section summarizes current research progress on recommender systems and contrastive learning related to our work.

Feature interaction modeling. Many recommender systems leverage feature interactions to improve prediction accuracy. One of the most common techniques is the factorization machine (FM) (Rendle, 2010), which models feature interactions through dot products and has achieved great success. Recent studies extend FM with deep neural networks for more powerful feature interaction modeling (Xiao et al., 2017; He & Chua, 2017; Yu et al., 2019a). The Wide & Deep model (WDL) (Cheng et al., 2016) proposes a framework that combines shallow and deep modeling of features for recommendation. Guo et al. (2017) combine FM and WDL by replacing the shallow part of WDL with an FM model. Su et al. (2021) leverage the relational reasoning power of graph neural networks for feature interaction modeling. However, these models do not incorporate context information for factor analysis, and we overcome this issue by leveraging such information to disentangle and learn intrinsic and extrinsic factors for recommendation.

Factor disentanglement. Intrinsic and extrinsic factors are considered two basic factors in individual decision making in psychological research (Ryan & Deci, 2000; Bénabou & Tirole, 2003; Vallerand, 1997). Recent recommender systems have borrowed the idea of capturing these two factors to achieve more accurate recommendation. For example, in sequential recommendation, Hidasi et al. (2016) are the first to leverage recurrent neural networks to capture users' long- and short-term (LS-term) interests from their interacted item sequences. Yu et al. (2019b) propose a time-aware controller to capture the differences between LS-term interests for more accurate interest learning. Zheng et al. (2022) further emphasize the disentanglement between LS-term interests at different time scales. In point-of-interest recommendation, studies leverage the spatial context to capture the intrinsic and extrinsic factors (Li et al., 2017; Wu et al., 2020). However, all of the above studies focus on specific contexts. As a result, their factor learning approaches are hard to apply to other recommendation domains, which may result in suboptimal solutions when other contexts jointly influence these factors. Some studies learn users' multiple factors without knowing the meaning of each factor (i.e., implicit factors). They first define the number of factors (e.g., 4) to be learned, and then disentangle the representations of each pair of factors (Ma et al., 2019; Wang et al., 2020). These models only ensure that the learned factor
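As background for the feature-interaction models discussed above, the second-order FM of Rendle (2010) scores an input as a bias, a linear term, and a sum of pairwise interactions weighted by dot products of latent factors. A minimal NumPy sketch (variable names are ours; the cited works learn these parameters and add deeper architectures on top):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine score.

    x  : (n_features,) input feature vector
    w0 : global bias
    w  : (n_features,) linear weights
    V  : (n_features, k) latent factors; the interaction weight for
         features i and j is the dot product <V[i], V[j]>.
    """
    linear = w0 + w @ x
    # O(n*k) reformulation of the pairwise term:
    #   sum_{i<j} <v_i, v_j> x_i x_j
    # = 0.5 * sum_f [(sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2]
    s = V.T @ x                      # (k,) per-factor weighted sums
    s_sq = (V ** 2).T @ (x ** 2)     # (k,) per-factor sums of squares
    pairwise = 0.5 * np.sum(s ** 2 - s_sq)
    return linear + pairwise
```

The reformulation in the pairwise term is why FM scales linearly in the number of features despite modeling all feature pairs.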

