JDRec: Practical Actor-Critic Framework for Online Combinatorial Recommender System

Paper Title

JDRec: Practical Actor-Critic Framework for Online Combinatorial Recommender System

Authors

Xin Zhao, Zhiwei Fang, Yuchen Guo, Jie He, Wenlong Chen, Changping Peng

Abstract

A combinatorial recommender (CR) system presents a list of items to a user at once on a result page, where user behavior is affected by both contextual information and the items themselves. CR is formulated as a combinatorial optimization problem whose objective is to maximize the recommendation reward of the whole list. Despite its importance, building a practical CR system remains a challenge because of the efficiency, dynamics, and personalization requirements of the online environment. In particular, we decompose the problem into two sub-problems: list generation and list evaluation. Novel and practical model architectures are designed for these sub-problems, aiming to jointly optimize effectiveness and efficiency. To adapt to the online case, a bootstrap algorithm forming an actor-critic reinforcement framework is given to explore better recommendation modes over long-term user interaction. Offline and online experimental results demonstrate the efficacy of the proposed JDRec framework. JDRec has been applied in online JD recommendation, improving click-through rate by 2.6% and synthetical value for the platform by 5.03%. We will publish the large-scale dataset used in this study to contribute to the research community.
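
The split into list generation and list evaluation can be illustrated with a small sketch. This is not the JDRec architecture, which the abstract does not specify; it is a toy stand-in under stated assumptions. The generator here samples candidate lists in proportion to per-item scores, and the evaluator is a hand-written list-level reward that discounts by position and penalizes adjacent items of the same category. All names (`generate_candidates`, `evaluate_slate`, `recommend`), the scores, and the 0.5 penalty are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def generate_candidates(
    items: Sequence[str],
    scores: Dict[str, float],
    slate_size: int,
    num_candidates: int,
    rng: random.Random,
) -> List[Tuple[str, ...]]:
    """Hypothetical generator: sample lists without replacement,
    picking each next item with probability proportional to its score."""
    candidates = []
    for _ in range(num_candidates):
        pool = list(items)
        slate = []
        for _ in range(slate_size):
            weights = [scores[item] for item in pool]
            pick = rng.choices(pool, weights=weights, k=1)[0]
            slate.append(pick)
            pool.remove(pick)
        candidates.append(tuple(slate))
    return candidates


def evaluate_slate(
    slate: Sequence[str],
    scores: Dict[str, float],
    category: Dict[str, str],
) -> float:
    """Hypothetical evaluator: reward for the whole list, not per item.
    Item scores are discounted by position, and each pair of adjacent
    items sharing a category costs an assumed penalty of 0.5."""
    reward = sum(scores[item] / (pos + 1) for pos, item in enumerate(slate))
    repeats = sum(
        1 for a, b in zip(slate, slate[1:]) if category[a] == category[b]
    )
    return reward - 0.5 * repeats


def recommend(
    items: Sequence[str],
    scores: Dict[str, float],
    category: Dict[str, str],
    slate_size: int,
    num_candidates: int,
    rng: random.Random,
) -> Tuple[str, ...]:
    """Generate several candidate lists, then return the one the
    evaluator rates highest."""
    candidates = generate_candidates(
        items, scores, slate_size, num_candidates, rng
    )
    return max(candidates, key=lambda s: evaluate_slate(s, scores, category))
```

Because the evaluator scores the list as a whole, the ordering `("a", "c", "b")` can outrank `("a", "b", "c")` even though `b` has a higher individual score than `c`, which is the kind of context effect the abstract refers to. In an actor-critic setup the generator and evaluator would be learned models updated from user feedback; that training loop is omitted here.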
