Paper Title

Online Bayesian Recommendation with No Regret

Authors

Yiding Feng, Wei Tang, Haifeng Xu

Abstract

We introduce and study the online Bayesian recommendation problem for a platform that observes a utility-relevant state of a product and repeatedly interacts with a population of myopic users through an online recommendation mechanism. This paradigm is common in a wide range of scenarios in the current Internet economy. For each user, who has her own private preference and belief, the platform commits to a recommendation strategy that exploits its information advantage on the product state to persuade the self-interested user to follow the recommendation. The platform does not know users' preferences and beliefs, and has to use an adaptive recommendation strategy that persuades while gradually learning those preferences and beliefs in the process. We aim to design online learning policies with no Stackelberg regret for the platform, i.e., against the optimal policy in hindsight under the assumption that users correspondingly adapt their behaviors to the benchmark policy. Our first result is an online policy that achieves a doubly logarithmic regret dependence on the number of rounds. We then present a hardness result showing that no adaptive online policy can achieve regret with a better dependence on the number of rounds. Finally, by formulating the platform's problem as optimizing a linear program with membership oracle access, we present a second online policy whose regret depends polynomially on the number of states but only logarithmically on the number of rounds.
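
To make the benchmark in the abstract concrete, the following is a minimal sketch of how Stackelberg regret is typically formalized in this setting; the symbols $\Pi$, $\pi_t$, $u$, and $\mathrm{BR}_t$ are illustrative notation introduced here, not taken from the paper itself:

\[
\mathrm{StackReg}(T) \;=\; \max_{\pi^\star \in \Pi} \sum_{t=1}^{T} u\bigl(\pi^\star,\, \mathrm{BR}_t(\pi^\star)\bigr) \;-\; \sum_{t=1}^{T} u\bigl(\pi_t,\, \mathrm{BR}_t(\pi_t)\bigr),
\]

where $\Pi$ is the space of recommendation policies the platform can commit to, $\pi_t$ is the policy deployed in round $t$, $\mathrm{BR}_t(\pi)$ is the round-$t$ user's best response to policy $\pi$, and $u(\pi, a)$ is the platform's expected utility when it plays $\pi$ and the user takes action $a$. Note that the benchmark $\pi^\star$ is evaluated against users who best-respond to $\pi^\star$ itself, matching the abstract's assumption that users adapt their behaviors to the benchmark policy. Under this reading, the paper's first result bounds this quantity as doubly logarithmic in $T$, and the hardness result shows that no adaptive policy can improve the dependence on $T$.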
