Paper title

Monolith: Real Time Recommendation System With Collisionless Embedding Table

Authors

Zhuoran Liu, Leqi Zou, Xuan Zou, Caihua Wang, Biao Zhang, Da Tang, Bolin Zhu, Yijie Zhu, Peng Wu, Ke Wang, Youlong Cheng

Abstract

Building a scalable, real-time recommendation system is vital for many businesses driven by time-sensitive customer feedback, such as short-video ranking or online ads. Despite the ubiquitous adoption of production-scale deep learning frameworks like TensorFlow or PyTorch, these general-purpose frameworks fall short of business demands in recommendation scenarios for two main reasons. On one hand, tweaking systems built around static parameters and dense computations to handle recommendation with dynamic and sparse features is detrimental to model quality. On the other hand, such frameworks are designed with the batch-training stage and the serving stage completely separated, which prevents the model from interacting with customer feedback in real time. These issues led us to reexamine traditional approaches and explore radically different design choices. In this paper, we present Monolith, a system tailored for online training. Our design has been driven by observations of our application workloads and production environment, which reflect a marked departure from those of other recommendation systems. Our contributions are manifold: first, we crafted a collisionless embedding table with optimizations such as expirable embeddings and frequency filtering to reduce its memory footprint; second, we provide a production-ready online training architecture with high fault tolerance; finally, we proved that system reliability could be traded off for real-time learning. Monolith has successfully landed in the BytePlus Recommend product.
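
The abstract names three ideas for the embedding table: no collisions between feature IDs, frequency filtering, and expirable embeddings. The sketch below is only an illustration of how those ideas can fit together, not Monolith's implementation. The class name `CollisionlessEmbeddingTable`, the parameters `min_count` and `ttl`, and the use of a plain Python `dict` keyed by the raw feature ID are all assumptions made for this example; the abstract does not specify any of them.

```python
import random
from typing import Dict, List, Optional


class CollisionlessEmbeddingTable:
    """Toy embedding table: one row per admitted feature ID, no shared slots.

    Hypothetical sketch. A dict keyed by the raw ID stands in for whatever
    collision-free structure a production system would use.
    """

    def __init__(self, dim: int, min_count: int = 2, ttl: int = 100, seed: int = 0):
        self.dim = dim
        self.min_count = min_count  # assumed admission threshold (frequency filter)
        self.ttl = ttl              # assumed idle lifetime in steps (expiry)
        self._rng = random.Random(seed)
        self._rows: Dict[int, List[float]] = {}
        self._last_seen: Dict[int, int] = {}
        # Occurrence counts for IDs that are not admitted yet. A real system
        # would need to bound this structure as well; this sketch does not.
        self._counts: Dict[int, int] = {}

    def lookup(self, feature_id: int, step: int) -> Optional[List[float]]:
        """Return the row for feature_id, or None while it is still filtered out."""
        if feature_id in self._rows:
            self._last_seen[feature_id] = step
            return self._rows[feature_id]
        count = self._counts.get(feature_id, 0) + 1
        if count < self.min_count:
            # Frequency filtering: rare IDs do not get a row.
            self._counts[feature_id] = count
            return None
        # The ID has been seen often enough: allocate its own row.
        self._counts.pop(feature_id, None)
        row = [self._rng.uniform(-0.01, 0.01) for _ in range(self.dim)]
        self._rows[feature_id] = row
        self._last_seen[feature_id] = step
        return row

    def expire(self, step: int) -> int:
        """Drop rows idle for more than ttl steps; return how many were dropped."""
        stale = [fid for fid, seen in self._last_seen.items() if step - seen > self.ttl]
        for fid in stale:
            del self._rows[fid]
            del self._last_seen[fid]
        return len(stale)

    def __len__(self) -> int:
        return len(self._rows)
```

In this sketch, a caller would invoke `lookup` for each feature ID in a training example and call `expire` periodically. Both filtering and expiry keep the number of stored rows, and therefore memory use, below the number of distinct IDs ever seen.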
