Paper Title
RecoBERT: A Catalog Language Model for Text-Based Recommendations
Paper Authors
Paper Abstract
Language models that utilize extensive self-supervised pre-training on unlabeled text have recently been shown to significantly advance state-of-the-art performance on a variety of language understanding tasks. However, it is not yet clear if and how these models can be harnessed for text-based recommendations. In this work, we introduce RecoBERT, a BERT-based approach for learning catalog-specialized language models for text-based item recommendations. We propose novel training and inference procedures for scoring similarities between pairs of items that do not require item similarity labels. Both the training and inference techniques were designed to exploit the unlabeled structure of textual catalogs and to minimize the discrepancy between them. By incorporating four scores during inference, RecoBERT infers text-based item-to-item similarities more accurately than other techniques. In addition, we introduce a new language understanding task for wine recommendations, using similarities based on professional wine reviews. As a further contribution, we publish an annotated recommendation dataset crafted by human wine experts. Finally, we evaluate RecoBERT and compare it to various state-of-the-art NLP models on wine and fashion recommendation tasks.
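The abstract states that RecoBERT combines four scores at inference to rank item-to-item similarity, but does not specify them. As a minimal sketch only, assume each catalog item has a title and a description, each mapped to an embedding, and assume the four scores are the cosine similarities of the four title/description pairings across the two items. The `toy_embed` function below is a hypothetical stand-in for a BERT encoder; everything here is illustrative, not the paper's actual method.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def toy_embed(text, dim=64, seed=0):
    # Hypothetical stand-in for a BERT text encoder: a deterministic
    # random projection of character counts (illustration only).
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((256, dim))
    counts = np.zeros(256)
    for ch in text.lower():
        counts[ord(ch) % 256] += 1.0
    return counts @ proj

def item_similarity(item_a, item_b):
    # Assumed scoring scheme: average four cosine scores between the
    # title and description embeddings of the two items. The abstract
    # only says four scores are combined; this pairing is a guess.
    ta, da = toy_embed(item_a["title"]), toy_embed(item_a["description"])
    tb, db = toy_embed(item_b["title"]), toy_embed(item_b["description"])
    scores = [cosine(ta, tb), cosine(da, db), cosine(ta, db), cosine(da, tb)]
    return sum(scores) / len(scores)

wine_a = {"title": "Pinot Noir 2016", "description": "Bright cherry and earthy spice."}
wine_b = {"title": "Cabernet Sauvignon 2015", "description": "Dark fruit, firm tannins, oak."}
print(item_similarity(wine_a, wine_b))
```

Because the cross scores appear in both directions, this combined score is symmetric in its two arguments, which is a desirable property for item-to-item recommendation lists.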