Paper title
DeText: A Deep Text Ranking Framework with BERT
Paper authors
Paper abstract
Ranking is the most important component in a search system. Most search systems deal with large amounts of natural language data, hence an effective ranking system requires a deep understanding of text semantics. Recently, deep learning based natural language processing (deep NLP) models have generated promising results on ranking systems. BERT is one of the most successful models that learn contextual embedding, which has been applied to capture complex query-document relations for search ranking. However, this is generally done by exhaustively interacting each query word with each document word, which is inefficient for online serving in search product systems. In this paper, we investigate how to build an efficient BERT-based ranking model for industry use cases. The solution is further extended to a general ranking framework, DeText, that is open sourced and can be applied to various ranking productions. Offline and online experiments of DeText on three real-world search systems present significant improvement over state-of-the-art approaches.