Paper title
Brown University at TREC Deep Learning 2019
Paper authors
Paper abstract
This paper describes Brown University's submission to the TREC 2019 Deep Learning track. We followed a two-phase method for producing a ranking of passages for a given input query. In the first phase, the user's query is expanded by appending three queries generated by a transformer model that was trained to rephrase an input query into semantically similar queries. The expanded query can exhibit greater similarity in surface form and vocabulary overlap with the passages of interest and can therefore serve as enriched input to any downstream information retrieval method. In the second phase, we use a BERT-based model, pre-trained for language modeling and fine-tuned for query-document relevance prediction, to compute relevance scores for a set of 1000 candidate passages per query; we then obtain a ranking by sorting the passages on their predicted relevance scores. According to the results published in the official Overview of the TREC Deep Learning Track 2019, our team ranked 3rd in the passage retrieval task (including full ranking and re-ranking), and 2nd when considering only re-ranking submissions.
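The two-phase flow described in the abstract can be illustrated with a minimal sketch. This is not the submission's code: `toy_paraphrase` and `toy_score` are hypothetical stand-ins for the trained transformer rephraser and the fine-tuned BERT relevance model, and all function names are assumptions made for illustration. Only the control flow (append three rephrasings, score each candidate, sort by score) follows the abstract.

```python
from typing import Callable, List, Tuple


def expand_query(query: str,
                 paraphrase: Callable[[str], List[str]],
                 n: int = 3) -> str:
    """Phase 1: append up to n generated rephrasings to the original query."""
    return " ".join([query] + paraphrase(query)[:n])


def rerank(query: str,
           passages: List[str],
           score: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Phase 2: score every candidate passage, then sort by descending score."""
    scored = [(passage, score(query, passage)) for passage in passages]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-in for the transformer rephrasing model (assumption).
def toy_paraphrase(query: str) -> List[str]:
    table = {
        "what causes rain": [
            "why does it rain",
            "how is rain formed",
            "reason for rainfall",
        ]
    }
    return table.get(query, [])


# Hypothetical stand-in for the BERT relevance model (assumption):
# the fraction of distinct passage tokens that also occur in the query.
def toy_score(query: str, passage: str) -> float:
    query_tokens = set(query.lower().split())
    passage_tokens = set(passage.lower().split())
    if not passage_tokens:
        return 0.0
    return len(query_tokens & passage_tokens) / len(passage_tokens)


candidates = [
    "the stock market closed higher today",
    "rain is formed when water vapor condenses",
]
expanded = expand_query("what causes rain", toy_paraphrase)
ranking = rerank(expanded, candidates, toy_score)
```

In this toy run the original query shares only the token "rain" with the relevant passage, while the expanded query also matches "is" and "formed". That is the kind of increased vocabulary overlap the abstract attributes to query expansion.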