Paper Title
Cross-Lingual Document Retrieval with Smooth Learning
Paper Authors
Paper Abstract
Cross-lingual document search is an information retrieval task in which the query language differs from the document language. In this paper, we study the instability of neural document search models and propose a novel, robust end-to-end framework that improves performance in cross-lingual search across different document languages. The framework comprises a novel measure of relevance between queries and documents, smooth cosine similarity, and a novel loss function, Smooth Ordinal Search Loss, as the objective. We further provide a theoretical guarantee on the generalization error bound of the proposed framework. We conduct experiments comparing our approach with other document search models and observe significant gains under commonly used ranking metrics on the cross-lingual document retrieval task in a variety of languages.