Paper Title

Socialformer: Social Network Inspired Long Document Modeling for Document Ranking

Paper Authors

Yujia Zhou, Zhicheng Dou, Huaying Yuan, Zhengyi Ma

Paper Abstract

Utilizing pre-trained language models has achieved great success in neural document ranking. Limited by computational and memory requirements, long document modeling becomes a critical issue. Recent works propose to modify the full attention matrix in Transformer by designing sparse attention patterns. However, most of them only focus on local connections between terms within a fixed-size window. How to build suitable remote connections between terms to better model document representation remains underexplored. In this paper, we propose the model Socialformer, which introduces the characteristics of social networks into the design of sparse attention patterns for long document modeling in document ranking. Specifically, we consider several attention patterns to construct a graph resembling a social network. Endowed with the characteristics of social networks, most pairs of nodes in such a graph can reach each other via a short path while ensuring sparsity. To facilitate efficient calculation, we segment the graph into multiple subgraphs to simulate friend circles in social scenarios. Experimental results confirm the effectiveness of our model on long document modeling.
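The abstract contrasts fixed-window local attention with small-world-style sparse patterns that keep path lengths short. As a minimal, hypothetical sketch (not Socialformer's actual attention patterns, which the paper derives from social-network characteristics), the idea of combining a local window with a few random remote links into one sparse attention mask can be illustrated as follows; the function name and parameters are illustrative assumptions:

```python
import numpy as np

def sparse_attention_mask(seq_len, window=2, n_remote=2, seed=0):
    """Boolean attention mask mixing local windows with random remote links.

    Illustrative sketch only: local edges mimic fixed-size-window attention;
    the extra random edges give the small-world property (most node pairs
    reachable via short paths) while the mask stays sparse.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        # Local connections: each token attends to a fixed-size window.
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True
        # Remote connections: a few random long-range links per token,
        # added symmetrically so the underlying graph is undirected.
        for j in rng.choice(seq_len, size=n_remote, replace=False):
            mask[i, j] = mask[j, i] = True
    return mask

mask = sparse_attention_mask(16)
density = mask.mean()  # well below 1.0, i.e. sparse vs. full attention
```

In a Transformer, such a mask would be applied to the attention-score matrix before the softmax, so each token only attends along the graph's edges.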
