Biased TextRank: Unsupervised Graph-Based Content Extraction

Paper Title

Biased TextRank: Unsupervised Graph-Based Content Extraction

Authors

Ashkan Kazemi, Verónica Pérez-Rosas, Rada Mihalcea

Abstract

We introduce Biased TextRank, a graph-based content extraction method inspired by the popular TextRank algorithm that ranks text spans according to their importance for language processing tasks and according to their relevance to an input "focus." Biased TextRank enables focused content extraction for text by modifying the random restarts in the execution of TextRank. The random restart probabilities are assigned based on the relevance of the graph nodes to the focus of the task. We present two applications of Biased TextRank: focused summarization and explanation extraction, and show that our algorithm leads to improved performance on two different datasets by significant ROUGE-N score margins. Much like its predecessor, Biased TextRank is unsupervised, easy to implement and orders of magnitude faster and lighter than current state-of-the-art Natural Language Processing methods for similar tasks.
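
The core idea in the abstract, random restarts weighted by each node's relevance to the focus, can be illustrated with a small personalized-PageRank sketch. This is not the authors' implementation: the bag-of-words cosine similarity, the function names (`bow`, `cosine`, `biased_textrank`), the damping factor of 0.85, and the sample sentences are all assumptions chosen to keep the example dependency-free. A real system would more likely use sentence embeddings for both the edge weights and the restart distribution.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (assumed stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def biased_textrank(sentences, focus, damping=0.85, iterations=100, tol=1e-8):
    """Score sentences with a random walk whose restarts are biased toward `focus`."""
    n = len(sentences)
    if n == 0:
        return []
    vecs = [bow(s) for s in sentences]
    focus_vec = bow(focus)

    # Undirected similarity graph: edge weight = similarity between two sentences.
    w = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]

    # Restart distribution: proportional to each sentence's relevance to the focus.
    # If nothing matches the focus, fall back to uniform restarts (plain TextRank).
    bias = [cosine(v, focus_vec) for v in vecs]
    total = sum(bias)
    restart = [b / total for b in bias] if total else [1.0 / n] * n

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Score flowing in from neighbours, split by their normalized edge weights.
            # Nodes with no edges pass nothing on (a simplification of this sketch).
            flow = sum(w[j][i] / out[j] * scores[j] for j in range(n) if out[j])
            new_scores.append((1 - damping) * restart[i] + damping * flow)
        delta = sum(abs(x - y) for x, y in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


if __name__ == "__main__":
    docs = [
        "The city council approved a new budget for public schools.",
        "School funding will increase by ten percent next year.",
        "A local bakery won an award for its sourdough bread.",
        "The bakery plans to open a second shop downtown.",
    ]
    ranked = sorted(zip(biased_textrank(docs, "bakery bread"), docs), reverse=True)
    for score, sentence in ranked:
        print(f"{score:.3f}  {sentence}")
```

Taking the top-scoring sentences gives a focused extract. Changing only the `focus` string shifts the restart distribution, and with it the ranking, without rebuilding the graph.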
