Paper title
Summarizing Community-based Question-Answer Pairs
Authors
Abstract
Community-based Question Answering (CQA), which allows users to acquire the information they need, has become an essential component of online services in domains such as e-commerce, travel, and dining. However, the overwhelming number of CQA pairs makes it difficult for users without a particular intent to find useful information spread across those pairs. To help users quickly digest the key information, we propose the novel CQA summarization task, which aims to create a concise summary from CQA pairs. To this end, we first design a multi-stage data annotation process and create a benchmark dataset, CoQASUM, based on the Amazon QA corpus. We then compare a collection of extractive and abstractive summarization methods and establish a strong baseline approach, DedupLED, for the CQA summarization task. Our experiments further confirm two key challenges for the CQA summarization task: sentence-type transfer and duplicate removal. Our data and code are publicly available.