Paper Title

Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning

Paper Authors

Rachel Gardner, Maya Varma, Clare Zhu, Ranjay Krishna

Abstract

Datasets extracted from social networks and online forums are often prone to the pitfalls of natural language, namely the presence of unstructured and noisy data. In this work, we seek to enable the collection of high-quality question-answer datasets from social media by proposing a novel task for automated quality analysis and data cleaning: question-answer (QA) plausibility. Given a machine or user-generated question and a crowd-sourced response from a social media user, we determine if the question and response are valid; if so, we identify the answer within the free-form response. We design BERT-based models to perform the QA plausibility task, and we evaluate the ability of our models to generate a clean, usable question-answer dataset. Our highest-performing approach consists of a single-task model which determines the plausibility of the question, followed by a multi-task model which evaluates the plausibility of the response as well as extracts answers (Question Plausibility AUROC=0.75, Response Plausibility AUROC=0.78, Answer Extraction F1=0.665).
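
As a concrete, unofficial illustration of the second stage described in the abstract, the sketch below pairs a shared BERT encoder with two heads: a binary classifier for response plausibility and a start/end span predictor for answer extraction. It assumes PyTorch and Hugging Face transformers; the class name MultiTaskPlausibilityModel and the example question/response pair are illustrative assumptions, not taken from the authors' code.

```python
# Minimal sketch of a multi-task BERT model (assumed architecture, not the
# paper's released code): one shared encoder, two task-specific heads.
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class MultiTaskPlausibilityModel(nn.Module):
    def __init__(self, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Binary head: is the crowd-sourced response plausible?
        self.plausibility_head = nn.Linear(hidden, 2)
        # Span head: start/end logits over tokens for answer extraction.
        self.span_head = nn.Linear(hidden, 2)

    def forward(self, input_ids, attention_mask, token_type_ids):
        out = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask,
                           token_type_ids=token_type_ids)
        # Pooled [CLS] representation drives the classification head.
        plausibility_logits = self.plausibility_head(out.pooler_output)
        # Per-token logits, split into start and end scores.
        start_logits, end_logits = self.span_head(out.last_hidden_state).split(1, dim=-1)
        return plausibility_logits, start_logits.squeeze(-1), end_logits.squeeze(-1)

# Illustrative question/response pair (not from the paper's dataset).
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
enc = tokenizer("What breed is this dog?",
                "looks like a golden retriever",
                return_tensors="pt")
model = MultiTaskPlausibilityModel()
plaus, start, end = model(enc["input_ids"], enc["attention_mask"], enc["token_type_ids"])
```

During training, one would presumably sum a cross-entropy loss on the plausibility logits with start/end cross-entropy losses on the span logits, which is the standard way to realize a multi-task objective over a shared encoder.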
