Paper title
TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions
Paper authors
Abstract
A critical part of reading is being able to understand the temporal relationships between events described in a passage of text, even when those relationships are not explicitly stated. However, current machine reading comprehension benchmarks have practically no questions that test temporal phenomena, so systems trained on these benchmarks have no capacity to answer questions such as "what happened before/after [some event]?" We introduce TORQUE, a new English reading comprehension benchmark built on 3.2k news snippets with 21k human-generated questions querying temporal relationships. Results show that RoBERTa-large achieves an exact-match score of 51% on the test set of TORQUE, about 30% behind human performance.