Paper title
RuBQ: A Russian Dataset for Question Answering over Wikidata
Paper authors
Paper abstract
The paper presents RuBQ, the first Russian knowledge base question answering (KBQA) dataset. The high-quality dataset consists of 1,500 Russian questions of varying complexity, their English machine translations, SPARQL queries to Wikidata, reference answers, as well as a Wikidata sample of triples containing entities with Russian labels. The dataset creation started with a large collection of question-answer pairs from online quizzes. The data underwent automatic filtering, crowd-assisted entity linking, automatic generation of SPARQL queries, and their subsequent in-house verification.