Paper Title
Mining Mathematical Documents for Question Answering via Unsupervised Formula Labeling
Authors
Abstract
The increasing number of questions on Question Answering (QA) platforms like Math Stack Exchange (MSE) signifies a growing information need to answer math-related questions. However, there is currently very little research on approaches for an open data QA system that retrieves mathematical formulae using their concept names or by querying formula-identifier relationships from knowledge graphs. In this paper, we aim to bridge the gap by presenting data mining methods and benchmark results to employ Mathematical Entity Linking (MathEL) and Unsupervised Formula Labeling (UFL) for semantic formula search and mathematical question answering (MathQA) on the arXiv preprint repository, Wikipedia, and Wikidata, which is part of the Wikimedia ecosystem of free knowledge. Based on different types of information needs, we evaluate our system in 15 information need modes, assessing over 7,000 query results. Furthermore, we compare its performance to a commercial knowledge base and computational engine (Wolfram Alpha) and a search engine (Google). The open-source system is hosted by Wikimedia at https://mathqa.wmflabs.org. A demo video is available at purl.org/mathqa.