Acronym Identification and Disambiguation Shared Tasks for Scientific Document Understanding

Paper Title

Acronym Identification and Disambiguation Shared Tasks for Scientific Document Understanding

Authors

Veyseh, Amir Pouran Ben; Dernoncourt, Franck; Nguyen, Thien Huu; Chang, Walter; Celi, Leo Anthony

Abstract

Acronyms are the short forms of longer phrases and they are frequently used in writing, especially scholarly writing, to save space and facilitate the communication of information. As such, every text understanding tool should be capable of recognizing acronyms in text (i.e., acronym identification) and also finding their correct meaning (i.e., acronym disambiguation). As most of the prior works on these tasks are restricted to the biomedical domain and use unsupervised methods or models trained on limited datasets, they fail to perform well for scientific document understanding. To push forward research in this direction, we have organized two shared tasks for acronym identification and acronym disambiguation in scientific documents, named AI@SDU and AD@SDU, respectively. The two shared tasks have attracted 52 and 43 participants, respectively. While the submitted systems make substantial improvements compared to the existing baselines, they are still far from human-level performance. This paper reviews the two shared tasks and the prominent participating systems for each of them.
