Paper Title
Causal Knowledge Extraction from Scholarly Papers in Social Sciences
Authors
Abstract
The scale and scope of today's scholarly literature overwhelm human researchers who seek to digest and synthesize knowledge in a timely manner. In this paper, we develop natural language processing (NLP) models to accelerate the extraction of relationships from scholarly papers in the social sciences: the models identify hypotheses in these papers and extract their cause-and-effect entities. Specifically, we develop models to 1) classify sentences in scholarly documents in business and management as hypotheses or not (hypothesis classification), 2) classify these hypotheses as causal relationships or not (causality classification), and, if they are causal, 3) extract the cause and effect entities from these hypotheses (entity extraction). We achieve high performance on all three tasks using different modeling techniques. Our approach may generalize to scholarly documents in a wide range of social sciences, as well as to other types of textual material.
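The three-stage structure described in the abstract (hypothesis classification, then causality classification, then entity extraction) can be illustrated with a toy sketch. This is not the authors' method: the abstract says only that "different modeling techniques" were used, so the regular-expression rules below are hypothetical stand-ins for trained models, and all names (`HYPOTHESIS_CUE`, `CAUSAL_PATTERN`, `run_pipeline`) and the example sentences are assumptions made for illustration.

```python
import re

# Assumption: hypotheses are introduced by a label such as "Hypothesis 1:" or "H2:".
# A trained sentence classifier would replace this rule in a real system.
HYPOTHESIS_CUE = re.compile(
    r"^\s*(hypothesis\s+\d+[a-z]?|h\d+[a-z]?)\s*[:.]", re.IGNORECASE
)

# Assumption: a small, hand-picked list of causal verbs. The named groups
# capture the text before the verb (cause) and after it (effect).
CAUSAL_PATTERN = re.compile(
    r"^(?P<cause>.+?)\s+(?:positively |negatively )?"
    r"(?:increases|decreases|leads to|affects|influences|reduces|improves)"
    r"\s+(?P<effect>.+?)\.?$",
    re.IGNORECASE,
)


def is_hypothesis(sentence):
    """Stage 1 (hypothesis classification): is the sentence a hypothesis?"""
    return bool(HYPOTHESIS_CUE.match(sentence))


def extract_cause_effect(statement):
    """Stages 2 and 3: return (cause, effect) if the statement is causal, else None."""
    match = CAUSAL_PATTERN.match(statement)
    if match is None:
        return None
    return match.group("cause").strip(), match.group("effect").strip()


def run_pipeline(sentences):
    """Apply the three stages in order, skipping non-hypothesis sentences."""
    results = []
    for sentence in sentences:
        if not is_hypothesis(sentence):
            continue
        # Drop the "Hypothesis N:" label before looking for a causal pattern.
        statement = HYPOTHESIS_CUE.sub("", sentence, count=1).strip()
        pair = extract_cause_effect(statement)
        results.append({
            "hypothesis": statement,
            "causal": pair is not None,
            "cause": pair[0] if pair else None,
            "effect": pair[1] if pair else None,
        })
    return results


if __name__ == "__main__":
    examples = [
        "Hypothesis 1: Employee training increases firm productivity.",
        "We collected data from 200 firms.",
        "H2: Firm size is associated with board diversity.",
    ]
    for row in run_pipeline(examples):
        print(row)
```

The point of the sketch is the staging: entity extraction runs only on sentences that pass both classifiers, so an error in an early stage propagates to the later ones. Keyword rules like these would miss most real hypotheses, which is presumably why learned models are needed for each task.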