Paper Title

A Data-Driven Study of Commonsense Knowledge using the ConceptNet Knowledge Base

Authors

Ke Shen, Mayank Kejriwal

Abstract

Acquiring commonsense knowledge and reasoning is recognized as an important frontier in achieving general Artificial Intelligence (AI). Recent research in the Natural Language Processing (NLP) community has demonstrated significant progress in this problem setting. Despite this progress, which is mainly on multiple-choice question answering tasks in limited settings, there is still a lack of understanding (especially at scale) of the nature of commonsense knowledge itself. In this paper, we propose and conduct a systematic study to enable a deeper understanding of commonsense knowledge by doing an empirical and structural analysis of the ConceptNet knowledge base. ConceptNet is a freely available knowledge base containing millions of commonsense assertions presented in natural language. Detailed experimental results on three carefully designed research questions, using state-of-the-art unsupervised graph representation learning ('embedding') and clustering techniques, reveal deep substructures in ConceptNet relations, allowing us to make data-driven and computational claims about the meaning of phenomena such as 'context' that are traditionally discussed only in qualitative terms. Furthermore, our methodology provides a case study in how to use data-science and computational methodologies for understanding the nature of an everyday (yet complex) psychological phenomenon that is an essential feature of human intelligence.
