Paper Title

Towards Medical Knowmetrics: Representing and Computing Medical Knowledge using Semantic Predications as the Knowledge Unit and the Uncertainty as the Knowledge Context

Authors

Li, Xiaoying; Peng, Suyuan; Du, Jian

Abstract

In China, Profs. Hongzhou Zhao and Zeyuan Liu pioneered the concepts of "knowledge unit" and "knowmetrics" for measuring knowledge. However, the definition of a "computable knowledge object" remains contested across fields. For example, it is defined as 1) a quantitative scientific concept in natural science and engineering, 2) a knowledge point in education research, and 3) a semantic predication, i.e., a Subject-Predicate-Object (SPO) triple, in biomedicine. The Semantic MEDLINE Database (SemMedDB), a high-quality public repository of SPO triples extracted from medical literature, provides a basic data infrastructure for measuring medical knowledge. In general, research on extracting SPO triples as computable knowledge units from unstructured scientific text has focused overwhelmingly on scientific knowledge per se. Yet SPO triples may be extracted from hypothetical or speculative statements, or even from conflicting and contradictory assertions. The knowledge status (i.e., the uncertainty) is therefore an integral and critical part of scientific knowledge, and it has been largely overlooked. This article puts forward a framework for Medical Knowmetrics that uses SPO triples as the knowledge unit and uncertainty as the knowledge context. A dataset of lung cancer publications is used to validate the proposed framework. The uncertainty of medical knowledge, and how its status evolves over time, indirectly reflects the strength of competing knowledge claims and the probability of certainty for a given SPO triple. We discuss new insights from uncertainty-centric approaches for detecting research fronts and for identifying knowledge claims with a high certainty level, in order to improve the efficacy of knowledge-driven decision support.
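To make the idea of "SPO triple as knowledge unit, uncertainty as knowledge context" concrete, the following is a minimal sketch and not the authors' implementation. It assumes each extracted predication carries a publication year and a binary certainty label; the `Predication` class, the `certain` field, the `certainty_by_year` helper, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Predication:
    """One SPO triple extracted from one sentence (hypothetical schema)."""
    subject: str
    predicate: str
    obj: str
    year: int
    # Assumed binary label: True = stated as fact, False = hedged or speculative.
    certain: bool


def certainty_by_year(predications):
    """Return {(subject, predicate, obj): {year: share of certain mentions}}."""
    # For every triple and year, keep a [certain_count, total_count] pair.
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for p in predications:
        cell = counts[(p.subject, p.predicate, p.obj)][p.year]
        cell[0] += int(p.certain)
        cell[1] += 1
    return {
        triple: {year: c / n for year, (c, n) in sorted(years.items())}
        for triple, years in counts.items()
    }


# Toy records, invented purely for illustration (not real data).
sample = [
    Predication("DrugA", "TREATS", "Lung cancer", 2004, False),
    Predication("DrugA", "TREATS", "Lung cancer", 2004, False),
    Predication("DrugA", "TREATS", "Lung cancer", 2010, True),
    Predication("DrugA", "TREATS", "Lung cancer", 2010, False),
]
trend = certainty_by_year(sample)
print(trend[("DrugA", "TREATS", "Lung cancer")])
```

Under these assumptions, a rising share of certain mentions for a triple would be one simple proxy for a knowledge claim becoming more established over time; the paper itself may define and estimate certainty differently.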
